Introduction

The primary goal of this book is to introduce you, the thinBasic developer, to Git and to its use together with GitHub.

The secondary goal is to set some boundaries and common practices for Git interaction across our growing community.

No previous knowledge of Git is expected or required in order to read this book.

All opinions expressed in this book are those of the author, who has five years of experience with Git. This is not an official Git guide.

This book focuses on common day-to-day tasks. Once you are confident with them and understand how Git helps with them, further study of Git is highly encouraged.

This book is currently a work in progress.

What do I gain by learning Git?

Using Git throughout your development allows you to:

  • track changes from the start of the project till its current state
  • compare changes, line-by-line, between two versions of the code
  • work on multiple variants without the need for crazy copy-paste
  • feel more confident during bigger changes, because you can always safely go back
  • collaborate on a single code base with others in a standardized, organized way
  • have code accessible anywhere, because it can be stored in the cloud

Git is a version control system that has become the de facto standard, used by individuals, developers of operating systems, and big companies alike.

By learning Git, you learn a tool which you can re-use with different languages and in different situations.

What is GitHub?

GitHub is one of the providers of hosting for Git repositories. There are many alternatives, such as GitLab, SourceForge or Stash (since renamed Bitbucket Server).

GitHub is used as the single example in this book for the sake of simplicity, and because it provides free accounts for your experiments.

How to read this book?

This book is written in a goal-oriented fashion. The best way to get the most out of this book is to start applying what you read immediately, even if on a dummy project.

Core concepts

This chapter explains the core concepts of the Git world.

Repository - what a repository is, what it should and shouldn't be used for, and how to configure it.

Fork - a clone of an already existing repository.

Remote - how you connect your local repository to the cloud.

Commit - a set of changes.

Branch - explains the relationship between a branch and a commit.

Push - explains how the remote repository is updated.

Fetch - explains how information about the remote repository is retrieved.

Merge - explains how changes are integrated.

Rebase - explains how to keep branches up to date with the base branch.

Pull - explains how the local repository is synced with the remote.

Pull request - explains the concept of a change proposal.

Repository

A Git repository is a directory containing your project or a part of it.

Scope

A repository should be used for a single project, or better, its smallest functional part.

To give a specific example: thinBasic consists of an interpreter, many modules, the thinAir integrated development environment, and a help file.

While one could be tempted to keep it all in a single repository, this would cause the following issues:

  • the repository would grow faster than separate repositories would
  • going back to an earlier state to undo a change in the interpreter would also take the other components back in time

In this particular case, the project should be split into multiple repositories this way:

  • interpreter
  • one repository for each module
  • thinAir
  • help file

System files

Compared to an ordinary directory, a repository contains:

  • a hidden .git directory, which is created by Git and contains the history of changes of your project
  • Git configuration files, if needed

A repository should also contain a readme and a license, and adhere to some normalized structure. This will be discussed in the practical part.
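As a small illustration of the hidden .git directory mentioned above, the following sketch creates an empty repository in a throwaway location. The directory name myproject and the use of a temporary directory are assumptions made for this example only.

```shell
set -e
# Work in a temporary scratch directory (hypothetical location)
work=$(mktemp -d)
cd "$work"

# Turn a new directory called myproject into a Git repository
git init -q myproject
cd myproject

# List everything, including hidden entries: the .git directory is there
ls -a
```

Deleting the .git directory would turn the repository back into an ordinary directory and discard its history, so it is best left alone.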

Technical restrictions

Git repositories are optimized to work with text data, which makes them ideal for storing source code. Git can display the changes between any two revisions of text data and store the history of changes efficiently.

While it is technically possible to add binary files to the repository, such as EXE/DLL files, PDF documents, and other common file formats, be aware that this might significantly increase the size of your repository. Successive versions of a text file usually differ only a little, so Git can store their history compactly; a changed binary file typically differs throughout, so each new version tends to cost almost a full copy.
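To make the line-by-line comparison of text revisions more tangible, here is a minimal sketch. The file name notes.txt, its contents, the commit messages and the demo identity are all invented for the illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
# Identity used only inside this throwaway repository
git config user.name "Demo User"
git config user.email "demo@example.com"

# First revision of a text file
printf 'first line\nsecond line\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Second revision: only the second line changes
printf 'first line\nsecond line, edited\n' > notes.txt
git commit -q -a -m "Edit second line"

# Show the difference between the previous and the current revision
git --no-pager diff HEAD~1 HEAD -- notes.txt
```

In the output, the removed line is prefixed with a minus sign and the added line with a plus sign.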

Fork

A fork is a clone of an already existing repository.

When is this needed?

Usually, when you join existing projects, you don't have the right to modify them directly.

By creating a fork, a clone of the original, you may start working on your own version or create changes in the fork and propose them to the original repository via so-called pull requests.

You will not need to create a fork when working on your own project alone.

Remote

Pointer to the remote repository on the server.

Why is it good?

Linking your local Git repository with a remote allows you to:

  • back up your source code on the server
  • continue work on the same project from any PC with access to the server
  • share your work with others

For the purpose of this book, the remote repository will be a repository on GitHub.

This will allow you to work on the same project from any PC with an Internet connection.

One or many?

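Remote repository and remote are two different terms: the remote repository is the repository stored on the server, while a remote is the name under which your local repository refers to it.

Typically, for your private projects, you will have one remote, by convention called origin, no matter what the remote repository itself is named.

Should you work on a fork of an existing project, you will have two remotes: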

  • origin, which is your remote forked version
  • upstream, which is the original repository from which you forked
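The two-remote setup can be sketched as follows. To keep the example runnable without a GitHub account, two local bare repositories (fork.git and original.git, both hypothetical names) stand in for the repositories on the server; with GitHub you would use the repository URLs instead of local paths.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-ins for the repositories that would normally live on the server
git init -q --bare original.git   # the project you forked from
git init -q --bare fork.git       # your own fork

# Your local repository
git init -q local
cd local

# origin points to your fork, upstream to the original project
git remote add origin "$work/fork.git"
git remote add upstream "$work/original.git"

# List the configured remotes with their addresses
git remote -v
```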

Commit

A single set of changes.

What can it contain?

A single commit may add, modify and/or remove files.

How does it work?

Imagine a repository with two commits:

  • the first commit adds files a.txt and b.txt.
  • the second commit removes file a.txt, modifies b.txt and adds c.txt.

The repository now contains the modified b.txt and the new c.txt. If you want, you can go back in time to the first commit at any time, without losing the ability to travel forward to the last commit again.

Time travel is made possible thanks to the .git directory, which contains the history of changes in an efficient format.
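The two-commit example above can be replayed on the command line roughly like this. It is only a sketch: the file contents, the commit messages and the demo identity are assumptions, and git checkout is just one of several ways to visit an older commit.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
# Name the first branch master regardless of the local default
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

# First commit: add a.txt and b.txt
echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "Add a.txt and b.txt"

# Second commit: remove a.txt, modify b.txt, add c.txt
git rm -q a.txt
echo "b, modified" > b.txt
echo "c" > c.txt
git add b.txt c.txt
git commit -q -m "Remove a.txt, modify b.txt, add c.txt"
ls    # shows b.txt and c.txt

# Travel back to the first commit: a.txt is back, c.txt is gone
git checkout -q HEAD~1
ls    # shows a.txt and b.txt

# Travel forward to the latest commit again
git checkout -q master
ls    # shows b.txt and c.txt
```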

How can I reference a specific commit?

Each commit has a so-called commit hash. It is unique across the repository.

Each commit also has a commit message, which is a brief summary of the changes, written by you.

Commit messages are referenced in this book like: "Commit message"
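A minimal sketch of how the hash and the message of a commit can be displayed on the command line. The file readme.txt and the demo identity are invented for the example; the hash you get will differ from anyone else's.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"

# Full commit hash followed by the commit message
git --no-pager log --format='%H %s'

# Abbreviated form of the same hash, handy for everyday use
git rev-parse --short HEAD
```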

Branch

A unit of organization in a repository; basically a pointer to a commit.

Branches are indicated in brackets in this book: [branch]

Master branch

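The branch present in any new repository is traditionally called master. Newer setups, including repositories created on GitHub, may call it main instead; this book uses master throughout.

[master] should not contain work-in-progress code. It should be used for storing the finished version of the code for a given application/solution version.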

Additional branches

Any new modifications - features, fixes, refactoring, documentation adjustments or tests - should be developed in separate branches, before promoting them to [master].

Unlike [master], such a branch may perfectly well contain work-in-progress code.
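The idea that a branch is basically a pointer to a commit can be checked with a short sketch. The branch name feature1, the file names and the demo identity are assumptions for the illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
# Name the first branch master regardless of the local default
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"

# A new branch is just a second pointer to the same commit
git branch feature1
git rev-parse master feature1    # prints the same hash twice

# Committing on feature1 moves only that pointer forward
git checkout -q feature1
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "New feature 1"
git rev-parse master feature1    # the two hashes now differ
```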

Push

The process of moving changes from your local repository to the remote repository.

Example

Imagine you have 1 local repository with 1 remote repository, referenced as origin.

Both the local and the remote repository contain the "Initial commit" in the [master] branch.


o "Initial commit" [master][origin/master]    o "Initial commit" [master]
(local)                                       (origin)

On the left, you can see the local repository with its local [master] branch. The [origin/master] label shows that the local repository is aware there is a master in origin as well.

Then, you create a new feature branch on the local repository, and commit some code into it:


o "New feature 1"  [feature1]
|
o "Initial commit" [master][origin/master]    o "Initial commit" [master]
(local)                                       (origin)

You can see the local repository changed, but the remote repository didn't.

How to sync them?

You need to push the feature1 branch to the remote.


o "New feature 1"  [feature1][origin/feature1] o "New feature 1"  [feature1]
|                                              |
o "Initial commit" [master][origin/master]     o "Initial commit" [master]
(local)                                        (origin)

After the push, both repositories are in sync for the given branch.
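The push example can be reproduced with the sketch below. A local bare repository named origin.git stands in for the GitHub repository; that name, the file names and the demo identity are assumptions made so the example runs without a network connection.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the remote repository on the server
git init -q --bare origin.git

# Local repository with the "Initial commit" on master
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/origin.git"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"
git push -q -u origin master

# New feature branch with one commit, so far only local
git checkout -q -b feature1
echo "feature" > feature1.txt
git add feature1.txt
git commit -q -m "New feature 1"

# Push the branch; -u also remembers origin/feature1 as its counterpart
git push -q -u origin feature1

# Local view of all branches, including origin/master and origin/feature1
git --no-pager log --oneline --decorate --all
```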

Fetch

The operation to make your Git aware of the changes in the remote repository.

Example

Imagine you have 1 local repository with 1 remote repository, referenced as origin.

Both the local and the remote repository contain the "Initial commit" in the [master] branch.


o "Initial commit" [master][origin/master]    o "Initial commit" [master]
(local)                                       (origin)

Then, some other collaborator updates the remote repository:


o "Initial commit" [master][origin/master]     o "Initial commit" [master]
(local)                                        (origin)

What? What change?

If you would like to see the change, you first need to make your Git aware of changes in the remote repository.

How to do it?

Fetch!


o "New feature 1"  [origin/feature1]           o "New feature 1"  [feature1]
|                                              |
o "Initial commit" [master][origin/master]     o "Initial commit" [master]
(local)                                        (origin)

Once you do this, your local repository knows what the state of the remote repository was at the time of the fetch.

Please note that fetching does not update your local branches with the remote changes. You have to pull to sync!
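Here is a sketch of the fetch scenario. A local bare repository origin.git stands in for the server, and a second clone named collaborator plays the other developer; all of these names, the files and the identities are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git

# Your local repository, with the "Initial commit" pushed to origin
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "You"
git config user.email "you@example.com"
git remote add origin "$work/origin.git"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"
git push -q -u origin master
cd ..

# A collaborator clones origin and pushes a feature1 branch
git clone -q -b master origin.git collaborator
cd collaborator
git config user.name "Collaborator"
git config user.email "collaborator@example.com"
git checkout -q -b feature1
echo "feature" > feature1.txt
git add feature1.txt
git commit -q -m "New feature 1"
git push -q origin feature1
cd ../local

git branch -r      # only origin/master: your Git has not heard of feature1 yet
git fetch -q origin
git branch -r      # now lists origin/feature1 as well
git branch         # still only master: no local branch was touched
```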

Merge

The act of promoting changes by integrating them into another branch.

Example

Imagine you have a master branch and two branches, feature1 and feature2.

Both branches were created from master at the same time and do not contain any changes yet.

Because a branch is basically a pointer to a commit, the initial state looks like this:


o "Initial commit" [master] [feature1] [feature2]

Then you commit some changes to feature1:


o "New feature 1"  [feature1]
|
o "Initial commit" [master] [feature2]

Then you merge the changes from feature1 into master, changing the repository to this:


o "New feature 1"  [master] [feature1]
|
o "Initial commit" [feature2]
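The merge example can be tried out with the following sketch. File names and the demo identity are invented; because master has no commits of its own in this scenario, Git can simply move the master pointer forward, which is known as a fast-forward merge.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
# Name the first branch master regardless of the local default
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"

# Both feature branches start at the same commit as master
git branch feature1
git branch feature2

# Commit some changes to feature1
git checkout -q feature1
echo "feature 1" > feature1.txt
git add feature1.txt
git commit -q -m "New feature 1"

# Merge feature1 into master
git checkout -q master
git merge -q feature1

# master and feature1 now point to "New feature 1", feature2 stays behind
git --no-pager log --oneline --decorate --graph --all
```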

Rebase

The act of changing the base commit.

Example

Let's return to our example from the Merge chapter.

Originally there were master, feature1 and feature2 branches; then feature1 got modified and merged into master, making the whole situation look like this:


o "New feature 1"  [master] [feature1]
|
o "Initial commit" [feature2]

Imagine we now add some new functionality to feature2:


o "New feature 1"  [master] [feature1]
|
| o "New feature 2" [feature2]
|/
o "Initial commit"

You can see that feature2 is no longer at the "Initial commit"; it now forms a truly separate branch in the commit tree.

So far so good, but you might notice that feature2 does not contain the changes from the updated master, which is now on the "New feature 1" commit.

How to make feature2 contain the current master code plus the changes in the "New feature 2" commit?

The answer is to rebase feature2 on the current master. That is the same as rebasing feature2 on the "New feature 1" commit, by the way.

After the rebase on master, the commit tree looks like this:


o "New feature 2"  [feature2]
|
o "New feature 1"  [master] [feature1]
|
o "Initial commit"

Looking at the commit tree, we can see it is a linear sequence again, with "Initial commit" being the first, "New feature 1" the second and "New feature 2" the last commit.

The code at "New feature 2" contains the cumulative contents of the two previous commits, and it is ready to be merged into master.

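If we merged feature2 into master without rebasing first, the changes in master would not be lost, but Git would have to create an extra merge commit joining the two lines of history, and the commit tree would no longer be a simple linear sequence.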
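The rebase scenario can be replayed with the sketch below, which first rebuilds the state from the Merge chapter and then rebases feature2. File names and the demo identity are assumptions; the two features touch different files, so no conflicts arise here.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
# Name the first branch master regardless of the local default
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"
git branch feature1
git branch feature2

# feature1 gets a commit and is merged into master
git checkout -q feature1
echo "feature 1" > feature1.txt
git add feature1.txt
git commit -q -m "New feature 1"
git checkout -q master
git merge -q feature1

# feature2 gets its own commit, still based on "Initial commit"
git checkout -q feature2
echo "feature 2" > feature2.txt
git add feature2.txt
git commit -q -m "New feature 2"

# Rebase feature2 on the current master
git rebase -q master

# The history is linear again, with "New feature 2" on top
git --no-pager log --oneline --decorate --graph --all
```

After the rebase, the working directory on feature2 contains both feature1.txt and feature2.txt.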

Pull

The process of getting changes from your remote repository to the local one.

Example

Imagine you have 1 local and 1 remote repository, both with [master] on initial commit.

Then, somebody updates the remote repository with a [feature1] branch.

In order to retrieve the branch, you first need to make your local repository aware of the branch via fetch:


o "New feature 1"  [origin/feature1]           o "New feature 1"  [feature1]
|                                              |
o "Initial commit" [master][origin/master]     o "Initial commit" [master]
(local)                                        (origin)

Then you pull the [feature1] branch to your local repository:


o "New feature 1"  [feature1][origin/feature1] o "New feature 1"  [feature1]
|                                              |
o "Initial commit" [master][origin/master]     o "Initial commit" [master]
(local)                                        (origin)

Always fetch before you pull! On the command line, git pull performs this fetch for you as its first step.
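One possible way to replay the pull example on the command line is sketched below. As before, the bare repository origin.git stands in for the server and the clone named collaborator plays the other developer; these names, the files and the identities are hypothetical. The --ff-only option is an assumption of this sketch: it tells Git to refuse anything but a simple forward move of the local branch.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git

# Local repository with [master] on the initial commit, pushed to origin
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "You"
git config user.email "you@example.com"
git remote add origin "$work/origin.git"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"
git push -q -u origin master
cd ..

# Somebody else updates the remote repository with a feature1 branch
git clone -q -b master origin.git collaborator
cd collaborator
git config user.name "Collaborator"
git config user.email "collaborator@example.com"
git checkout -q -b feature1
echo "feature" > feature1.txt
git add feature1.txt
git commit -q -m "New feature 1"
git push -q origin feature1
cd ../local

# Make the local repository aware of the new branch
git fetch -q origin

# Create a local feature1 branch and pull the remote commits into it
git checkout -q -b feature1
git pull -q --ff-only origin feature1

git --no-pager log --oneline --decorate --all
```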

Pull request

Request for a change, based on an existing branch.

Meaning

On your local repository, you can always merge one branch into another. When working on a project with somebody else, however, you should never do so directly, but via a pull request instead.

Creating a pull request lets you specify where you want to integrate your change, and it lets the reviewer check the change, possibly propose improvements, and then do the merge.

While this dance around the integration might seem redundant, it has many benefits:

  • the code will be seen, so your collaborators will be better aware of the change
  • the code can be improved before integration, giving you a valuable coding lesson