Hi all,

Several Apache projects have moved parts of their development to GitHub, some
almost entirely (no more Jira), others only partly (issues in Jira, code on GitHub).

I have used GitHub more and more lately and find its processes quite convenient.
They mostly revolve around the concept of "pull requests": people submit
pull requests instead of patches. Based on these, GitHub makes it easy to:

* review code conveniently
* leave comments (line-based as well as general)
* approve a contribution or request changes
* discuss the pull request as it evolves
* export a pull request as a patch

Pull requests are essentially branches and it is possible for
interested developers to:

* check out these branches
* collaborate on them
* finally merge them into the master (trunk) branch when ready.
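
To make the "check out these branches" step concrete, here is a minimal
Python sketch that only builds the git commands for fetching a pull request
into a local branch. It relies on GitHub exposing each pull request under
the ref `pull/<number>/head`; the function name, the `pr-<number>` branch
naming and the default remote name `origin` are my own assumptions for
illustration.

```python
def pr_fetch_commands(pr_number, remote="origin", local_branch=None):
    """Build git commands that fetch a GitHub pull request into a local branch.

    Hypothetical helper: it only returns the command lists, it does not run git.
    """
    if pr_number <= 0:
        raise ValueError("pull request numbers are positive integers")
    # Assumed naming convention for the local branch, e.g. "pr-42".
    branch = local_branch or f"pr-{pr_number}"
    return [
        # GitHub publishes the head of each pull request as pull/<n>/head.
        ["git", "fetch", remote, f"pull/{pr_number}/head:{branch}"],
        ["git", "checkout", branch],
    ]
```

Inside a clone of the repository, each returned list could be passed to
`subprocess.run(cmd, check=True)` to actually perform the fetch and checkout.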

I have also gained some experience setting up Jenkins in such a way that
it can automatically check pull requests:

* an admin developer leaves a comment on a pull request such as
  "Jenkins, test this please"
* Jenkins sees this comment and triggers a build of that particular
  contribution
* Jenkins updates a "build status" that is visible directly in the
  discussion thread of the pull request on GitHub. Depending on the
  result of the build, it will give a red or green light.
* That is very useful for integration-testing a contribution after
  an initial code screening - no need to locally integrate a patch,
  locally test it, etc. The initial screening can happen by looking at
  a side-by-side diff on the GitHub website - again no local effort.
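
The trigger flow described above can be sketched roughly as follows. This is
not how any particular Jenkins plugin is implemented; the function names, the
admin list and the phrase matching are hypothetical and only illustrate the
idea. The strings `success` and `failure` are among the states GitHub accepts
for a commit status, which is what shows up as the green or red light.

```python
# Assumed trigger phrase, taken from the example comment in this mail.
TRIGGER_PHRASE = "jenkins, test this please"


def should_trigger_build(comment_author, comment_body, admins):
    """Return True if this comment should start a build of the pull request.

    Only comments from admin developers count, so that arbitrary
    contributors cannot start builds of untrusted code.
    """
    return comment_author in admins and TRIGGER_PHRASE in comment_body.lower()


def commit_status(build_passed):
    """Map a build result to the status reported back on the pull request."""
    return "success" if build_passed else "failure"
```

For example, `should_trigger_build("alice", "Jenkins, test this please", {"alice"})`
is true, while the same comment from somebody outside the admin set is ignored.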

Finally, GitHub allows certain processes to be enforced if desired. E.g.
one can *optionally* enforce that:

* only specific people can push to specific branches (e.g. master/trunk or
  maintenance branches)
* no changes can go into master (trunk) without being reviewed first as
  pull requests (a very strong four-eyes principle, has ups and downs)
* pull requests must have passed a Jenkins check before they can be merged
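
Taken together, these optional rules amount to a simple gate in front of a
merge. The following Python sketch illustrates the idea only: in practice
GitHub evaluates such rules itself once they are configured for a branch, and
the `PullRequest` fields and the function name here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    approvals: int            # number of approving reviews
    build_passed: bool        # result of the Jenkins check
    merger_has_access: bool   # may this person push to the target branch?


def may_merge(pr: PullRequest, required_approvals: int = 1) -> bool:
    """Return True only if all three optional rules are satisfied."""
    return (
        pr.merger_has_access
        and pr.approvals >= required_approvals
        and pr.build_passed
    )
```

Each rule can be relaxed independently, e.g. `required_approvals=0` would
model a project that does not insist on the four-eyes principle.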

All in all, I believe our project would benefit from moving to GitHub.
After an initial transition and adaptation phase, our processes should become
smoother, easier and as a result more interactive. The main hurdle for
us developers would in my view be to learn how to do source code management
using a distributed approach. The way one has to think about code management
in git (or probably any other DSCM tool) is different from a centralized
approach like SVN. To take full advantage, one has to move branches into the
center of attention, and branching/merging becomes a much more common
operation than in SVN. Fortunately, git supports such operations very well.

The hurdle for contributions would also be lower: more and more people are
familiar with GitHub processes and already have GitHub accounts, and
juggling patch files is (at least for me) a great annoyance.

Of course, there is also the drawback that GitHub may shut down
at some point (like Google Code did), start charging, or introduce
terms-of-service that are incompatible with our goals. But due to the
strong interest of various Apache projects in using the GitHub infrastructure,
INFRA has been working on various integrations, e.g. mirroring GitHub
repositories (not sure how they handle it when Apache projects use GitHub
issues though). Also at present GitHub appears to be very community-oriented
when it comes to the terms-of-service, e.g. making text updates available to
the community early, soliciting and incorporating community feedback, etc.

Bottom line: my personal judgement at this point in time is that the
benefits of a smoother process and easier community involvement clearly
outweigh such risks, and that many of the risks have been mitigated by INFRA
anyway.

What do you think?

Best,

-- Richard
