Hi Jan,

Jan Holesovsky wrote:
Hi Heiner,

On Monday 15 January 2007 16:58, Jens-Heiner Rechtien wrote:

the time is ripe to finally retire the old workhorse CVS, no doubt
about it. Incidentally I've been evaluating the technical aspects of
potential candidates for our next SCM lately. I've specifically looked
at Mercurial (a distributed SCM) and Subversion (a centralized SCM). I'm
planning to have a look at git as well.

Sure - one thing that could speak against git is that it keeps all the history locally - that could be just too much for OOo. I'll see what I get after the conversion.

For Mercurial the repository size is about 3.4 GiB, trunk only (no branches) and only the code directories, which is probably the smallest history-preserving import that makes sense. Not exactly handy if something like this must be cloned over the network. Git is rumored to be more efficient, but we'll see. Of course a Subversion repository for this case is about 5.8 GiB, but one never copies that to the local hard disk. A full import (all branches, all projects, all historic modules) into Subversion results in a repository size of around 50 GiB. For a central repository this is still easily manageable. For Mercurial this kind of import was not possible, because branches are not well supported (a branch is a separate repository and we have ~5000 of them). The small import took several weeks anyway, so I probably wouldn't have the patience for a full import :-)


For sure I'll have to split out things like the project www pages, etc. Maybe I'll even have to do something similar to the recent source tarball split into -core, -binfilter, -l10n, etc.

All the current code modules should be in one repository. I certainly wouldn't like pulling changesets for one CWS from different repositories, and you would lose the benefit of being able to move files around etc. And yes, this includes binfilter :-). But the www directories and the localization projects can be in one or more different repositories.


For OpenOffice.org I think Subversion suits well, much better than
Mercurial. Some points in favor of Subversion will probably still hold
if Git is pitted against Subversion.

- Clytie Siddall argues that Subversion is very similar to CVS and quite
   commonly known to developers. I wouldn't dismiss this argument, it's
   important but, of course, not paramount.

Yes, but as I wrote, Cogito gives us the familiar (CVS-like) environment as well.

- the CWS system lives from published branches. Now we get these
   branches easily in Subversion whereas we would need to set up a whole

Don't forget that SVN has no branches! ;-)

   new publishing infrastructure with any DSCM system whether it's
Mercurial or Git. No, getting changesets via email is not an option :-)

:-) Well, this is a point where I was probably not clear enough in my previous mail.

From my point of view, we do not have to set up any new infrastructure. We can use EIS exactly for publishing the branches. Let me show the scenario 'a developer develops a feature':

- git clone git://the.master.server/ooo.git openoffice.org
  - will create openoffice.org with the repository from ooo.git
- cd openoffice.org && git branch myfeature01
- git checkout myfeature01
  - now I work on the myfeature01 branch
- develop the feature
  - do 'git commit -a' from time to time (will commit the changes locally!)
- git push git://any.server/kendy.git myfeature01
  - will publish the changes

And exactly here we would need to set up a full authentication scheme for the committers who are allowed to push things upstream. That's not impossible of course, but Subversion does that already for us. We would work in a centralized way with a distributed SCM, which somehow feels wrong - and requires us to re-engineer some of the features of a centralized SCM ourselves.

Alternatively we could set up a publishing scheme where release engineering pulls changesets for integration from trusted committer URLs. Possible, but getting a hundred or so developers to agree on a standard scheme isn't easy :-)


At any time I can open an EIS entry with status 'new' saying, 'I'm developing feature XY in git://any.server/kendy.git, branch myfeature01'.

One of the advantages of a DSCM is that it doesn't need net connectivity all the time. This is far worse ... you'll need a server which is up and serving so that others are able to take an early look at your CWS (or just so the tinderboxes and buildbots can build your CWS). Again, possible, but with a centralized SCM you get that basically for free and only one server needs to be up all the time.


When finished, the developer sets it to 'Ready for QA', and QA will set it to 'Approved by QA' when successfully QA'd. And now one of the release engineers (who are the only ones who can push to git://the.master.server/ooo.git) does:

- git clone git://the.master.server/ooo.git openoffice.org
- later the RE will probably still have the 'master' handy, and will just pull from other branches there
- cd openoffice.org
- git pull git://any.server/kendy.git myfeature01
  - it was written in the EIS
- git push git://the.master.server/ooo.git
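
The two scenarios above can be tried out end to end without any servers. What follows is only a minimal sketch under stated assumptions: two local bare repositories in a temporary directory stand in for git://the.master.server/ooo.git and git://any.server/kendy.git, the commit identity and file names are made up for illustration, and authentication (the open question in this thread) is not modelled at all.

```shell
#!/bin/sh
set -eu

# Hypothetical local stand-ins for the two git:// URLs in the mail.
work=$(mktemp -d)
master="$work/ooo.git"      # plays git://the.master.server/ooo.git
publish="$work/kendy.git"   # plays git://any.server/kendy.git

# Made-up identity so that commits work on a machine without git config.
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="dev@example.invalid"

# Seed the "master server" with one commit on a branch named master.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "initial" > README
git add README
git commit -q -m "Initial import"
git clone -q --bare "$work/seed" "$master"
git init -q --bare "$publish"

# Developer: clone, branch, commit locally, publish the branch.
git clone -q "$master" "$work/openoffice.org"
cd "$work/openoffice.org"
git branch myfeature01
git checkout -q myfeature01
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add myfeature01"
git push -q "$publish" myfeature01

# Release engineer: clone master, pull the published branch, push upstream.
git clone -q "$master" "$work/re"
cd "$work/re"
git pull -q --no-rebase "$publish" myfeature01
git push -q "$master" master

# Show what ended up on the master repository.
git --git-dir="$master" ls-tree --name-only master
```

In this sketch the pull is a plain fast-forward because nothing else has landed on master in the meantime; with several CWSes in flight the release engineer would get real merges instead.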

Looks like Git and Mercurial work pretty much the same way (even the commands are the same), so they will share, at least conceptually, the same advantages and disadvantages.


The best thing about this is that, from what I know, we could temporarily use both CVS and git at the same time (forget the release engineer's part ;-) - that would still be CVS) using git-cvsexportcommit (http://issaris.blogspot.com/2005/11/cvs-to-git-and-back.html), so there would be no outage due to 'we are moving to a new SCM'.

I've thought a bit about how to move to a new SCM system and came to the conclusion that the best way to do it is big bang style and be done with it in a month without sleep or so :-) I could be wrong, though.


But of course, all of this must be tested, I could be wrong, etc.

- we don't want to lose our history. I did a trial import in Mercurial
   and besides the fact that it was dog slow (I've heard Git is better in
   this respect) it couldn't cope with the existing branches and resulted
   in a huge blob (several GBs) as repository which wasn't nice to
   handle. I'll check what Git can do in this respect.

So far I've been testing the import on a small portion of the repository, and with a slightly modified git-cvsimport I'm getting beautiful merge history shown in gitk ;-)

- the SCM system is only a part of the development infrastructure. We
   use Collab.Net's CEE. Subversion is nicely integrated with CEE (no
   wonder, Collab.Net sponsored Subversion), thus we can use Subversion
   as a drop in replacement for CVS. Things would be infinitely harder
   with Git, admittedly not only for technical reasons.

The www pages would not be part of the git repository anyway. We can save ourselves a lot of pain and simply keep them in CVS - I don't think it would be a problem for anyone.

The good news is, things are now moving. See also Nils Fuhrmann's blog
(http://blogs.sun.com/GullFOSS/entry/is_subversion_ooo_s_next). I'm
pretty certain that we'll have a new SCM system in the future. It just
takes a bit of time to change course with a behemoth like OpenOffice.org
and a lot of planning.

:-) Yes. I hope we'll have more input once I'm able to show an imported tree so that people can test it/play with it. Of course, it can also show us that git is really not for us...

I'm looking forward to playing around with a git OpenOffice.org repository.

Heiner

--
Jens-Heiner Rechtien
[EMAIL PROTECTED]
