On 11-03-24 06:12 PM, Aryeh Gregor wrote:
> On Tue, Mar 22, 2011 at 10:46 PM, Tim Starling<tstarl...@wikimedia.org>  
> wrote:
>> If we split up the extensions directory, each extension having its own
>> repository, then this will discourage developers from updating the
>> extensions in bulk. This affects both interface changes and general
>> code maintenance. I'm sure translatewiki.net can set up a script to do
>> the necessary 400 commits per day, but I'm not sure if every developer
>> who wants to fix unused variables or change a core/extension interface
>> will want to do the same.
> I've thought about this a bit.  We want bulk code changes to
> extensions to be easy, but it would also be nice if it were easier to
> host extensions "officially" to get translations, distribution, and
> help from established developers.  We also don't want anyone to have
> to check out all extensions just to get at trunk.  Localization, on
> the other hand, is entirely separate from development, and has very
> different needs -- it doesn't need code review, and someone looking at
> the revision history for the whole repository doesn't want to see
> localization updates.  (Especially in extensions, where often you have
> to scroll through pages of l10n updates to get to the code changes.)
>
> Unfortunately, git's submodule feature is pretty crippled.  It
> basically works like SVN externals, as I understand it: the larger
> repository just has markers saying where the submodules are, but their
> actual history is entirely separate.  We could probably write a script
> to commit changes to all extensions at once, but it's certainly a less
> ideal solution.
git's submodule feature is something like svn externals, but with one big 
fundamental difference.
svn externals track only a repo, so when you update you get the latest 
version of that repo.
git submodules track a repo and a commit id, always. So when you update 
you always get the same commit id. Changing that commit id requires 
making a commit to the parent git repo to update it. You can also check 
out an old commit, and submodule update will check out the commit id of 
the submodule that was committed at that point in time.
But yes, for both of them it's merely references, they do not store the 
actual history. They're glorified helper scripts essentially, they don't 
alleviate the task of downloading each repo separately. They just make 
the vcs do it for you, instead of you running a script in some other 
language to do it for you.
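
To make the difference concrete, here is a toy model in Python. It is 
not how either tool is implemented; Repo, update_external and 
update_submodule are made-up names for illustration, and the external 
is assumed to be an unpinned one.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """A toy repository: just an ordered list of commit ids."""
    commits: list = field(default_factory=list)

    def commit(self, commit_id):
        self.commits.append(commit_id)

    def head(self):
        return self.commits[-1]


def update_external(upstream):
    """Unpinned svn external: follow whatever is newest upstream."""
    return upstream.head()


def update_submodule(pinned_commit, upstream):
    """git submodule: check out the recorded commit, ignoring newer ones."""
    if pinned_commit not in upstream.commits:
        raise LookupError(f"commit {pinned_commit} not found upstream")
    return pinned_commit


extension = Repo()
extension.commit("a1")
pinned = extension.head()   # the parent repo records "a1" in a commit
extension.commit("b2")      # the extension moves on afterwards

print(update_external(extension))           # newest upstream commit
print(update_submodule(pinned, extension))  # still the recorded commit
```

The external follows the extension to its newest commit, while the 
submodule stays on the recorded id until someone commits a new one to 
the parent repo.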

In my honest opinion, submodules were not designed for what we are trying 
to shove into them. And given that one of their key features (tracking a 
specific commit id to ensure the same version is always checked out) is 
actually the opposite of what we want, I believe the actual 
functionality of git submodules in this situation is no better than what 
we could build ourselves with a few simple custom scripts. In fact I 
believe we could build something better for our purposes without too 
much effort. And we could check it into a git repo in place of the repo 
that submodules would be put in. If you dig through the git discussions 
I believe I listed a number of features we could add that would make it 
even more useful. Instead of a second repo, we could just put the tool 
itself inside mw's repo so that by checking out phase3 you get the tools 
needed to work with extensions.
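
As a rough idea of how small such a tool could start out, here is a 
minimal sketch in Python. It assumes every extension is its own git 
checkout under one directory; find_extension_repos and run_in_each are 
hypothetical names, not an existing tool.

```python
import subprocess
from pathlib import Path


def find_extension_repos(extensions_dir):
    """Return the sorted subdirectories that look like git checkouts."""
    root = Path(extensions_dir)
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and (p / ".git").exists()
    )


def run_in_each(extensions_dir, git_args, dry_run=True):
    """Run `git <git_args>` in every extension checkout.

    Returns the list of (directory, command) pairs. With dry_run=True
    nothing is executed, so the plan can be inspected first.
    """
    planned = []
    for repo in find_extension_repos(extensions_dir):
        command = ["git", *git_args]
        planned.append((repo, command))
        if not dry_run:
            # cwd= runs the command inside the checkout without changing
            # the directory of the calling process.
            subprocess.run(command, cwd=repo, check=True)
    return planned
```

Something like run_in_each("extensions", ["pull", "--ff-only"], 
dry_run=False) would then update every checkout in one go. Because of 
check=True it stops at the first extension where git reports an error.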

> If we moved to git, I'd tentatively say something like
>
> * Separate out the version control of localization entirely.
> Translations are already coordinated centrally on translatewiki.net,
> where the wiki itself maintains all the actual history and
> permissions, so the SVN checkin right now is really a needless
> formality that keeps translations less up-to-date and spams revision
> logs.  Keep the English messages with the code in git, and have the
> other messages available for checkout in a different format via our
> own script.  This checkout should always grab the latest
> translatewiki.net messages, without the need for periodic commits.  (I
> assume translatewiki.net already does automatic syntax checks and so
> on.)  Of course, the tarballs would package all languages.
+1
> * Keep the core code in one repository, each extension in a separate
> repository, and have an additional repository with all of them as
> submodules.  Or maybe have extensions all be submodules of core (you
> can check out only a subset of submodules if you want).
> * Developers who want to make mass changes to extensions are probably
> already doing them by script (at least I always do), so something like
> "for EXTENSION in extensions/*; do cd $EXTENSION; git commit -a -m
> 'Boilerplate message'; cd ..; done" shouldn't be an exceptional
> burden.  If it comes up often enough, we can write a script to help
> out.

> * We should take the opportunity to liberalize our policies for
> extension hosting.  Anyone should be able to add an extension, and get
> commit access only to that extension.  MediaWiki developers would get
> commit access to all hosted extensions, and hooking into our
> localization system should be as simple as making sure you have a
> properly-formatted ExtensionName.i18n.php file.  If any human
> involvement is needed, it should only be basic sanity checks.
I LOVE this idea too, it's been on my mind for a while.

Brion mentioned that there is some prior art in git farming. Gitorious' 
codebase is open source. Wikimedia could host a copy of it for the 
purposes of hosting git repos for MediaWiki and extensions. It has 
built-in management of:
* pubkeys
* projects and project repos (say, MediaWiki and extensions as 
projects, and some groups of extensions like SMW could be put in one 
project)
* teams (put core devs in a team and give them access to the trunk-like 
MediaWiki core repo; we can also add teams like smw-devs that let us 
open up groups of extensions to groups of people collaborating on them)
* team clones (make wmf a team and make the wmf branch a clone of the 
MediaWiki repo for access control)
* personal clones (so users without access to core can still make a 
clone, keep it in a place tied with potential code review, and 
participate by sending merge requests back to core so devs can pick 
them up and put them in; is this a form of pre-commit review?)
* and of course the code for letting someone sign up, not have commit 
access to everything, but create their own project repo for an 
extension and start committing to it.

Oh, as a little bonus: theoretically we may be able to make some 
moderate tweaks to Gitorious and build in a simple API that'll list all 
extensions, as tagged. You can already get something close by using .xml 
on the project view (since it's a Rails app). Using that data we could 
easily build a tool that would clone all extensions, and from there let 
you batch commit/push/checkout/branch/update remotes/etc. And we could 
easily build it to take account of labeling, meaning you could 
potentially check out all extensions in TWN, or all extensions tagged as 
SMW, or all extensions tagged as 'UsedOnWMF'. Naturally of course it 
would be trivial to make it check out the repo for an extension by name.
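
A sketch of the tag filtering part in Python, to show how little it 
would take. The XML layout below is invented for illustration (I have 
not checked what Gitorious actually emits), the git.example.org URLs 
are placeholders, and repos_with_tag and clone_commands are 
hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical listing; the real Gitorious XML schema is NOT assumed here.
SAMPLE = """\
<repositories>
  <repository>
    <name>SemanticMediaWiki</name>
    <clone_url>git://git.example.org/extensions/SemanticMediaWiki.git</clone_url>
    <tags><tag>SMW</tag><tag>TWN</tag></tags>
  </repository>
  <repository>
    <name>SemanticForms</name>
    <clone_url>git://git.example.org/extensions/SemanticForms.git</clone_url>
    <tags><tag>SMW</tag></tags>
  </repository>
  <repository>
    <name>CheckUser</name>
    <clone_url>git://git.example.org/extensions/CheckUser.git</clone_url>
    <tags><tag>TWN</tag><tag>UsedOnWMF</tag></tags>
  </repository>
</repositories>
"""


def repos_with_tag(xml_text, tag):
    """Map extension name -> clone URL for repositories carrying `tag`."""
    root = ET.fromstring(xml_text)
    matches = {}
    for repo in root.iter("repository"):
        tags = {t.text for t in repo.iter("tag")}
        if tag in tags:
            matches[repo.findtext("name")] = repo.findtext("clone_url")
    return matches


def clone_commands(xml_text, tag, dest="extensions"):
    """Build (but do not run) one `git clone` command per tagged extension."""
    return [
        ["git", "clone", url, f"{dest}/{name}"]
        for name, url in sorted(repos_with_tag(xml_text, tag).items())
    ]


for command in clone_commands(SAMPLE, "SMW"):
    print(" ".join(command))
```

The commands are only built and printed here; a real tool would hand 
each one to subprocess.run and fetch the listing over HTTP instead of 
using an inline sample.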

I'd love git being first class and Wikimedia hosted. I'd probably take 
monaco-port (which is on GitHub right now) and make the repo on 
Wikimedia the primary repo.
> * Code review should migrate to an off-the-shelf tool like Gerrit.  I
> don't think it's a good idea at all for us to reinvent the code-review
> wheel.  To date we've done it poorly.
>
> This is all assuming that we retain our current basic development
> model, namely commit-then-review with a centrally-controlled group of
> people with commit access.  One step at a time.
A mixed format might be possible too, where the bulk of developers can 
commit to one repo, but we have a second repo for post-review code which 
is considered to be a more stable trunk. And naturally, whatever we do, 
we can make it easier for non-devs to submit code by publishing their 
own public clone.
> On Wed, Mar 23, 2011 at 2:51 PM, Diederik van Liere<dvanli...@gmail.com>  
> wrote:
>> The Python Community recently switched to a DVCS and they have
>> documented their choice.
>> It compares Git, Mercurial and Bzr and shows the pluses and minuses of
>> each. In the end, they went for Mercurial.
>>
>> Choosing a distributed VCS for the Python project:
>> http://www.python.org/dev/peps/pep-0374/
> They gave three reasons:
>
> 1) git's Windows support isn't as good as Mercurial's.  I don't know
> how much merit that has these days, so it bears investigation.  I have
> the impression that the majority of MediaWiki developers use
> non-Windows platforms for development, so as long as it works well
> enough, I don't know if this should be a big deal.
For CLI there's msysgit. For GUI there are TortoiseGit and gitextensions.
I hear comments that TortoiseGit lacks some of git's features, namely 
interaction with the index. However, it's supposed to feel fairly similar 
to TortoiseSVN (which, if we have svn Windows users using a GUI, I expect 
they're probably using, so that might be helpful). gitextensions also 
looks fairly interesting, but I'm not a Windows user anymore so I haven't 
looked at it in depth:
http://sourceforge.net/projects/gitextensions/

That PEP was from a year ago, so git's Windows support can only have 
gotten better.
> 2) Python developers preferred Mercurial when surveyed.  Informally,
> I'm pretty certain that most MediaWiki developers with a preference
> prefer git.
Thanks in part to GitHub, git is definitely, as someone else mentioned, 
the 'flavor of the week', though to be fair I believe svn was similar 
in that respect in its day. I do believe that we are likely to find a 
lot more MW devs who are comfortable with git than with other DVCSes.
> 3) Mercurial is written in Python, and Python developers want to use
> stuff written in Python.  Not really relevant to us, even those of us
> who like Python a lot.  :)  (FWIW, despite being a big Python fan, I'm
> a bit perturbed that Mercurial often prints out a Python stack trace
> when it dies instead of a proper error message . . .)
>
> GNOME also surveyed available options, and they decided to go with
> git:<http://blogs.gnome.org/newren/2009/01/03/gnome-dvcs-survey-results/>
>   Although of course, (1) would be a bit of a nonissue for them.
-- 
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]


_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
