Re: Welcome to Rebase Hell!

Assaf Arkin Mon, 02 Mar 2009 09:45:12 -0800

On Mon, Mar 2, 2009 at 7:57 AM, Daniel Spiewak <[email protected]> wrote:

> Welcome back from lurk-mode!  :-)
>
> I think this is an interesting issue beyond just Git vs SVN and how the
> project is hosted.  As Assaf said, the Git repository is still synced to the
> SVN (and vice versa), so there isn't any real side-stepping going on.  In
> fact, committers could use SVN directly if they so choose (without any
> negative impact on an SVN-based workflow), it's merely personal taste which
> drives everyone to Git.  The interesting point though is that while in
> incubation, Vic's GitHub repo really became the "unofficially canonical"
> one.  While Buildr's site did point everyone to the SVN *first*, the culture
> was such that Git was really the convention/standard across the board.
> That's not to say that SVN was discouraged in any way, it just wasn't used
> (except as the main store-point once commits were made).
>
> The larger issue here is what does it mean to be the "primary" source when
> all sources give the same artifacts?  Is it the repository officially
> recommended by the project?  Is it the repository decided by the "wisdom of
> the masses" in the development team?  As solutions like Git, Mercurial and
> Bazaar catch on, I think we're going to see more and more projects raising
> this issue: what does it mean to have multiple "canonical" sources and when
> does it really cause problems?

Apache has two interesting principles.

The first, you need a "licensed and reviewed contributions" repository: the
repository blessed by Apache, so to speak, from which you cut releases.
Years of accumulated wisdom insist on having this repository, and on
running it on Apache hardware (meaning, guarded and monitored by Apache).

The second, all development has to be done in the open, so that everyone
can participate. It's the open development part of open source. (The context
here is the project and its community, not anything you do downstream.) Once
code has been brought to our attention -- JIRA patch, mailing list
discussions, etc -- we continue working on it in a public forum with as much
visibility as possible.

In the SVN model this job is handled by the master repository, but what
would we do if there was only Git?

You will still have one "licensed and reviewed contributions" Git
repository, and it will still be hosted on Apache hardware, and it will
still be writable only by committers. Again, the same accumulated wisdom
will take us there. The difference is, you can also have any number of
perfectly synchronized clones, so development can happen elsewhere.

Now we get into the open development question. If development can happen
anywhere, how do I keep track of all the places where it happens? And where
do I find one place where it happens, so I can start tracking it? You need
to have at least one place that everyone can point to, a common ground, and
it needs to have certain guarantees: be an accurate clone of the
contribution repo, get synchronized quickly enough, not be the weakest link
(security wise), etc.

It doesn't have to be a single place, but having too many places could be a
problem. Where do you start? Are they all as well maintained? Will they last
long enough to be permalinks?

The second question is, once code has been brought to our attention, what
places do we have to accommodate it before it ends up in the contribution
repo? Our focus here is for everyone to be able to follow
changes to that code, say as a result of discussion on the mailing list, and
for committers to be able to pull it in as a contribution. This should be at
least as good as what we have right now; for the record, what we have right
now are JIRA patches.

In my experience, you can have as many canonical sources as you want, but
since as a project you're responsible for their quality and reliability, it
actually works better to have as few as possible. One must be the "licensed
and reviewed contributions" repo, which Apache takes care of.

For people who want to track development, view the source code, branch off,
or start contributing, which repository would you point them to? Let's call
this one "the town hall repo", the one place we get to socialize around
code.

I absolutely agree that this should be a decision for the individual
project. It doesn't have to run on Apache infrastructure: as long as it
follows certain guidelines (public access, restricted writes, timely
updates, etc), it just has to work very well for that project. It therefore
can't be the official Apache repository, because the ASF can't govern other
people's infrastructure. But if the project agrees to supervise it for the
purposes I outlined above, why can't it be the project's primary public
repository?

To put it in context, all mailing list discussions have to pass through
Apache's servers, and Apache maintains the contributing archive. If in
doubt, look there. But projects can tell people to search in other archives,
like markmail or nabble. Can one of these off-site archives be the primary
point of reference for the project? If so, why can't we do the same for
source control?

Assaf

> Daniel
>
> On Mon, Mar 2, 2009 at 3:07 AM, Martijn Dashorst <[email protected]> wrote:
>
> > On Mon, Mar 2, 2009 at 1:34 AM, Assaf Arkin <[email protected]> wrote:
> > > I'm with you in using Github as the main repository:
> >
> > As an ASF Member I must protest against the direction that you are
> > taking this project. GitHub cannot be used as the main repository for
> > any ASF Project. The canonical resource for an Apache project's code must
> > be hosted on Apache hardware. Since the only repository that is
> > supported by Infrastructure is SVN, you'll have to maintain the
> > primary source for your project *in* SVN. Not somewhere else, not
> > bypassing ASF authorizations, not bypassing Apache policy.
> >
> > Martijn
> >
>
