I don't think you will get a volunteer until someone sums up the discussion
with a proposal that someone is not going to veto or something. We can't
expect everyone to read the same tea leaves and come to the same
conclusion.

Perhaps a stripped down mirror is the consensus. I'd rather we had some
agreement on what we were going to do though, rather than an agreement to
investigate. If we think stripping down is technically feasible, and no
one is going to violently disagree still, then let's decide to do that.

- Mark



On Tue, Dec 15, 2015 at 11:39 AM Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> I thought the general consensus at minimum was to investigate a git mirror
> that stripped some artifacts out (jars etc) to lighten up the work of the
> process. If at some point the project switched to git, such a mirror might
> be a suitable git repo for the project with archived older versions in SVN.
>
> I think probably what is lacking is a volunteer to figure it all out.
>
>
> -Doug
>
> On Tue, Dec 15, 2015 at 11:32 AM, Mark Miller <markrmil...@gmail.com>
> wrote:
>
>> Anyone willing to lead this discussion to some kind of better resolution?
>> Did that whole back and forth help with any ideas on the best path forward?
>> I know it's a complicated issue, git / svn, the light side, the dark side,
>> but doesn't GitHub also depend on this mirroring? It's going to be super
>> annoying when I can no longer pull from a relatively up to date git remote.
>>
>> Who has boiled down the correct path?
>>
>> - Mark
>>
>> On Wed, Dec 9, 2015 at 6:07 AM Dawid Weiss <dawid.we...@gmail.com> wrote:
>>
>>> FYI.
>>>
>>> - All of Lucene's SVN, incremental deltas, uncompressed: 5.0G
>>> - the above, tar.bz2: 1.2G
>>>
>>> Sadly, I didn't succeed at recreating a local SVN repo from those
>>> incremental dumps. svnadmin load fails with a cryptic error related to
>>> the fact that revision number of node-copy operations refer to
>>> original SVN numbers and they're apparently renumbered on import.
>>> svnadmin isn't smart enough to somehow keep a reference of those
>>> original numbers and svndumpfilter can't work with incremental dump
>>> files... A seemingly trivial task of splitting a repo on a clean
>>> boundary seems incredibly hard with SVN...
>>>
>>> If anybody wishes to play with the dump files, here they are:
>>> http://goo.gl/m6q3J8
>>>
>>> Dawid
>>>
>>> On Tue, Dec 8, 2015 at 10:49 PM, Upayavira <u...@odoko.co.uk> wrote:
>>> > You can't avoid having the history in SVN. The ASF has one large repo,
>>> and
>>> > won't be deleting that repo, so the history will survive in perpetuity,
>>> > regardless of what we do now.
>>> >
>>> > Upayavira
>>> >
>>> > On Tue, Dec 8, 2015, at 09:24 PM, Doug Turnbull wrote:
>>> >
>>> > It seems you'd want to preserve that history in a frozen/archived
>>> Apache Svn
>>> > repo for Lucene. Then make the new git repo slimmer before switching.
>>> Folks
>>> > that want very old versions or doing research can at least go through
>>> the
>>> > original SVN repo.
>>> >
>>> > On Tuesday, December 8, 2015, Dawid Weiss <dawid.we...@gmail.com>
>>> wrote:
>>> >
>>> > One more thing, perhaps of importance, the raw Lucene repo contains
>>> > all the history of projects that then turned top-level (Nutch,
>>> > Mahout). These could also be dropped (or ignored) when converting to
>>> > git. If we agree JARs are not relevant, why should projects not
>>> > directly related to Lucene/ Solr be?
>>> >
>>> > Dawid
>>> >
>>> > On Tue, Dec 8, 2015 at 10:05 PM, Dawid Weiss <dawid.we...@gmail.com>
>>> wrote:
>>> >>> Don’t know how much we have of historic jars in our history.
>>> >>
>>> >> I actually do know. Or will know. In about ~10 hours. I wrote a script
>>> >> that does the following:
>>> >>
>>> >> 1) git log all revisions touching
>>> https://svn.apache.org/repos/asf/lucene
>>> >> 2) grep revision numbers
>>> >> 3) use svnrdump to get every single commit (revision) above, in
>>> >> incremental mode.
>>> >>
>>> >> This will allow me to:
>>> >>
>>> >> 1) recreate only Lucene/ Solr SVN, locally.
>>> >> 2) measure the size of SVN repo.
>>> >> 3) measure the size of any conversion to git (even if it's one-by-one
>>> >> checkout, then-sync with git).
>>> >>
>>> >> From what I see up until now size should not be an issue at all. Even
>>> >> with all binary blobs so far the SVN incremental dumps measure ~3.7G
>>> >> (and I'm about 75% done). There is one interesting super-large commit,
>>> >> this one:
>>> >>
>>> >> svn log -r1240618 https://svn.apache.org/repos/asf/lucene
>>> >>
>>> ------------------------------------------------------------------------
>>> >> r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1
>>> >> line
>>> >>
>>> >> LUCENE-2748: bring in old Lucene docs
>>> >>
>>> >> This commit diff weighs... wait for it... 1.3G! I didn't check what
>>> >> it actually was.
>>> >>
>>> >> Will keep you posted.
>>> >>
>>> >> D.
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Doug Turnbull | Search Relevance Consultant | OpenSource Connections,
>>> LLC |
>>> > 240.476.9983
>>> > Author:Relevant Search
>>> > This e-mail and all contents, including attachments, is considered to
>>> be
>>> > Company Confidential unless explicitly stated otherwise, regardless of
>>> > whether attachments are marked as such.
>>> >
>>> >
>>>
>>>
>>> --
>> - Mark
>> about.me/markrmiller
>>
>
>
>
> --
> *Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections
> <http://opensourceconnections.com>, LLC | 240.476.9983
> Author: Relevant Search <http://manning.com/turnbull>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
-- 
- Mark
about.me/markrmiller
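
For anyone who wants to reproduce the dump step Dawid describes further down the thread (list the revisions touching the Lucene path, pull out the revision numbers, then fetch each one with svnrdump in incremental mode), here is a minimal sketch in Python. It is not his actual script. It assumes the revision list comes from `svn log -q` and that each revision is written to its own file; the function names and the one-file-per-revision layout are made up for illustration.

```python
import re
import subprocess
from pathlib import Path

REPO_URL = "https://svn.apache.org/repos/asf/lucene"  # URL quoted in the thread

# Matches the header line of an `svn log` entry, e.g. "r1240618 | gsingers | ..."
REVISION_LINE = re.compile(r"^r(\d+) \|", re.MULTILINE)


def parse_revisions(svn_log_output: str) -> list[int]:
    """Return the revision numbers found in `svn log` output, oldest first."""
    return sorted({int(rev) for rev in REVISION_LINE.findall(svn_log_output)})


def dump_command(url: str, revision: int) -> list[str]:
    """Build the svnrdump call that dumps one revision in incremental mode."""
    return ["svnrdump", "dump", url, "-r", str(revision), "--incremental"]


def dump_all(url: str, out_dir: Path) -> None:
    """List revisions, extract their numbers, and dump each one to a file.

    Assumption: one dump file per revision, named r<revision>.dump.
    """
    log = subprocess.run(
        ["svn", "log", "-q", url], check=True, capture_output=True, text=True
    ).stdout
    out_dir.mkdir(parents=True, exist_ok=True)
    for rev in parse_revisions(log):
        # svnrdump writes the dump stream to stdout, so redirect it to a file.
        with open(out_dir / f"r{rev}.dump", "wb") as fh:
            subprocess.run(dump_command(url, rev), check=True, stdout=fh)
```

Calling `dump_all(REPO_URL, Path("lucene-dumps"))` would fetch every revision in turn (the directory name is hypothetical). Expect it to run for hours against the full ASF repository, and note that it does nothing about the `svnadmin load` renumbering problem reported above.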
