One more thing, perhaps of importance: the raw Lucene repo contains
the full history of projects that later became top-level projects
(Nutch, Mahout). These could also be dropped (or ignored) when
converting to git. If we agree JARs are not relevant, why should
projects not directly related to Lucene/Solr be?

Dawid

On Tue, Dec 8, 2015 at 10:05 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
>> Don’t know how much we have of historic jars in our history.
>
> I actually do know. Or will know. In about 10 hours. I wrote a script
> that does the following:
>
> 1) svn log all revisions touching https://svn.apache.org/repos/asf/lucene
> 2) grep revision numbers
> 3) use svnrdump to get every single commit (revision) above, in
> incremental mode.
>
> This will allow me to:
>
> 1) recreate only the Lucene/Solr SVN repo, locally.
> 2) measure the size of the SVN repo.
> 3) measure the size of any conversion to git (even if it's a one-by-one
> checkout, then sync with git).
>
> From what I've seen so far, size should not be an issue at all. Even
> with all binary blobs, the SVN incremental dumps so far measure ~3.7G
> (and I'm about 75% done). There is one interesting super-large commit,
> this one:
>
> svn log -r1240618 https://svn.apache.org/repos/asf/lucene
> ------------------------------------------------------------------------
> r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line
>
> LUCENE-2748: bring in old Lucene docs
>
> This commit diff weighs... wait for it... 1.3G! I didn't check what
> it actually was.
>
> Will keep you posted.
>
> D.
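
For anyone who wants to reproduce the three-step dump described in the quoted message, here is a rough sketch of how such a script might look. It is not the actual script; it assumes the revision list comes from `svn log -q` and that each revision is fetched with `svnrdump dump ... --incremental`. The function names (`parse_revisions`, `dump_command`, `fetch_all`) and the one-file-per-revision naming are made up for illustration.

```python
import re
import subprocess

# Repository URL taken from the message above.
REPO_URL = "https://svn.apache.org/repos/asf/lucene"


def parse_revisions(log_text):
    """Extract revision numbers from `svn log` output, oldest first."""
    # Log header lines look like: "r1240618 | gsingers | 2012-02-04 ..."
    found = re.findall(r"^r(\d+) \|", log_text, flags=re.MULTILINE)
    return sorted({int(rev) for rev in found})


def dump_command(url, rev):
    """Build the svnrdump invocation for one revision in incremental mode."""
    return ["svnrdump", "dump", url, "-r", str(rev), "--incremental"]


def fetch_all(url, out_dir):
    """Hypothetical driver: list the revisions, then dump each to its own file.

    Needs the svn and svnrdump binaries plus network access.
    """
    log_text = subprocess.run(
        ["svn", "log", "-q", url],
        check=True, capture_output=True, text=True,
    ).stdout
    for rev in parse_revisions(log_text):
        # One dump file per revision (naming scheme is an assumption).
        with open(f"{out_dir}/r{rev}.dump", "wb") as out:
            subprocess.run(dump_command(url, rev), check=True, stdout=out)
```

Calling `fetch_all(REPO_URL, "dumps")` would then leave one incremental dump per revision in an existing `dumps` directory; summing the file sizes gives the kind of figure quoted above.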
