Grant's record 1.3 GB commit added HTML files with Javadocs to the CMS... probably not that relevant either.
svn log -v -r1240618 https://svn.apache.org/repos/asf/lucene

It's fun exploring, actually... I bet with a few proper exclusions one
can get down to a manageable size. As always with conversions between
version management systems, the question remains how to map
tags/branches to their corresponding git concepts, etc.

D.

On Tue, Dec 8, 2015 at 10:16 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
> One more thing, perhaps of importance, the raw Lucene repo contains
> all the history of projects that then turned top-level (Nutch,
> Mahout). These could also be dropped (or ignored) when converting to
> git. If we agree JARs are not relevant, why should projects not
> directly related to Lucene/Solr be?
>
> Dawid
>
> On Tue, Dec 8, 2015 at 10:05 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
>>> Don’t know how much we have of historic jars in our history.
>>
>> I actually do know. Or will know. In about ~10 hours. I wrote a script
>> that does the following:
>>
>> 1) git log all revisions touching https://svn.apache.org/repos/asf/lucene
>> 2) grep revision numbers
>> 3) use svnrdump to get every single commit (revision) above, in
>> incremental mode.
>>
>> This will allow me to:
>>
>> 1) recreate only Lucene/Solr SVN, locally.
>> 2) measure the size of SVN repo.
>> 3) measure the size of any conversion to git (even if it's one-by-one
>> checkout, then-sync with git).
>>
>> From what I see up until now size should not be an issue at all. Even
>> with all binary blobs so far the SVN incremental dumps measure ~3.7G
>> (and I'm about 75% done). There is one interesting super-large commit,
>> this one:
>>
>> svn log -r1240618 https://svn.apache.org/repos/asf/lucene
>> ------------------------------------------------------------------------
>> r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line
>>
>> LUCENE-2748: bring in old Lucene docs
>>
>> This commit diff weighs... wait for it... 1.3G! I didn't check what
>> it actually was.
>>
>> Will keep you posted.
>>
>> D.
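
For reference, the three-step script described in the quoted message
(log the revisions, grep the revision numbers, svnrdump each one in
incremental mode) could be sketched roughly as follows. This is only a
sketch under assumptions, not the script that was actually run: step 1
is read here as `svn log`, and the function names extract_revisions and
dump_revisions as well as the one-file-per-revision output layout are
made up for illustration.

```shell
# Hypothetical helper: read `svn log` output on stdin and print the
# revision numbers, one per line, in ascending order.
# Header lines look like: r1240618 | gsingers | 2012-02-04 22:45:17 ...
extract_revisions() {
  grep -E '^r[0-9]+ \|' | sed -E 's/^r([0-9]+) .*/\1/' | sort -n
}

# Hypothetical driver (assumed layout: one dump file per revision).
# $1 = repository URL, $2 = output directory.
dump_revisions() {
  repo_url=$1
  out_dir=$2
  mkdir -p "$out_dir"
  # -q keeps only the header lines, so commit messages cannot be
  # mistaken for revision headers.
  svn log -q "$repo_url" | extract_revisions | while read -r rev; do
    svnrdump dump --incremental -r "$rev" "$repo_url" > "$out_dir/r$rev.dump"
  done
}
```

Something like `dump_revisions https://svn.apache.org/repos/asf/lucene dumps`
followed by `du -sh dumps` would then give the total size of the
incremental dumps.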