Hi Julien,

>> (a) svn copy NutchBase from GitHub to the nutchbase branch in
>> http://svn.apache.org/repos/asf/nutch/branches/nutchbase bringing the ASF
>> branch up to date.
>
> this seems like an unnecessary step. There has been an enormous amount of
> changes between the nutchbase branch and the version on GitHub - pretty much
> EVERY class has been modified + a lot of classes have been removed etc... the
> nutchbase branch on svn is completely obsolete so I suggest that simply get
> rid of the nutchbase branch and move the code to trunk directly (after doing c
> below of course)

That's the problem. If _every_ class has been modified, as you state, then it's been modified outside of Apache, so there is no SVN commit history and therefore no public log of the code that's been modified. We need to rectify that somehow...

>> (b) Once the GORA license issues are figured out (they must be
>> compatible with the ASF or we cannot use it), then we update Nutch to depend
>> on the GORA jars via Ivy?
>
> see comment above about the license.

Right, the license is one part of it; the jars are the other. Does GORA produce a jar file?

>> (c) svn tag current Nutch trunk as 1.2-branch
>> (d) svn merge nutchbase branch with nutch trunk
>
> see above

I'm not sure I get it. How does "see above" deal with (c)? As for merging the nutchbase branch with trunk, again, I'm a bit worried here. The three of you (Julien, Enis, and Doğacan) are the only ones who were significantly involved with the development of Nutchbase at GitHub (correct me if I'm wrong), yet there are 7 Nutch PMC members and committers (a group that does not include Enis). How do you expect us to maintain the code you bring over unless those of us who were not involved in the GitHub development have some history/notion of what's been done vis-à-vis GitHub and Apache SVN?

>
>
>> (e) roll the version # in nutch trunk to 2.0-dev
>> (f) all issues in JIRA should be updated to reflect 2.0-dev fixes where
>> it makes sense
>> (g) a 2.1 version is added to mark anything that we don't want in 2.0
>> and we file post 2.0 issues there
>> (h) Nutch 2.0 trunk is fixed, and brought up to speed and old code is
>> removed. All unit tests should pass regression where it makes sense.
>> (i) Nutch documentation is brought up to date on wiki and checked into
>> SVN
>> (j) We roll a 2.0 release
>
> OK with everything else

Great. I'll wait to hear back on my questions/comments above.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++