> Thanks for the FYI guys.
>
> I've got this on my open source radar, along with
> reviewing the Airavata release (incubating), and
> the MRUnit release (incubating) for this week.
>
> I'll git er' done. Also, since the release updates for rc #2
> were largely aesthetic (aka packaging and naming
> of the output folder), I might not even have to create a
> new code branch or entry in repository.apache.org
> for the Maven artifacts. Yay!

Indeed, otherwise just export that one revision if you must. Thanks and looking forward :) cheers

>
> Should be next day or two for rc #2 spin up. Also I pointed
> Lewis at the OODT release guide (which is basically my
> generic Apache release guide for most Java projects),
> and he has updated the release wiki for Nutch to be
> based off of this.
>
> Cheers,
> Chris
>
> On Nov 16, 2011, at 9:41 AM, Markus Jelsma wrote:
> >> Chris,
> >>
> >> Any idea of when you'll be able to push a new RC for 1.4?
> >> Note: I think some stuff marked as 1.5 has been committed - we might
> >> need to check the CHANGES
> >
> > Definitely, I've committed several items. When I did my first, trunk was
> > already prepared for 1.5.
> >
> > Here's the list of changes since 1.4; please note that CHANGES also
> > already contained the release note and date.
> >
> > This is the first rev. for 1.5: 1200344 (NUTCH-1153)
> > This is the last rev. for 1.4: 1197319 (NUTCH-1195)
> >
> > If this has caused any inconvenience then I apologize.
> >
> > Thanks
> >
> > * NUTCH-1090 InvertLinks should inform when ignoring internal links
> > (Marek Backmann via markus)
> >
> > * NUTCH-1174 Outlinks are not properly normalized (markus)
> >
> > * NUTCH-1203 ParseSegment to show number of milliseconds per parse
> > (markus)
> >
> > * NUTCH-1185 Decrease solr.commit.size to 250 (markus)
> >
> > * NUTCH-1180 UpdateDB to backup previous CrawlDB (markus)
> >
> > * NUTCH-1173 DomainStats doesn't count db_not_modified (markus)
> >
> > * NUTCH-1155 Host/domain limit in generator is generate.max.count+1
> > (markus)
> >
> > * NUTCH-1061 Migrate MoreIndexingFilter from Apache ORO to
> > java.util.regex (markus)
> >
> > * NUTCH-1178 Incorrect CSV header CrawlDatumCsvOutputFormat (markus)
> >
> > * NUTCH-1142 Normalization and filtering in WebGraph (markus)
> >
> > * NUTCH-1153 LinkRank not to log all keys and not to write Hadoop
> > _SUCCESS file (markus)
> >
> >> Thanks
> >>
> >> Julien
> >>
> >> On 9 November 2011 10:21, Mattmann, Chris A (388J) <chris.a.mattm...@jpl.nasa.gov> wrote:
> >>> Hi Julien,
> >>>
> >>> Thanks. OK, so I will respin an RC for 1.4 that
> >>> fixes the naming screw up. I already created the KEYS file
> >>> so we're fine there.
> >>>
> >>> Hopefully will get it done this week while at ApacheCon NA.
> >>> BTW, had a great time meeting Lewis in person today, nice
> >>> to meet you dude!
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>> On Nov 8, 2011, at 3:27 AM, Julien Nioche wrote:
> >>>> Hi Chris
> >>>>
> >>>> Thanks for the review. Would you consider the below blockers, or
> >>>> would-be-nice-to-fix? If none are blockers I propose fixing them in
> >>>> 1.5 and pushing 1.4. Thoughts?
> >>>>
> >>>> see below
> >>>>
> >>>> I agree on the naming, sorry for the screw-up.
> >>>>
> >>>> no probs. Do you think this could be fixed for 1.4?
> >>>>
> >>>> The KEYS file isn't really needed,
> >>>> since we just maintain a global keys file at
> >>>> http://www.apache.org/dist/nutch/KEYS.
> >>>>
> >>>> 1.4? would need to modify build.xml
> >>>>
> >>>> Odd on the bin version containing the pom.xml file -- wonder why it's
> >>>> not part of the src -- I just did an SVN export?
> >>>>
> >>>> strange indeed.
> >>>>
> >>>> About the runtime/local thing, I think we can do that for 1.5, but I
> >>>> am totally +1 for it.
> >>>>
> >>>> OK for 1.5
> >>>>
> >>>> Thanks a lot
> >>>>
> >>>> Julien
> >>>>
> >>>> Let me know what you think. Thanks!
> >>>>
> >>>> Cheers,
> >>>> Chris
> >>>>
> >>>> On Nov 7, 2011, at 7:59 AM, Julien Nioche wrote:
> >>>>> Thanks Chris,
> >>>>>
> >>>>> * it would be good to have the same folder name for the src and bin
> >>>>> versions. They are currently 'nutch-1.4' and 'apache-nutch-1.4'
> >>>>> * do we really need to include the KEYS file?
> >>>>> * bin version contains pom.xml, src version does not. Either include
> >>>>> in both or remove altogether
> >>>>> * What about having the content of 'runtime/local' as a ready-to-use
> >>>>> 'bin' distrib instead? Doesn't make sense to have runtime/deploy as the
> >>>>> content of the job file (e.g. nutch-site.xml) would have to be
> >>>>> generated from the source anyway.
> >>>>>
> >>>>> Julien
> >>>>>
> >>>>> On 5 November 2011 01:03, Mattmann, Chris A (388J) <chris.a.mattm...@jpl.nasa.gov> wrote:
> >>>>>> Hi Folks,
> >>>>>>
> >>>>>> A candidate for the Nutch 1.4 release is available at:
> >>>>>> http://people.apache.org/~mattmann/apache-nutch-1.4/rc1/
> >>>>>>
> >>>>>> The release candidate is a zip and tar.gz archive of the sources in:
> >>>>>> http://svn.apache.org/repos/asf/nutch/tags/release-1.4/
> >>>>>>
> >>>>>> And a binary build suitable for deployment.
> >>>>>>
> >>>>>> A staged Maven repository is available here:
> >>>>>> https://repository.apache.org/content/repositories/orgapachenutch-161/
> >>>>>>
> >>>>>> Please vote on releasing this package as Apache Nutch 1.4.
> >>>>>> The vote is open for the next 72 hours and passes if a majority of
> >>>>>> at least three +1 Nutch PMC votes are cast.
> >>>>>>
> >>>>>> [ ] +1 Release this package as Apache Nutch 1.4
> >>>>>> [ ] -1 Do not release this package because...
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Chris
> >>>>>>
> >>>>>> P.S. Here's my +1.
> >>>>>>
> >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>> Chris Mattmann, Ph.D.
> >>>>>> Senior Computer Scientist
> >>>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>>>>> Office: 171-266B, Mailstop: 171-246
> >>>>>> Email: chris.a.mattm...@nasa.gov
> >>>>>> WWW: http://sunset.usc.edu/~mattmann/
> >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>> Adjunct Assistant Professor, Computer Science Department
> >>>>>> University of Southern California, Los Angeles, CA 90089 USA
> >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>
> >>>>> --
> >>>>> Open Source Solutions for Text Engineering
> >>>>>
> >>>>> http://digitalpebble.blogspot.com/
> >>>>> http://www.digitalpebble.com