Hey Julien,

I'll take care of the ones with my name on them below (NUTCH-564 and NUTCH-825).

Cheers,
Chris

On Jan 5, 2011, at 8:36 AM, Julien Nioche (JIRA) wrote:

>
> [ https://issues.apache.org/jira/browse/NUTCH-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977828#action_12977828 ]
>
> Julien Nioche commented on NUTCH-951:
> -------------------------------------
>
> NUTCH-894 : has been written for 2.0 and would need some effort to backport to 1.3
> I suggest that we leave it there.
>
> The list of things that IMHO are worth porting to 1.3 are now
>
> * NUTCH-564 External parser supports encoding attribute (Antony Bowesman, mattmann)
> * NUTCH-825 Publish nutch artifacts to central maven repository (mattmann)
> * NUTCH-872 Change the default fetcher.parse to FALSE (ab).
> * NUTCH-876 Remove remaining robots/IP blocking code in lib-http (ab)
> * NUTCH-884 FetcherJob should run more reduce tasks than default (ab)
> * NUTCH-921 Reduce dependency of Nutch on config files (ab)
>
> Any volunteers?
>
>
>> Backport changes from 2.0 into 1.3
>> ----------------------------------
>>
>> Key: NUTCH-951
>> URL: https://issues.apache.org/jira/browse/NUTCH-951
>> Project: Nutch
>> Issue Type: Task
>> Affects Versions: 1.3
>> Reporter: Julien Nioche
>> Priority: Blocker
>> Fix For: 1.3
>>
>>
>> I've compared the changes from 2.0 with 1.3 and found the following differences (excluding anything specific to 2.0/GORA)
>> * NUTCH-564 External parser supports encoding attribute (Antony Bowesman, mattmann)
>> * NUTCH-714 Need a SFTP and SCP Protocol Handler (Sanjoy Ghosh, mattmann)
>> * NUTCH-825 Publish nutch artifacts to central maven repository (mattmann)
>> * NUTCH-851 Port logging to slf4j (jnioche)
>> * NUTCH-861 Renamed HTMLParseFilter into ParseFilter
>> * NUTCH-872 Change the default fetcher.parse to FALSE (ab).
>> * NUTCH-876 Remove remaining robots/IP blocking code in lib-http (ab)
>> * NUTCH-880 REST API for Nutch (ab)
>> * NUTCH-883 Remove unused parameters from nutch-default.xml (jnioche)
>> * NUTCH-884 FetcherJob should run more reduce tasks than default (ab)
>> * NUTCH-886 A .gitignore file for Nutch (dogacan)
>> * NUTCH-894 Move statistical language identification from indexing to parsing step
>> * NUTCH-921 Reduce dependency of Nutch on config files (ab)
>> * NUTCH-930 Remove remaining dependencies on Lucene API (ab)
>> * NUTCH-931 Simple admin API to fetch status and stop the service (ab)
>> * NUTCH-932 Bulk REST API to retrieve crawl results as JSON (ab)
>> Let's go through this and decide what to port to 1.3
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++