Send an email to dev-unsubscr...@nutch.apache.org and follow the instructions from there...

On 6/24/10 9:36 PM, "Vimal Varghese" <vimal.vargh...@tcs.com> wrote:

Vimal Varghese

-----Claus Schröter (JIRA) wrote: -----

To: dev@nutch.apache.org
From: Claus Schröter (JIRA) <j...@apache.org>
Date: 06/25/2010 01:59AM
Subject: [jira] Commented: (NUTCH-655) Injecting Crawl metadata

[ https://issues.apache.org/jira/browse/NUTCH-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882313#action_12882313 ]

Claus Schröter commented on NUTCH-655:
--------------------------------------

Hi Julien,

thanks for this patch... is there any way to inherit the metadata or parts of it to suburls while crawling? I fiddled around with a scoring filter but with no success.

Cheers
Claus

> Injecting Crawl metadata
> ------------------------
>
> Key: NUTCH-655
> URL: https://issues.apache.org/jira/browse/NUTCH-655
> Project: Nutch
> Issue Type: Improvement
> Components: injector
> Reporter: Julien Nioche
> Assignee: Julien Nioche
> Priority: Minor
> Fix For: 1.1
>
> Attachments: Injector.patch, NUTCH-655.v2
>
>
> the patch attached allows to inject metadata into the crawlDB. The input file
> has to contain fields separated by tabs, with the URL being on the first
> column. The metadata names and values are separated by '='. A input line
> might look like this:
> http://www.myurl.com \t categ=value1 \t categ2=value2
> This functionality can be useful to store external knowledge and index it
> with a custom plugin

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
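
For reference, the seed-file format described in the quoted NUTCH-655 issue (fields separated by tabs, URL in the first column, metadata written as name=value) can be illustrated with a small parser. This is only a sketch in Python and not the Nutch Injector code from the patch: the function name parse_seed_line is hypothetical, and trimming whitespace and skipping fields without '=' are assumptions made for the example.

```python
def parse_seed_line(line):
    """Split one seed line into (url, metadata dict).

    Assumed format, following the issue description: tab-separated
    fields, URL in the first column, remaining fields as name=value.
    """
    fields = [field.strip() for field in line.rstrip("\n").split("\t")]
    url = fields[0]
    metadata = {}
    for field in fields[1:]:
        # Assumption for this sketch: fields without '=' (or with an
        # empty name) are silently ignored.
        name, sep, value = field.partition("=")
        if sep and name.strip():
            metadata[name.strip()] = value.strip()
    return url, metadata


if __name__ == "__main__":
    # The example line from the issue, with real tab characters.
    print(parse_seed_line("http://www.myurl.com\tcateg=value1\tcateg2=value2"))
```

Splitting only on the first '=' (via partition) keeps any later '=' characters inside the value, which seems the safer reading of "names and values are separated by '='".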