[jira] Updated: (NUTCH-412) plugin to parse the feed-url (rss/atom) of a blog
[ http://issues.apache.org/jira/browse/NUTCH-412?page=all ]

Renaud Richardet updated NUTCH-412:
-----------------------------------

    Attachment: plugin_parse-feedUrl.diff

unified diff against head (Rev: 481445)

> plugin to parse the feed-url (rss/atom) of a blog
> -------------------------------------------------
>
>                 Key: NUTCH-412
>                 URL: http://issues.apache.org/jira/browse/NUTCH-412
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 0.9.0
>            Reporter: Renaud Richardet
>            Priority: Minor
>         Attachments: plugin_parse-feedUrl.diff
>
>
> A plugin that extracts the feed-url (rss/atom) of a blog by retrieving the
> href from the element (if found), and stores it in metadata.
> The meta can be accessed with
> parse.getData().getMeta("feedUrl");
> you can test this plugin with the main method of HtmlParser.
> Thanks for a feedback.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (NUTCH-412) plugin to parse the feed-url (rss/atom) of a blog
plugin to parse the feed-url (rss/atom) of a blog
-------------------------------------------------

                 Key: NUTCH-412
                 URL: http://issues.apache.org/jira/browse/NUTCH-412
             Project: Nutch
          Issue Type: New Feature
    Affects Versions: 0.9.0
            Reporter: Renaud Richardet
            Priority: Minor

A plugin that extracts the feed-url (rss/atom) of a blog by retrieving the href from the element (if found), and stores it in metadata.
The meta can be accessed with
parse.getData().getMeta("feedUrl");
You can test this plugin with the main method of HtmlParser.
Thanks for any feedback.
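For readers who want to see the idea in isolation, here is a minimal sketch of feed-URL extraction. It is not the code from plugin_parse-feedUrl.diff. It assumes the element in question is an HTML link tag that follows the common feed autodiscovery convention (rel="alternate" with an RSS or Atom MIME type), and it uses a regular expression on a string rather than the parsed DOM a Nutch plugin would receive. The class and method names (FeedUrlExtractor, extractFeedUrl) are hypothetical.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: finds the first feed URL in an HTML string.
final class FeedUrlExtractor {
    // Matches a complete <link ...> tag, case-insensitively.
    private static final Pattern LINK_TAG =
        Pattern.compile("<link\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Matches name="value" or name='value' attribute pairs inside a tag.
    private static final Pattern ATTR =
        Pattern.compile("([a-zA-Z-]+)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    // Returns the href of the first link tag that looks like a feed (assumed convention).
    public static Optional<String> extractFeedUrl(String html) {
        Matcher tag = LINK_TAG.matcher(html);
        while (tag.find()) {
            String rel = null, type = null, href = null;
            Matcher attr = ATTR.matcher(tag.group());
            while (attr.find()) {
                String name = attr.group(1).toLowerCase(Locale.ROOT);
                String value = attr.group(2) != null ? attr.group(2) : attr.group(3);
                if (name.equals("rel")) rel = value;
                else if (name.equals("type")) type = value;
                else if (name.equals("href")) href = value;
            }
            if (href != null && "alternate".equalsIgnoreCase(rel) && isFeedType(type)) {
                return Optional.of(href);
            }
        }
        return Optional.empty(); // corresponds to the "if found" case failing
    }

    private static boolean isFeedType(String type) {
        return "application/rss+xml".equalsIgnoreCase(type)
            || "application/atom+xml".equalsIgnoreCase(type);
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<link rel='stylesheet' href='style.css'>"
            + "<link rel='alternate' type='application/atom+xml' href='http://example.org/atom.xml'>"
            + "</head></html>";
        System.out.println(extractFeedUrl(html).orElse("(no feed link found)"));
    }
}
```

In a plugin, the returned value would then be stored in the parse metadata under a key such as "feedUrl", so that the accessor quoted above can read it back.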
Phrase query analysis-fr
Hi,

When I use analysis-fr for indexing and searching, I'm not able to run phrase queries. I'm using nutch-0.8.1. Could someone help?

Best regards
Re: What's the status of Nutch-GUI?
Hi Sami,

I guess you refer to these:

• LocalJobRunner:
  • Run as a kind of singleton
  • Have a kind of jobQueue
  • Implement the JobSubmissionProtocol status-report methods
  • Implement the killJob method

Right!

-how about writing a nutchrunner that just extends the functionality of localjobrunner?

That would be one solution; however, I still hope the Hadoop developers understand that improving the local job runner would be of general benefit. Since it would be somewhat duplicated code it does not feel right, but I think it is better this way than never getting this issue solved.

-scheduling (jobQueue) could be completely outside of jobrunner?

We solved that with Quartz and a file-based JobStore we implemented back then.

Stefan
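To make the four wishes for the local job runner concrete, here is a rough sketch of a runner that is a singleton, queues jobs, reports status, and can kill a job. It uses only java.util.concurrent and does not touch Hadoop: the class QueuedLocalRunner, its Status enum, and its methods are hypothetical names for illustration, not Hadoop's LocalJobRunner or the JobSubmissionProtocol interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a singleton runner with a job queue, status reports, and kill.
final class QueuedLocalRunner {
    enum Status { QUEUED, RUNNING, SUCCEEDED, FAILED, KILLED }

    private static final QueuedLocalRunner INSTANCE = new QueuedLocalRunner();

    // One daemon worker thread, so submitted jobs run one at a time in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "queued-local-runner");
        t.setDaemon(true);
        return t;
    });
    private final Map<String, Status> statuses = new ConcurrentHashMap<>();
    private final Map<String, Future<?>> futures = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    private QueuedLocalRunner() {}

    public static QueuedLocalRunner getInstance() { return INSTANCE; }

    // Queues a job and returns an id that can be used for status and kill requests.
    public String submit(Runnable job) {
        String id = "job_" + nextId.incrementAndGet();
        statuses.put(id, Status.QUEUED);
        futures.put(id, worker.submit(() -> {
            // Start only if the job was not killed while it waited in the queue.
            if (!statuses.replace(id, Status.QUEUED, Status.RUNNING)) return;
            try {
                job.run();
                statuses.replace(id, Status.RUNNING, Status.SUCCEEDED);
            } catch (RuntimeException e) {
                statuses.replace(id, Status.RUNNING, Status.FAILED);
            }
        }));
        return id;
    }

    // Status report; returns null for an unknown id.
    public Status getStatus(String id) { return statuses.get(id); }

    // Marks a queued or running job as killed and interrupts it; false if already finished.
    public boolean kill(String id) {
        Future<?> future = futures.get(id);
        if (future == null) return false;
        boolean wasActive = statuses.replace(id, Status.QUEUED, Status.KILLED)
            || statuses.replace(id, Status.RUNNING, Status.KILLED);
        if (wasActive) future.cancel(true);
        return wasActive;
    }
}
```

Scheduling (when a job is submitted) stays outside this class, which matches the Quartz-based approach mentioned above: a scheduler would simply call submit at the right time.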