[ https://issues.apache.org/jira/browse/NUTCH-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205750#comment-13205750 ]
Lewis John McGibbney commented on NUTCH-1129:
---------------------------------------------

Hi Markus. I'm really gutted about this one; I've not had time to sort it out. I want to say the following things though.

- Any23 is now available on repository.apache.org [1]; however, I think we need to change our Ivy resolver to fetch these 0.7.0-snapshots. Should be pretty trivial though.
- Any23 already has a crawler plugin implementation (nothing like the stuff we offer in Nutch ;0)). I'm not familiar with the code, but it might be worth a swatch? [2] Unfortunately the documentation is not great at all, as I'm sure you'll agree.

[1] https://repository.apache.org/index.html#nexus-search;quick~org.apache.any23
[2] https://svn.apache.org/viewvc/incubator/any23/trunk/plugins/basic-crawler/

> Any23 Nutch plugin
> ------------------
>
>                 Key: NUTCH-1129
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1129
>             Project: Nutch
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 1.5
>
>
> This plugin should build on the Any23 library to provide us with a plugin
> which extracts RDF data from HTTP and file resources. Although as of writing
> Any23 is not part of the ASF, the project is working towards integration into
> the Apache Incubator. Once the project proves its value, this would be an
> excellent addition to the Nutch 1.X codebase.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira