I'm about to use Nutch to crawl semantic data. Links to semantic data files (RDF, OWL, etc.) can appear in two places in a page: (1) a <link> element in the HEAD; (2) an <a href...> element in the BODY. Does the Nutch crawler follow <link> elements in the HEAD?
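To make the two placements concrete, here is a minimal Python sketch that pulls both kinds of link out of a page. It is only an illustration, not Nutch code, and it says nothing about what Nutch's own parser does. The sample page, the example.org URLs, the suffix list, and the names SemanticLinkFinder and find_semantic_links are all made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical sample page: one RDF link in the HEAD, one OWL link in the BODY.
SAMPLE_PAGE = """<html>
<head>
<link rel="stylesheet" type="text/css" href="http://example.org/style.css">
<link rel="alternate" type="application/rdf+xml" href="http://example.org/data.rdf">
</head>
<body>
<a href="http://example.org/ontology.owl">ontology</a>
<a href="http://example.org/about.html">about</a>
</body>
</html>"""

# Assumption: semantic data files are recognised by file suffix or RDF/XML MIME type.
SEMANTIC_SUFFIXES = (".rdf", ".owl")
RDF_MIME_TYPE = "application/rdf+xml"


class SemanticLinkFinder(HTMLParser):
    """Collects semantic data links from <link> and <a> elements separately."""

    def __init__(self):
        super().__init__()
        self.head_links = []  # hrefs found on <link> elements
        self.body_links = []  # hrefs found on <a> elements

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        has_semantic_suffix = href.lower().endswith(SEMANTIC_SUFFIXES)
        if tag == "link":
            # Placement (1): <link> in the HEAD, matched by MIME type or suffix.
            if attributes.get("type") == RDF_MIME_TYPE or has_semantic_suffix:
                self.head_links.append(href)
        elif tag == "a" and has_semantic_suffix:
            # Placement (2): ordinary anchor in the BODY, matched by suffix.
            self.body_links.append(href)


def find_semantic_links(html):
    """Return (head_links, body_links) for the given HTML text."""
    finder = SemanticLinkFinder()
    finder.feed(html)
    return finder.head_links, finder.body_links


if __name__ == "__main__":
    head, body = find_semantic_links(SAMPLE_PAGE)
    print("HEAD <link>:", head)
    print("BODY <a href>:", body)
```

A crawler that only follows anchors would find the second list but miss the first, which is the case I'm asking about.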
I'm also creating a semantic data publishing tool, and I would appreciate any suggestions on the best way to make RDF files visible to the Nutch crawler. There was a brief discussion last year on the topic of crawling the semantic web, and I believe this is a growing area. I would like to make Nutch a component of the new semantic data publishing and crawling system that I'm working on. It would be great if any Nutch experts could share some pointers on how Nutch can best support such a system, or how such a system should be designed to take full advantage of Nutch.
Best,
AJ
_______________________________________________ Nutch-developers mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/nutch-developers
