Hi,

The page at http://wiki.apache.org/nutch/AboutPlugins says that
HTMLParseFilter is the extension point for adding custom metadata to HTML,
and that its 'filter' method has the signature 'public ParseResult
filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags,
DocumentFragment doc)'. However, that is from the 1.4 API docs.

I am on Nutch 2.2, and there is no class named HTMLParseFilter in the v2.2
API docs:
http://nutch.apache.org/apidocs-2.2/allclasses-noframe.html

So could you please tell me which class to use in the v2.2 API to add a
custom rule that extracts some data from an HTML page (is it ParseFilter?)
and adds it to the HTML metadata, so that I can later add it to Solr using
an index filter plugin?


Thanks,
Tony.
