Hi,

I can crawl and index PDF files just fine, but I get the following warning:

parse.ParserFactory - ParserFactory: Plugin: parse-pdf mapped to contentType application/pdf via parse-plugins.xml, but not enabled via plugin.includes in nutch-default.xml.

The plugin.includes property in nutch-default.xml is set to the following value:

protocol-http|urlfilter-regex|parse-(html|tika|js|msexcel|mspowerpoint|msword|oo|pdf|swf|zip)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)

What am I missing?
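As a sanity check on the expression itself, here is a minimal sketch in Python (not Nutch code). It assumes a plugin counts as enabled when its id matches the whole plugin.includes value as a regular expression; that is my understanding of how the property is applied, not something I have verified in the source. The helper name is_included is made up for illustration.

```python
import re

# Value of plugin.includes, copied verbatim from this message.
PLUGIN_INCLUDES = (
    "protocol-http|urlfilter-regex|"
    "parse-(html|tika|js|msexcel|mspowerpoint|msword|oo|pdf|swf|zip)|"
    "index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)"
)


def is_included(plugin_id: str, includes: str = PLUGIN_INCLUDES) -> bool:
    """Hypothetical helper: True if the plugin id fully matches the regex.

    Assumption: the whole id must match, not just a substring of it.
    """
    return re.fullmatch(includes, plugin_id) is not None


if __name__ == "__main__":
    # Print the result for a few plugin ids, including the one in the warning.
    for plugin_id in ("parse-pdf", "parse-tika", "parse-rss"):
        print(plugin_id, is_included(plugin_id))
```

If that assumption holds, the pattern does cover parse-pdf, so the expression alone would not explain the warning.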

Regards,
