Expose Tika's boilerpipe support
--------------------------------

                 Key: NUTCH-961
                 URL: https://issues.apache.org/jira/browse/NUTCH-961
             Project: Nutch
          Issue Type: New Feature
          Components: parser
            Reporter: Markus Jelsma
             Fix For: 1.3, 2.0


Tika 0.8 comes with the Boilerpipe content handler, which can be used to strip
boilerplate content from HTML pages. We should see how we can expose
Boilerpipe in the Nutch configuration.

