Remove deprecated parse plugins
-------------------------------

                 Key: NUTCH-836
                 URL: https://issues.apache.org/jira/browse/NUTCH-836
             Project: Nutch
          Issue Type: Task
          Components: parser
    Affects Versions: 1.1
            Reporter: Julien Nioche
            Assignee: Julien Nioche
             Fix For: 2.0
         Attachments: NUTCH-836.patch

Some of the parser plugins in 1.1 are covered by the parse-tika plugin. These
plugins have been kept in 1.1 but should be removed from 2.0, where we'll rely
on parse-tika almost exclusively. Some existing plugins might be kept when
there is no equivalent in Tika (to be discussed). The patch removes the
following plugins:

* parse-html
* parse-msexcel
* parse-mspowerpoint
* parse-msword
* parse-pdf
* parse-oo
* parse-text
* lib-jakarta-poi
* lib-parsems

The patch does not (yet) remove:

* parse-js
* parse-rss
* parse-swf
* parse-zip
* feed

Please review the patch and vote for its inclusion in the trunk.




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
