[ 
https://issues.apache.org/jira/browse/NUTCH-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-840:
---------------------------------------

    Attachment: NUTCH-840.patch

Hi Julien. I have absolutely no idea how or when I ended up working on this, 
but I think the attachment nearly addresses this issue. It is from a while back 
and to be honest I can't really remember working on it...

Anyway, I think the parse-tika tests fail because the patch is not quite 
working properly yet. The patch also changes the directory structure to 
o.a.n.p.tika rather than the existing o.a.n.tika, which is inconsistent with 
the other parser plugin implementations we ship with Nutch.

Sorry for hijacking this one slightly.
                
> Port tests from parse-html to parse-tika
> ----------------------------------------
>
>                 Key: NUTCH-840
>                 URL: https://issues.apache.org/jira/browse/NUTCH-840
>             Project: Nutch
>          Issue Type: Task
>          Components: parser
>    Affects Versions: 1.1
>            Reporter: Julien Nioche
>            Assignee: Julien Nioche
>             Fix For: nutchgora
>
>         Attachments: NUTCH-840.patch, NUTCH-840.patch
>
>
> We don't have tests for HTML in parse-tika so I'll copy them from the old 
> parse-html plugin

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
