[ 
https://issues.apache.org/jira/browse/TIKA-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638887#comment-13638887
 ] 

Niels Beekman commented on TIKA-1086:
-------------------------------------

This has been broken since revision 1369624 (the fixes for TIKA-968).
The attached patch adds an optional import for the org.w3c.dom package.
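
For readers unfamiliar with optional imports, the general idea can be sketched as a bundle manifest header like the one below. This is only an illustration of the OSGi syntax, not the content of the attached patch; the real tika-bundle manifest is generated by the build and lists many other packages, which are omitted here.

```text
Import-Package: org.w3c.dom;resolution:=optional
```

With the resolution:=optional directive, the bundle still resolves when no other bundle exports the package, and is wired to it when one does.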
                
> Tika-bundle 1.3 does not import org.w3c.dom package
> ---------------------------------------------------
>
>                 Key: TIKA-1086
>                 URL: https://issues.apache.org/jira/browse/TIKA-1086
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.3
>            Reporter: Gaurav
>             Fix For: 1.2
>
>         Attachments: TIKA-1086.svn.diff
>
>
> The tika-bundle 1.3 version does not import the org.w3c.dom package; as a
> result, it is not able to parse DOM-based documents such as Microsoft Word
> (docx) documents.
> This issue does not occur in version 1.2, which imports the necessary
> package, so parsing these documents works fine.
> Can someone please look into this issue? Microsoft Word is a very popular
> document format.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
