[ 
https://issues.apache.org/jira/browse/TIKA-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019383#comment-17019383
 ] 

Andrey Nizienko commented on TIKA-2294:
---------------------------------------

[~nick] thanks for the detailed explanation.
I tried adding the Tika Parsers jar to the project dependencies, and this 
really changed the result for the [first mentioned 
example|https://issues.apache.org/jira/browse/TIKA-2294?focusedCommentId=17018613&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17018613].
So there are two possible approaches to resolve this issue: use mime-magic in 
the Tika Core or include the Tika Parsers jar in the project.

Thanks and regards,
Andrii

> Tika inconsistently detects ooxml files as zip file sometimes
> -------------------------------------------------------------
>
>                 Key: TIKA-2294
>                 URL: https://issues.apache.org/jira/browse/TIKA-2294
>             Project: Tika
>          Issue Type: Bug
>          Components: mime
>    Affects Versions: 1.11
>         Environment: linux
>            Reporter: chanchal
>            Assignee: Tim Allison
>            Priority: Major
>         Attachments: google_doc.docx
>
>
> Tika sometimes incorrectly detects an ooxml file as zip and sometimes correctly 
> detects it as docx/pptx/xlsx.
> Is it possible for this to happen, and if so, how?
> I cannot share the file as it has sensitive content.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
