[ 
https://issues.apache.org/jira/browse/TIKA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906555#action_12906555
 ] 

Nick Burch commented on TIKA-484:
---------------------------------

I've just tried this file with Tika-App (which passes the filename into the 
detector), and it gets the content type correct:
  Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

When working with container-based files such as .xlsx, you either need to pass 
in the file name, or use the ContainerAwareDetector. If you ask the normal 
mime-magic detector without a filename hint, all it can see is the ZIP 
signature at the start of the file, so it reports application/zip.
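
To illustrate why the magic bytes alone are not enough, here is a minimal 
sketch in plain Java (standard library only). It is not Tika's code: the class 
and method names are made up for this example, and checking for an 
xl/workbook.xml entry is an assumption based on where spreadsheet writers 
conventionally put the workbook part. It only shows the three sources of 
evidence: leading bytes, file name, and a look inside the container.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration, not part of Tika.
public class ContainerSniffSketch {
    static final String ZIP = "application/zip";
    static final String XLSX =
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";

    // Magic-only check: any non-empty ZIP archive, .xlsx included,
    // starts with the local file header signature 'P' 'K' 0x03 0x04.
    static boolean hasZipMagic(byte[] data) {
        return data.length >= 4
            && data[0] == 'P' && data[1] == 'K' && data[2] == 3 && data[3] == 4;
    }

    // Filename hint: returns the xlsx type for a ".xlsx" name, otherwise null.
    static String detectByName(String name) {
        if (name != null && name.toLowerCase(Locale.ROOT).endsWith(".xlsx")) {
            return XLSX;
        }
        return null;
    }

    // Container-aware check: open the archive and look for the workbook part.
    // Assumption: the workbook lives at the conventional path xl/workbook.xml.
    static String detectByContents(byte[] data) throws IOException {
        if (!hasZipMagic(data)) {
            return "application/octet-stream";
        }
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (e.getName().equals("xl/workbook.xml")) {
                    return XLSX;
                }
            }
        }
        return ZIP;
    }

    // Builds a small in-memory ZIP with the given entry names (test helper).
    static byte[] zipOf(String... entryNames) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("x".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = zipOf("readme.txt");
        byte[] sheet = zipOf("[Content_Types].xml", "xl/workbook.xml");
        // Both archives pass the magic check, so magic alone cannot separate them.
        System.out.println(hasZipMagic(plain) + " " + hasZipMagic(sheet));
        System.out.println(detectByContents(plain));
        System.out.println(detectByContents(sheet));
        System.out.println(detectByName("openofficexlsxfile.xlsx"));
    }
}
```

In Tika itself the filename hint is normally supplied through the Metadata 
object (its resource-name key) that is handed to the detector or parser; the 
sketch above only mimics that idea with a plain string.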

Could you please confirm what steps you're taking that cause it to not work for 
you, and ensure you are passing in the filename?

> xlsx files created with open office are detected as application/zip
> -------------------------------------------------------------------
>
>                 Key: TIKA-484
>                 URL: https://issues.apache.org/jira/browse/TIKA-484
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.7
>         Environment: Ubuntu
>            Reporter: Victor Kazakov
>            Priority: Minor
>         Attachments: openofficexlsxfile.xlsx
>
>
> Create an xlsx file in open office. 
> Parse the file using a org.apache.tika.parser.AutoDetectParser
> It gets recognized as a zip file.
> Note: I have only tried this with open office running on ubuntu.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
