[
https://issues.apache.org/jira/browse/TIKA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Simon Tyler updated TIKA-391:
-----------------------------
Attachment: MimeTypes.java
Attached is an updated version of MimeTypes.java based on the 0.6 code base.
This is tested and solves the problem. The resource name and content type hints
now pick a match from the returned list.
The only changes are the addition of the getMimeTypes method and its use in
the detect method.
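
For anyone reading without the attachment, the idea can be sketched roughly as
below. This is not the attached MimeTypes.java: the class and method names
(HintedDetectionSketch, pick) are hypothetical, types are plain strings rather
than Tika objects, and the order in which the two hints are consulted is an
assumption. It only illustrates letting the hints choose among several magic
matches instead of taking a single match.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Tika.
public class HintedDetectionSketch {

    // Fallback when neither magic nor hints yield a type.
    static final String OCTET_STREAM = "application/octet-stream";

    /**
     * Chooses one type from the magic matches, using the hints as tie-breakers.
     * nameHint and typeHint may be null when that metadata is absent.
     */
    static String pick(List<String> magicMatches, String nameHint, String typeHint) {
        // Assumed order: the resource name hint is consulted before the
        // content type hint.
        if (nameHint != null && magicMatches.contains(nameHint)) {
            return nameHint;
        }
        if (typeHint != null && magicMatches.contains(typeHint)) {
            return typeHint;
        }
        // No hint agrees with a magic match: take the first match so the
        // result is stable for a given list order.
        if (!magicMatches.isEmpty()) {
            return magicMatches.get(0);
        }
        // No magic match at all: fall back to whichever hint exists.
        if (nameHint != null) {
            return nameHint;
        }
        return typeHint != null ? typeHint : OCTET_STREAM;
    }

    public static void main(String[] args) {
        // Two candidate types assumed to match the same magic bytes.
        List<String> matches =
                Arrays.asList("application/msword", "application/vnd.ms-excel");
        // The name hint (e.g. derived from a .xls file name) selects the match.
        System.out.println(pick(matches, "application/vnd.ms-excel", null));
    }
}
```

If the two hints point at different magic matches, this sketch simply prefers
the name hint; that is one possible policy, not necessarily the right one.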
A fuller fix for this issue should probably address all the other forms of
getMimeType. We could also consider what happens if the two hints hit different
magic matches.
Simon
> Intermittent errors detecting xls files
> ---------------------------------------
>
> Key: TIKA-391
> URL: https://issues.apache.org/jira/browse/TIKA-391
> Project: Tika
> Issue Type: Bug
> Components: mime
> Affects Versions: 0.6
> Reporter: Simon Tyler
> Attachments: MimeTypes.java
>
>
> I am doing some testing of Tika 0.6 and noticed some odd results for the
> testEXCEL.xls file included in the test suite.
> 100 calls to the following code:
>
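> // Assumes tika is an org.apache.tika.Tika instance and filename is the path to testEXCEL.xls.
> InputStream is = new BufferedInputStream(new FileInputStream(filename));
>
> Metadata metadata = new Metadata();
> metadata.set(Metadata.RESOURCE_NAME_KEY, filename);
>
> String type = tika.detect(is, metadata);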
>
> result in either application/msword or application/vnd.ms-excel being
> returned, seemingly at random.