[ https://issues.apache.org/jira/browse/TIKA-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann resolved TIKA-366.
------------------------------------
Resolution: Fixed
- fixed in r901033
> Increase buffer size for mime type sniffing
> -------------------------------------------
>
> Key: TIKA-366
> URL: https://issues.apache.org/jira/browse/TIKA-366
> Project: Tika
> Issue Type: Improvement
> Components: mime
> Affects Versions: 0.5
> Environment: My local MacBook Pro laptop.
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Fix For: 0.6
>
>
> While working on TIKA-357, which addresses a similar problem in charset
> detection, I found that mime type identification suffers from the same
> general problem. Tika currently looks at only the first
> MimeTypes#getMinLength() bytes of a stream when sniffing the mime type
> from magic headers. The example file attached from Ken Krugler shows that
> the current minimum length of 4 * 1024 bytes isn't enough. Extending it to
> 8K (8 * 1024 bytes) fixes detection for that file and should let more magic
> patterns match, at little overhead cost.
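
The effect of the buffer size can be illustrated with a minimal, self-contained sketch. It is not Tika's implementation: the class SniffSketch, its methods, the "<html" marker, and the 6000-byte offset are all hypothetical and chosen only to show a marker that lies beyond a 4K prefix but inside an 8K one. It uses only the standard java.io mark/reset calls to read a bounded prefix and rewind the stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SniffSketch {

    /** Reads up to limit bytes from the start of the stream, then rewinds it. */
    static byte[] readPrefix(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(limit);
        try {
            byte[] buf = new byte[limit];
            int total = 0;
            while (total < limit) {
                int n = in.read(buf, total, limit - total);
                if (n == -1) {
                    break; // stream is shorter than the limit
                }
                total += n;
            }
            return Arrays.copyOf(buf, total);
        } finally {
            in.reset(); // leave the stream positioned at its start for later parsing
        }
    }

    /** Returns true if magic occurs anywhere inside prefix. */
    static boolean containsMagic(byte[] prefix, byte[] magic) {
        outer:
        for (int i = 0; i + magic.length <= prefix.length; i++) {
            for (int j = 0; j < magic.length; j++) {
                if (prefix[i + j] != magic[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    /** Hypothetical helper: sniff data for magic using only the first limit bytes. */
    static boolean sniff(byte[] data, byte[] magic, int limit) throws IOException {
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            return containsMagic(readPrefix(in, limit), magic);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed test input: 6000 bytes of padding followed by the marker.
        byte[] magic = "<html".getBytes(StandardCharsets.US_ASCII);
        byte[] data = new byte[6000 + magic.length];
        Arrays.fill(data, (byte) ' ');
        System.arraycopy(magic, 0, data, 6000, magic.length);

        System.out.println(sniff(data, magic, 4 * 1024)); // prints "false"
        System.out.println(sniff(data, magic, 8 * 1024)); // prints "true"
    }
}
```

In this sketch the extra cost of the larger limit is one additional 4K of buffered reading per stream, which is in line with the "little overhead" expectation in the description above.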