[ https://issues.apache.org/jira/browse/TIKA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting updated TIKA-75:
------------------------------
Attachment: TIKA-75.jukka.patch
Here's a slightly modified patch with the following improvements:
1) Uses MimeTypes.getMinLength() instead of a hardcoded constant
2) Works correctly even if InputStream.read(byte[]) doesn't fill the whole
buffer
3) Avoids new File("/test-documents/" + filename).toString(), which causes
problems on Windows
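
To illustrate point 2: a single InputStream.read(byte[]) call is allowed to return fewer bytes than the buffer holds, so the header has to be read in a loop until the buffer is full or the stream ends. Below is a minimal sketch of that idea, not the code from the attached patch. The HeaderReader class and readHeader method are hypothetical names used only for illustration; in Tika the requested length would come from MimeTypes.getMinLength().

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of the patch.
final class HeaderReader {

    private HeaderReader() {
    }

    /**
     * Reads up to {@code length} bytes from the start of the stream.
     * Loops because one read() call may deliver only part of the
     * requested data. Returns fewer bytes only if the stream ends early.
     */
    static byte[] readHeader(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int total = 0;
        while (total < length) {
            int n = in.read(buffer, total, length - total);
            if (n == -1) {
                break; // end of stream reached before the buffer was full
            }
            total += n;
        }
        // Trim the result so callers never see unfilled trailing zeros.
        return Arrays.copyOf(buffer, total);
    }
}
```

The caller remains responsible for closing the stream. Trimming the result to the number of bytes actually read keeps a short file from being matched against padding bytes.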
> Provide a MimeUtils.getType(URL) method that will determine MIME type based
> on the stream and, if necessary, the name.
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: TIKA-75
> URL: https://issues.apache.org/jira/browse/TIKA-75
> Project: Tika
> Issue Type: Improvement
> Components: general
> Affects Versions: 0.1-incubator
> Reporter: Keith R. Bennett
> Priority: Minor
> Fix For: 0.1-incubator
>
> Attachments: TIKA-75.jukka.patch, tika-75.patch
>
>
> We have a MimeUtils method that returns a MIME type based solely on the name.
> It would be very helpful to also have a method that examines the header as
> well. I've added a method (patch coming) that does this. It opens a stream
> from the URL, reads the header, closes the stream, and then calls the
> existing method.
> This may not be usable in the course of parsing, since it violates our
> decision to read a stream only once. However, it is very useful as a way to
> test our MIME type determination, and as a non-parse service to our users (as
> recently discussed on the forum).