Hi,

On 10/18/07, Keith R. Bennett <[EMAIL PROTECTED]> wrote:
> If I understand correctly, we already have what we need in MimeUtils:
>     public String getType(String typeName, String url, byte[] data) { ... }

The current MimeUtils.getType relies only on magic header matching,
and should be fixed.

The main reason I implemented my own detection code in
AutoDetectParser, based on MimeTypes, was that I was somewhat confused
about the separation of concerns between MimeTypes and MimeUtils. The
MimeTypes class already has a number of utility methods like
getMimeType(String, byte[]) and getMimeType(URL), so I'm not sure why
we need MimeUtils.

> Jukka, should I modify AutoDetectParser to call this method instead of its
> own?

OK once the method has been fixed.

> However, the bigger issue is, is the assessment that header based detection
> fails with certain file types correct?

Magic detection can never be 100% correct or complete, but there's
still a lot that we could do to improve the current state of type
detection in Tika.
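
For illustration, one possible direction is to try magic header
matching first and fall back to the file name when no magic pattern
matches. The sketch below is a minimal standalone example of that
idea. The class name DetectSketch, the two lookup tables and the
application/octet-stream default are all assumptions made for the
example; it is not the MimeTypes or MimeUtils API.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Tika code: magic first, file name as fallback.
public class DetectSketch {

    // Illustrative magic prefixes; a real table would be far larger.
    private static final Map<String, byte[]> MAGIC = new LinkedHashMap<>();

    // Illustrative extension table, used only when no magic prefix matches.
    private static final Map<String, String> EXTENSIONS = new LinkedHashMap<>();

    static {
        MAGIC.put("application/pdf", "%PDF-".getBytes(StandardCharsets.US_ASCII));
        MAGIC.put("application/zip", new byte[] {'P', 'K', 3, 4});
        EXTENSIONS.put("txt", "text/plain");
        EXTENSIONS.put("csv", "text/csv");
        EXTENSIONS.put("pdf", "application/pdf");
    }

    // Returns a media type for the given name and leading bytes.
    // Either argument may be null.
    static String detect(String name, byte[] header) {
        // Step 1: magic header matching on the leading bytes.
        if (header != null) {
            for (Map.Entry<String, byte[]> entry : MAGIC.entrySet()) {
                if (startsWith(header, entry.getValue())) {
                    return entry.getKey();
                }
            }
        }
        // Step 2: fall back to the file name extension.
        if (name != null) {
            int dot = name.lastIndexOf('.');
            if (dot >= 0 && dot < name.length() - 1) {
                String ext = name.substring(dot + 1).toLowerCase(Locale.ROOT);
                String type = EXTENSIONS.get(ext);
                if (type != null) {
                    return type;
                }
            }
        }
        // Step 3: nothing matched, so report the generic binary type.
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With these tables, a plain text file named notes.txt has no magic
prefix and is typed as text/plain by its name, while a file named
data.bin that starts with "%PDF-" is still typed as application/pdf
because the magic match takes precedence.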

BR,

Jukka Zitting
