Hi,
On 10/18/07, Keith R. Bennett <[EMAIL PROTECTED]> wrote:
> If I understand correctly, we already have what we need in MimeUtils:
> public String getType(String typeName, String url, byte[] data) { ... }
The current MimeUtils.getType relies only on magic header matching,
and should be fixed.
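As a rough illustration of the direction such a fix might take, here is a sketch only. It is not the actual Tika code: the class name, the helper methods, the tiny extension table, and the lookup order (magic bytes, then URL extension, then the declared type name) are all assumptions made for this example.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the Tika API: every name here is illustrative.
final class TypeGuessSketch {

    private static final String FALLBACK = "application/octet-stream";

    // Tiny illustrative table; a real one would come from configuration.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "pdf", "application/pdf",
            "png", "image/png",
            "txt", "text/plain",
            "html", "text/html");

    private static final byte[] PDF_MAGIC = {'%', 'P', 'D', 'F', '-'};
    private static final byte[] PNG_MAGIC = {(byte) 0x89, 'P', 'N', 'G'};

    // Assumed order: magic header first, then URL extension, then declared type.
    static String guessType(String typeName, String url, byte[] data) {
        String byMagic = matchMagic(data);
        if (byMagic != null) {
            return byMagic;
        }
        String byName = matchExtension(url);
        if (byName != null) {
            return byName;
        }
        if (typeName != null && !typeName.isEmpty()) {
            return typeName;
        }
        return FALLBACK;
    }

    // Returns a type if the data starts with a known magic header, else null.
    static String matchMagic(byte[] data) {
        if (startsWith(data, PDF_MAGIC)) {
            return "application/pdf";
        }
        if (startsWith(data, PNG_MAGIC)) {
            return "image/png";
        }
        return null;
    }

    // Returns a type based on the extension of the last path segment, else null.
    static String matchExtension(String url) {
        if (url == null) {
            return null;
        }
        String path = url;
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);
        }
        int fragment = path.indexOf('#');
        if (fragment >= 0) {
            path = path.substring(0, fragment);
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null;
        }
        return BY_EXTENSION.get(name.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Whether magic should win over the name, or the other way round, is a design choice that would need discussion; the sketch only shows that all three inputs can contribute to the result.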
The main reason why I decided to implement my own version of the code
based on MimeTypes in AutoDetectParser was that I was somewhat
confused about the separation of concerns across MimeTypes and
MimeUtils. The MimeTypes class already has a number of utility methods
like getMimeType(String, byte[]) and getMimeType(URL), so I'm not sure
why we need MimeUtils.
> Jukka, should I modify AutoDetectParser to call this method instead of its
> own?
Yes, but only once the method has been fixed.
> However, the bigger issue is, is the assessment that header based detection
> fails with certain file types correct?
Magic detection can never be 100% correct or complete, but there's
still a lot we could do to improve the current state of detection in
Tika.
BR,
Jukka Zitting