Chris - Thanks for offering to look into this.

Recently I added unit tests to the AutoDetectParserTest class that generate these failures. I commented out the tests that didn't work but left them in place so they can be re-enabled once they succeed (please see assertAutoDetect(String resource, String type, String content)). If you uncomment them, the errors will cause the test to fail, and you will see the behavior I'm describing.

The patch I provided in the previous message calls both the MimeUtils and the AutoDetectParser methods that determine MIME type, to illustrate that it is not just an AutoDetectParser problem.

Here is one way to approach this:

1) Apply the patch in my previous message to a fresh copy of Tika.
2) Remove the "//" from the println calls in getMimeType2() if you'd like to get debug output.
3) Run the unit test; in IntelliJ IDEA, I just right-click inside the test method (AutoDetectParserTest.testWord()) and select "Run". You can also just run "mvn test" on the command line.

Feel free to get in touch anytime if there's anything else I can do to clarify or help.

Regards,
Keith


Chris Mattmann wrote:
>
> Keith:
>
> Where exactly is it failing? In the unit tests? Or in your code? Could you
> be more specific so that I (or someone else) can track it down?
>
> Thanks,
> Chris
>
>
> On 10/22/07 2:00 PM, "Keith R. Bennett" <[EMAIL PROTECTED]> wrote:
>
>> All -
>>
>> We're still having the problem that MIME type detection from byte headers
>> is failing. I'll try to look into it, but if anyone else could also take a
>> look, that would be great.
>>
>> I'm attaching a patch that:
>>
>> * adds an alternate method for determining the MIME type that calls the
>>   new MimeUtils.getMimeType() method.
>> * calls the regular method and the alternate method and asserts that they
>>   return the same result
>> * enables only one type of test so that the output is more manageable
>> * the alternate method turns the original one upside down; it's simpler
>>   to me because if a type is found via the byte header, the other methods
>>   are not attempted, etc.; let me know what you think.
>>
>> This patch is not intended to ever be committed, but is just for review.
>>
>> Thanks,
>> - Keith
>>
>> http://www.nabble.com/file/p13352486/diag.patch diag.patch
>>
>

--
View this message in context: http://www.nabble.com/MIME-Type-Detection-from-Byte-Header-Failing-tf4673629.html#a13352854
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
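
For readers without the patch at hand, the header-first ordering described in the quoted patch notes (if a type is found via the byte header, the other methods are not attempted) might look roughly like the sketch below. This is not the code from diag.patch and it does not use Tika's MimeUtils or AutoDetectParser; the class name, method names, and the small set of signatures are all hypothetical and chosen only for illustration. In particular, it assumes for simplicity that an OLE2 header means a Word document, although other Office formats share that container.

```java
// Hypothetical sketch of header-first MIME detection; NOT Tika's API
// and NOT the getMimeType2() method from the patch.
final class HeaderFirstDetector {

    // OLE2 compound-document signature (legacy .doc files start with it).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // "%PDF" in ASCII.
    private static final byte[] PDF_MAGIC = {0x25, 0x50, 0x44, 0x46};

    // True if data begins with every byte of magic, in order.
    static boolean startsWith(byte[] data, byte[] magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Step 1: try the byte header. Returns null when nothing matches.
    static String fromHeader(byte[] header) {
        if (startsWith(header, PDF_MAGIC)) {
            return "application/pdf";
        }
        if (startsWith(header, OLE2_MAGIC)) {
            // Assumption: treat every OLE2 file as Word for this sketch.
            return "application/msword";
        }
        return null;
    }

    // Step 2: fall back to the file name. Returns null when unknown.
    static String fromName(String fileName) {
        if (fileName == null) {
            return null;
        }
        String lower = fileName.toLowerCase(java.util.Locale.ROOT);
        if (lower.endsWith(".pdf")) {
            return "application/pdf";
        }
        if (lower.endsWith(".doc")) {
            return "application/msword";
        }
        return null;
    }

    // Header match wins; the name is consulted only if the header is unknown.
    static String detect(byte[] header, String fileName) {
        String type = fromHeader(header);
        if (type != null) {
            return type;
        }
        type = fromName(fileName);
        return type != null ? type : "application/octet-stream";
    }

    public static void main(String[] args) {
        byte[] pdfHeader = {0x25, 0x50, 0x44, 0x46, 0x2D};
        // The header says PDF even though the name says .doc.
        System.out.println(detect(pdfHeader, "misnamed.doc"));
    }
}
```

The point of ordering the checks this way is that a mismatch between the two detection paths shows up as a single differing return value, which is what the assertion in the patch compares.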
