Hi, I am using Tika 0.9 to detect the types of various files, but I am not getting the expected behavior. More specifically:
- For XML files, the returned type is sometimes "text/xml" and at other times "application/xml". The second case has occurred only rarely and intermittently, so I cannot reproduce it. Perhaps a class loading issue?
- For various application files (e.g., images or MS Office files), the detected type is the generic "application/octet-stream" rather than the specific MIME type for the format.

The detection is made via a simple call to new Tika().detect(inputStream), where "inputStream" is the Java InputStream object used for reading from the corresponding data file.

Is there any additional configuration (or other usage pattern) needed to achieve the desired behavior?

Thanks!
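In case it helps narrow things down: a bare InputStream carries no file name, so whatever is detected from it can only come from the leading bytes. As a sanity check independent of Tika, something like the sketch below could confirm that a stream really starts with the expected signature at the moment it is handed over. This is not Tika's implementation; the class name MagicSniff, the method sniff, and the labels it returns are made up for illustration, and only the PNG and OLE2 (legacy MS Office container) signatures are checked.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Tika: peeks at the first 8 bytes of a stream.
final class MagicSniff {

    // Well-known file signatures: PNG, and the OLE2 compound document
    // container used by legacy MS Office formats (.doc, .xls, .ppt).
    private static final int[] PNG  = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
    private static final int[] OLE2 = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    // Returns "png", "ole2", or "unknown" (labels are illustrative, not MIME types).
    static String sniff(InputStream in) {
        // mark/reset lets us peek without consuming the stream. If the stream
        // does not support it, we wrap it; in that case the bytes ARE consumed
        // from the caller's original stream.
        InputStream s = in.markSupported() ? in : new BufferedInputStream(in);
        byte[] head = new byte[8];
        int n = 0;
        try {
            s.mark(head.length);
            int r;
            while (n < head.length && (r = s.read(head, n, head.length - n)) != -1) {
                n += r;
            }
            s.reset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (startsWith(head, n, PNG)) {
            return "png";
        }
        if (startsWith(head, n, OLE2)) {
            return "ole2";
        }
        return "unknown";
    }

    // True if the first sig.length bytes read match the signature.
    private static boolean startsWith(byte[] head, int n, int[] sig) {
        if (n < sig.length) {
            return false;
        }
        for (int i = 0; i < sig.length; i++) {
            if ((head[i] & 0xFF) != sig[i]) {
                return false;
            }
        }
        return true;
    }
}
```

If MagicSniff.sniff(inputStream) reports "unknown" for a file that is definitely a PNG or a .doc, the stream was presumably not positioned at the start of the data when it was passed in, which would be worth ruling out before looking at configuration.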
