[ https://issues.apache.org/jira/browse/TIKA-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870087#comment-13870087 ]
Peter Ansell commented on TIKA-1217:
------------------------------------

[~jukkaz] The rationale for checking first on filename, in a Java-7 context, was that Path objects do not hold File Descriptors. Hence, a content type detection method taking a Path object may also be able to avoid getting a File Descriptor. However, if there is an unacceptable loss in fidelity by checking first on the filename, then feel free to remove that clause, as it isn't critical to the functionality for me.

There cannot, however, easily be two different implementations in the same module, as java.util.ServiceLoader isn't ordered, so it cannot prefer one over the other. In addition, there are no OpenOptions or LinkOptions attached to Files.probeContentType as there are with other methods such as Files.isRegularFile. That makes it difficult for users to pass in their preferences about how Files.probeContentType should operate (i.e., whether it should try to avoid getting a file descriptor if possible, or whether it should follow symbolic links).

If we wanted to do a second implementation that always used File, it would be perfectly possible, but it would need to go in a separate module so that the META-INF/services files are distinguished by which module is loaded. We would also have to rename the current module from tika-java7 to something more specific.

As you say, in a performance-critical application the results will be cached to avoid duplication, so it isn't a big deal in the greater scheme of things.

[~lewismc] You can find the patch that Jukka committed in the Tika trunk if you want to test it, but it isn't necessary to do it now if you have other things to do.
https://github.com/apache/tika/commit/39370848b8bd9214dc4b7720539edc0eb595300c

> Integrate with Java-7 FileTypeDetector API
> ------------------------------------------
>
>                 Key: TIKA-1217
>                 URL: https://issues.apache.org/jira/browse/TIKA-1217
>             Project: Tika
>          Issue Type: New Feature
>          Components: detector, mime
>            Reporter: Peter Ansell
>         Attachments: TIKA-1217-v2.patch, TIKA-1217.patch
>
>
> It would be useful if Tika natively provided Java-7 FileTypeDetector [1]
> implementations. Adding the corresponding
> META-INF/services/java.nio.file.spi.FileTypeDetector files would allow the
> use of Files.probeContentType [2] without any specific links to Tika for this
> functionality.
> If you do not want to rely on Java-7 for the core, then this could be added
> as an extension module.
> [1] http://docs.oracle.com/javase/7/docs/api/java/nio/file/spi/FileTypeDetector.html
> [2] http://docs.oracle.com/javase/7/docs/api/java/nio/file/Files.html#probeContentType(java.nio.file.Path)

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
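
For readers following the thread, the filename-first strategy described in the comment above can be sketched roughly as follows. This is not the code from the committed patch: the class name FilenameFirstDetector, the two-entry extension table, and the "%PDF" magic-byte check are all hypothetical stand-ins for Tika's real detection logic. The sketch only illustrates the ordering: decide from the Path's name where possible (no file is opened, so no file descriptor is needed), and open the file only as a fallback.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.spi.FileTypeDetector;
import java.util.Locale;

// Hypothetical detector, for illustration only. It is package-private to keep
// the sketch in one file; a provider registered for ServiceLoader would
// normally be a public class with a public no-argument constructor.
class FilenameFirstDetector extends FileTypeDetector {

    @Override
    public String probeContentType(Path path) throws IOException {
        // Step 1: look at the name only. Nothing is opened here.
        Path namePart = path.getFileName();
        if (namePart != null) {
            String byName = typeFromName(namePart.toString());
            if (byName != null) {
                return byName;
            }
        }

        // Step 2: fall back to reading the first few bytes of the file.
        if (!Files.isRegularFile(path)) {
            return null; // null means "type not recognised"
        }
        byte[] head = new byte[4];
        int filled = 0;
        try (InputStream in = Files.newInputStream(path)) {
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break; // end of file before the buffer was full
                }
                filled += n;
            }
        }
        if (filled == 4 && head[0] == '%' && head[1] == 'P'
                && head[2] == 'D' && head[3] == 'F') {
            return "application/pdf";
        }
        return null;
    }

    // Toy extension table (an assumption; a real detector would consult a
    // full MIME database instead).
    static String typeFromName(String name) {
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null;
        }
        switch (name.substring(dot + 1).toLowerCase(Locale.ROOT)) {
            case "pdf":
                return "application/pdf";
            case "txt":
                return "text/plain";
            default:
                return null;
        }
    }
}
```

To make Files.probeContentType pick up such a detector, its fully qualified class name would be listed in a META-INF/services/java.nio.file.spi.FileTypeDetector file on the classpath, as the issue description proposes. The trade-off raised in the thread is visible in step 1: a file whose name ends in .pdf is reported as a PDF without its content ever being inspected.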