[ https://issues.apache.org/jira/browse/TIKA-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Chow updated TIKA-851:
--------------------------------

    Description:
When the mime type of an M4V file is detected using its name only, it returns video/x-m4v. When it is detected using the InputStream (hence utilising the MagicDetector), it incorrectly returns video/quicktime.

Using the sample M4V file from Apple's [knowledge base|http://support.apple.com/kb/HT1425]:

{code:title=TikaTest.java}
import java.io.File;
import java.io.InputStream;

import org.apache.tika.detect.DefaultDetector;
import org.apache.tika.detect.Detector;
import org.apache.tika.io.TikaInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MimeTypes;

public class TikaTest {

    public static void main(String[] args) throws Exception {
        String userHome = System.getProperty("user.home");

        File file = new File(userHome + "/Desktop/sample_iPod.m4v");
        InputStream is = TikaInputStream.get(file);

        Detector detector = new DefaultDetector(
                MimeTypes.getDefaultMimeTypes());

        Metadata metadata = new Metadata();
        metadata.set(Metadata.RESOURCE_NAME_KEY, file.getName());

        System.out.println("File + filename: " + detector.detect(is, metadata));
        System.out.println("File only: " + detector.detect(is, new Metadata()));
        System.out.println("Filename only: " + detector.detect(null, metadata));
    }
}
{code}

Renders the output:
{code}
File + filename: video/quicktime
File only: video/quicktime
Filename only: video/x-m4v
{code}

Moreover, if the same test is run against an M4A file, the results are even more incorrect:
{code}
File + filename: video/quicktime
File only: video/quicktime
Filename only: application/octet-stream
{code}

  was:
When the mime type of an M4V file is detected using its name only, it returns video/x-m4v. When it is detected using the InputStream (hence utilising the MagicDetector), it incorrectly returns video/quicktime.

Using the sample M4V file from Apple's [knowledge base|http://support.apple.com/kb/HT1425]:

{code:title=TikaTest.java}
import java.io.File;
import java.io.InputStream;

import org.apache.tika.detect.DefaultDetector;
import org.apache.tika.detect.Detector;
import org.apache.tika.io.TikaInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MimeTypes;

public class TikaTest {

    public static void main(String[] args) throws Exception {
        String userHome = System.getProperty("user.home");

        File file = new File(userHome + "/Desktop/sample_iPod.m4v");
        InputStream is = TikaInputStream.get(file);

        Detector detector = new DefaultDetector(
                MimeTypes.getDefaultMimeTypes());

        Metadata metadata = new Metadata();
        metadata.set(Metadata.RESOURCE_NAME_KEY, file.getName());

        System.out.println("File + filename: " + detector.detect(is, metadata));
        System.out.println("File only: " + detector.detect(is, new Metadata()));
        System.out.println("Filename only: " + detector.detect(null, metadata));
    }
}
{code}

Renders the output:
{code}
File + filename: video/quicktime
File only: video/quicktime
Filename only: video/x-m4v
{code}


        Summary: M4V and M4A detection invalid  (was: M4V magic detection invalid)

> M4V and M4A detection invalid
> -----------------------------
>
>                 Key: TIKA-851
>                 URL: https://issues.apache.org/jira/browse/TIKA-851
>             Project: Tika
>          Issue Type: Bug
>          Components: mime
>    Affects Versions: 1.0
>            Reporter: Alexander Chow
>
> When the mime type of an M4V file is detected using its name only, it returns video/x-m4v. When it is detected using the InputStream (hence utilising the MagicDetector), it incorrectly returns video/quicktime.
>
> Using the sample M4V file from Apple's [knowledge base|http://support.apple.com/kb/HT1425]:
>
> {code:title=TikaTest.java}
> import java.io.File;
> import java.io.InputStream;
>
> import org.apache.tika.detect.DefaultDetector;
> import org.apache.tika.detect.Detector;
> import org.apache.tika.io.TikaInputStream;
> import org.apache.tika.metadata.Metadata;
> import org.apache.tika.mime.MimeTypes;
>
> public class TikaTest {
>
>     public static void main(String[] args) throws Exception {
>         String userHome = System.getProperty("user.home");
>
>         File file = new File(userHome + "/Desktop/sample_iPod.m4v");
>         InputStream is = TikaInputStream.get(file);
>
>         Detector detector = new DefaultDetector(
>                 MimeTypes.getDefaultMimeTypes());
>
>         Metadata metadata = new Metadata();
>         metadata.set(Metadata.RESOURCE_NAME_KEY, file.getName());
>
>         System.out.println("File + filename: " + detector.detect(is, metadata));
>         System.out.println("File only: " + detector.detect(is, new Metadata()));
>         System.out.println("Filename only: " + detector.detect(null, metadata));
>     }
> }
> {code}
>
> Renders the output:
> {code}
> File + filename: video/quicktime
> File only: video/quicktime
> Filename only: video/x-m4v
> {code}
>
> Moreover, if the same test is run against an M4A file, the results are even more incorrect:
> {code}
> File + filename: video/quicktime
> File only: video/quicktime
> Filename only: application/octet-stream
> {code}
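
For illustration only, here is one way a content-based check could tell these types apart. It is a minimal sketch, not Tika's MagicDetector logic or its mime type definitions. It assumes the file starts with an ISO base media "ftyp" box (4-byte size, the ASCII tag "ftyp", then a 4-byte major brand), which is common for MP4-family files but not guaranteed. The class name FtypBrandSniffer, its methods, and the brand-to-type mapping (including audio/mp4 for M4A and the video/mp4 fallback) are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Tika): guesses a media type from the "ftyp" major brand. */
final class FtypBrandSniffer {

    private static final String UNKNOWN = "application/octet-stream";

    /**
     * Inspects the first 12 bytes of a file.
     * Assumed layout: bytes 0-3 box size, bytes 4-7 "ftyp", bytes 8-11 major brand.
     */
    static String guessType(byte[] header) {
        if (header == null || header.length < 12) {
            return UNKNOWN;
        }
        String boxType = new String(header, 4, 4, StandardCharsets.US_ASCII);
        if (!"ftyp".equals(boxType)) {
            return UNKNOWN; // no leading ftyp box, so this sketch cannot decide
        }
        String brand = new String(header, 8, 4, StandardCharsets.US_ASCII);
        if (brand.startsWith("M4V")) {
            return "video/x-m4v";
        }
        if (brand.startsWith("M4A")) {
            return "audio/mp4"; // assumed mapping for this example
        }
        if (brand.equals("qt  ")) {
            return "video/quicktime";
        }
        return "video/mp4"; // assumed generic fallback for other brands
    }

    /** Builds a synthetic 12-byte header with the given 4-character brand, for demonstration. */
    static byte[] header(String brand) {
        byte[] h = new byte[12];
        h[3] = 0x20; // arbitrary box size of 32 bytes
        System.arraycopy("ftyp".getBytes(StandardCharsets.US_ASCII), 0, h, 4, 4);
        System.arraycopy(brand.getBytes(StandardCharsets.US_ASCII), 0, h, 8, 4);
        return h;
    }

    public static void main(String[] args) {
        for (String brand : new String[] {"M4V ", "M4A ", "qt  "}) {
            System.out.println("[" + brand + "] -> " + guessType(header(brand)));
        }
    }
}
```

The sketch looks only at the major brand. A real detector would probably also need to handle the compatible-brands list and files that have no leading ftyp box.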