[ https://issues.apache.org/jira/browse/TIKA-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012389#comment-14012389 ]
Nick Burch commented on TIKA-1204:
----------------------------------

Really, we need a file that you either produce yourself or find elsewhere under a suitably permissive license. Sadly, we can't just add random copyrighted files from the internet...

> DWFX files detection
> --------------------
>
>                 Key: TIKA-1204
>                 URL: https://issues.apache.org/jira/browse/TIKA-1204
>             Project: Tika
>          Issue Type: Improvement
>          Components: detector, mime
>    Affects Versions: 1.4
>            Reporter: Marco Quaranta
>            Priority: Minor
>         Attachments: General assembly filter.dwfx
>
>
> DWFX files are AutoCAD [Design Web Format|http://en.wikipedia.org/wiki/Design_Web_Format] files and follow the [Open Packaging Conventions|http://en.wikipedia.org/wiki/Open_Packaging_Conventions].
> Tika "correctly" detects these files as application/zip.
> It would be better if Tika could recognize the true mimetype: model/vnd.dwfx+xps. (y)
> Please add logic to ZipContainerDetector so that it can detect DWFX. We need a method that behaves like detectOfficeOpenXML(OPCPackage pkg):
> {noformat}
> PackageRelationshipCollection core =
>     pkg.getRelationshipsByType("http://schemas.autodesk.com/dwfx/2007/relationships/documentsequence");
> if (core.size() != 1) {
>     // Invalid DWFX Package received
>     return null;
> }
> PackagePart corePart = pkg.getPart(core.getRelationship(0));
> String coreType = corePart.getContentType();
> return MediaType.parse(coreType);
> {noformat}
> Thank you,
> Marco

--
This message was sent by Atlassian JIRA
(v6.2#6252)
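The relationship-based check proposed in the quoted issue can be approximated with the JDK alone, which may help when experimenting without the POI classes. What follows is a minimal sketch under stated assumptions, not Tika's implementation: the class name DwfxSniffer and its helper methods are hypothetical, it assumes the package-level relationships are stored in the `_rels/.rels` entry as the Open Packaging Conventions describe, and it uses a plain substring match rather than parsing the XML or checking that exactly one matching relationship exists, as the proposed method does. It also returns the fixed type named in the issue instead of reading the content type of the target part.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only; not part of Tika or POI.
class DwfxSniffer {
    // Relationship type and media type as given in the issue text.
    static final String DWFX_REL =
        "http://schemas.autodesk.com/dwfx/2007/relationships/documentsequence";
    static final String DWFX_MIME = "model/vnd.dwfx+xps";

    // Returns DWFX_MIME if the package relationships mention the DWFX
    // document sequence relationship type, otherwise null.
    static String detect(InputStream in) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"_rels/.rels".equals(entry.getName())) {
                    continue;
                }
                // Read the whole relationships part into memory.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                String xml = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                // Simplification: substring match instead of XML parsing.
                return xml.contains(DWFX_REL) ? DWFX_MIME : null;
            }
        }
        return null; // no package relationships part found
    }

    // Builds an in-memory zip holding only a _rels/.rels entry, so the
    // sketch can be exercised without a real (licensed) DWFX file.
    static byte[] zipWithRels(String relsXml) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("_rels/.rels"));
            zip.write(relsXml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String dwfxRels = "<Relationships><Relationship Id=\"r1\" Type=\""
            + DWFX_REL + "\" Target=\"/seq.dwfseq\"/></Relationships>";
        byte[] dwfxLike = zipWithRels(dwfxRels);
        byte[] plainOpc = zipWithRels("<Relationships/>");
        System.out.println(detect(new ByteArrayInputStream(dwfxLike)));
        System.out.println(detect(new ByteArrayInputStream(plainOpc)));
    }
}
```

Because the test package is generated in memory, a sketch like this avoids the licensing problem raised in the comment, although a real test in Tika would still want a genuine DWFX sample.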