[ 
https://issues.apache.org/jira/browse/TIKA-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Burch resolved TIKA-1823.
------------------------------
       Resolution: Fixed
    Fix Version/s: 1.13

Thanks, I've added this magic, along with a unit test and some more specific 
magic that can also give version information, in 38fbc504 & 6a092332.

> Support detecting DWF format
> ----------------------------
>
>                 Key: TIKA-1823
>                 URL: https://issues.apache.org/jira/browse/TIKA-1823
>             Project: Tika
>          Issue Type: Improvement
>          Components: detector, mime
>            Reporter: Luca Moretti
>            Priority: Minor
>              Labels: detection, dwf, mime
>             Fix For: 1.13
>
>         Attachments: blocks_and_tables.dwf
>
>
> Tika currently detects dwf files as application/octet-stream.
> For Tika's mime magic detector to recognize dwf files correctly, this code 
> fragment should be added to the _tika-mimetypes.xml_ registry:
> {code:xml}
> <mime-type type="model/vnd.dwf">
>       <acronym>dwf</acronym>
>       <_comment>Design Web Format</_comment>
>       <magic priority="50">
>               <match type="string" offset="0" value="(DWF V">
>                       <match type="string" offset="8" value=".">
>                               <match type="string" offset="11" value=")" />
>                       </match>
>               </match>
>       </magic>
>       <glob pattern="*.dwf" />
> </mime-type>
> {code}
> \\
> In the current version (DWF 6.0), a dwf file is a ZIP-compressed container for 
> vector-based CAD drawings. It is basically a ZIP archive with the _(DWF 
> V06.00)_ signature added before the regular ZIP magic number. For this 
> reason, the match value to detect such dwf files would be: {{(DWF V06.00)PK}}.
> In previous versions, the dwf data transport is not a ZIP file format, so 
> the magic number is only the _(DWF V00.55)_ signature in the file header.
> To make Tika detect dwf files of these versions too, I propose the match 
> values in the code above.
> Thanks,
> Luca
> \\
> P.S.: The DWF format specification is included in the DWF Toolkit. The DWF 
> Toolkit is available for free at [http://www.autodesk.com/dwftoolkit]
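
For illustration only, the nested match proposed in the issue can be sketched outside Tika as a few byte comparisons. This is not the code committed for TIKA-1823; the function names `dwf_version` and `is_zip_based_dwf` are hypothetical, and the sketch assumes the signature layout described above: "(DWF V" at offset 0, "." at offset 8, ")" at offset 11, optionally followed by the ZIP "PK" marker at offset 12.

```python
from typing import Optional


def dwf_version(header: bytes) -> Optional[str]:
    """Return the version text (e.g. "06.00") if header starts with a
    DWF signature of the form "(DWF Vxx.yy)", otherwise None.

    Hypothetical helper mirroring the three nested matches in the
    proposed XML; like that XML, it does not check that xx and yy
    are digits.
    """
    if len(header) < 12:
        return None
    if header[0:6] != b"(DWF V":      # match at offset 0
        return None
    if header[8:9] != b".":           # match at offset 8
        return None
    if header[11:12] != b")":         # match at offset 11
        return None
    return header[6:11].decode("ascii", errors="replace")


def is_zip_based_dwf(header: bytes) -> bool:
    """True if a DWF signature is immediately followed by the ZIP "PK" marker."""
    return dwf_version(header) is not None and header[12:14] == b"PK"
```

Reading the first 14 bytes of a file is enough for both checks, for example `dwf_version(open(path, "rb").read(14))`, where `path` is whatever file is being inspected.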



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
