Better HWP (Hangul Word Processor) detection pattern
----------------------------------------------------
Key: TIKA-330
URL: https://issues.apache.org/jira/browse/TIKA-330
Project: Tika
Issue Type: Improvement
Components: mime
Reporter: Jukka Zitting
Assignee: Jukka Zitting
Priority: Minor
The current magic byte pattern for the HWP (Hangul Word Processor,
application/x-hwp) file format also matches our test-outlook.msg test file,
so it is too loose. I looked for a better detection pattern and found one in
OpenOffice.org: the hwpfilter/source/hwpfile.cpp file suggests that all HWP
files start with the signature string "HWP Document File V", so I'll change
the detection pattern accordingly.
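
For illustration only, the check implied by that signature could be sketched
in Java roughly as follows. This is not the actual Tika change (Tika declares
its magic patterns in its MIME type configuration rather than in code like
this). The class name HwpSignature and the method looksLikeHwp are
hypothetical, and the sketch assumes the signature sits at offset 0 and is
plain ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Tika: checks a file header for the
// HWP signature string quoted in this issue.
final class HwpSignature {

    // Signature string taken from the issue text; assumed to be ASCII.
    static final byte[] SIGNATURE =
            "HWP Document File V".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the given header bytes begin with the signature
    // at offset 0 (assumption: the signature is the very first bytes).
    static boolean looksLikeHwp(byte[] header) {
        if (header == null || header.length < SIGNATURE.length) {
            return false;
        }
        byte[] prefix = Arrays.copyOf(header, SIGNATURE.length);
        return Arrays.equals(prefix, SIGNATURE);
    }
}
```

A caller would read at least the first 19 bytes of a file and pass them to
looksLikeHwp. A header that does not begin with the signature string, such as
that of an Outlook .msg file, would then be rejected.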