Better HWP (Hangul Word Processor) detection pattern
----------------------------------------------------

                 Key: TIKA-330
                 URL: https://issues.apache.org/jira/browse/TIKA-330
             Project: Tika
          Issue Type: Improvement
          Components: mime
            Reporter: Jukka Zitting
            Assignee: Jukka Zitting
            Priority: Minor


The current magic byte pattern for the HWP (Hangul Word Processor,
application/x-hwp) file format also matches our test-outlook.msg test file.
I looked for a better detection pattern and found one in the OpenOffice.org
sources.

The hwpfilter/source/hwpfile.cpp file suggests that all HWP files start with 
the signature string "HWP Document File V", so I'll change the detection 
pattern accordingly.
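
For illustration, here is a minimal sketch of the prefix check implied by that
signature. It is not the actual Tika change: the class and method names
(HwpSignatureCheck, looksLikeHwp) are hypothetical, and it assumes the
signature sits at offset 0 and is plain ASCII, which is what "start with"
suggests.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tika: checks whether a buffer begins
// with the HWP signature string quoted in this issue.
public class HwpSignatureCheck {

    // Signature taken from the issue text; assumed to be ASCII at offset 0.
    private static final byte[] SIGNATURE =
            "HWP Document File V".getBytes(StandardCharsets.US_ASCII);

    // Returns true only if the first bytes of 'header' equal the signature.
    public static boolean looksLikeHwp(byte[] header) {
        if (header == null || header.length < SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (header[i] != SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up sample header; the version suffix is only an example.
        byte[] hwp = "HWP Document File V3.00"
                .getBytes(StandardCharsets.US_ASCII);
        // First bytes of an OLE2 compound document, the container format
        // used by Outlook .msg files.
        byte[] ole2 = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                       (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};
        System.out.println(looksLikeHwp(hwp));
        System.out.println(looksLikeHwp(ole2));
    }
}
```

Matching the full string rather than a few leading bytes is what keeps
unrelated formats such as .msg from being picked up by this check.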

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.