[
https://issues.apache.org/jira/browse/TIKA-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930392#comment-17930392
]
Tim Allison commented on TIKA-4381:
-----------------------------------
I attached to this issue the hacky parser I wrote to extract property info from
MS-OXPROPS. It generates the table of props that the ExtendedMetadataExtractor
is now using.
I first copied/pasted the text from the PDF spec into a text file. I then had
to manually fix at least one typo and manually add some spaces to work around
rare page break issues.
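For anyone who wants the general idea without opening the attachment, here is a
minimal sketch of that kind of parsing. It is not the attached Parser.java. The
class name OxPropsSketch, the Prop holder and the regexes are hypothetical, and
the "Canonical name:" / "Property ID:" / "Property long ID (LID):" /
"Data type:" line shapes are my assumption about how the pasted spec text
looks. The sample entries in main are illustrative and should be checked
against the spec.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: turns pasted MS-OXPROPS text into a list of props.
class OxPropsSketch {

    // Simple holder for one property entry (illustrative, not Tika's type).
    static final class Prop {
        final String name;
        final String id;
        final String dataType;

        Prop(String name, String id, String dataType) {
            this.name = name;
            this.id = id;
            this.dataType = dataType;
        }

        @Override
        public String toString() {
            return name + "\t" + id + "\t" + dataType;
        }
    }

    // Assumed line shapes in the pasted text.
    private static final Pattern NAME =
            Pattern.compile("^Canonical name:\\s*(\\S+)");
    private static final Pattern ID =
            Pattern.compile("^Property (?:long )?ID(?: \\(LID\\))?:\\s*(0x[0-9A-Fa-f]+)");
    private static final Pattern TYPE =
            Pattern.compile("^Data type:\\s*(\\w+)");

    static List<Prop> parse(String text) {
        List<Prop> props = new ArrayList<>();
        String name = null;
        String id = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            Matcher m;
            if ((m = NAME.matcher(line)).find()) {
                // A new entry starts; forget any id from the previous one.
                name = m.group(1);
                id = null;
            } else if ((m = ID.matcher(line)).find()) {
                id = m.group(1);
            } else if ((m = TYPE.matcher(line)).find() && name != null) {
                // Data type is assumed to follow the id line; emit the entry.
                props.add(new Prop(name, id, m.group(1)));
                name = null;
            }
        }
        return props;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "Canonical name: PidTagDisplayName",
                "Property ID: 0x3001",
                "Data type: PtypString, 0x001F",
                "Canonical name: PidLidAppointmentStartWhole",
                "Property set: PSETID_Appointment",
                "Property long ID (LID): 0x0000820D",
                "Data type: PtypTime, 0x0040");
        for (Prop p : parse(sample)) {
            System.out.println(p);
        }
    }
}
```

A line-oriented parse like this depends on each field starting its own line,
which is why the missing spaces around page breaks had to be fixed by hand in
the text file first.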
> Improve extraction of metadata from Appointment/Task msgs
> ---------------------------------------------------------
>
> Key: TIKA-4381
> URL: https://issues.apache.org/jira/browse/TIKA-4381
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Attachments: Parser.java
>
>
> Our metadata extraction on msgs is mostly focused on "NOTE"/regular emails.
> We could do more to improve extraction from appointments, tasks and other msg
> types.