Microsoft Project (MPP) basic support
-------------------------------------

                 Key: TIKA-789
                 URL: https://issues.apache.org/jira/browse/TIKA-789
             Project: Tika
          Issue Type: New Feature
          Components: parser
    Affects Versions: 1.0
            Reporter: Nick Burch
            Assignee: Nick Burch


Tika's support for the Microsoft Project file format (MPP) could be improved
fairly easily. The gaps to fill are:
 * Correct mimetype definition (it's OLE2 based)
 * OLE2 detection for MPP
 * Common OLE2 metadata extraction
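
As a rough illustration of the detection step, here is a minimal sketch in
plain Java that does not use Tika or POI. It only checks the 8-byte OLE2
(Compound File Binary) signature and the file extension. The class name
Ole2Sniffer and its methods are hypothetical, and the media type string
application/vnd.ms-project is assumed to be the one Tika would report. Real
detection would need to look inside the OLE2 container, since many formats
share the same signature.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of Tika or POI.
public class Ole2Sniffer {
    // Signature found at the start of every OLE2 / Compound File Binary file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Assumed media type for Microsoft Project files.
    static final String MPP_TYPE = "application/vnd.ms-project";

    // True if the given bytes start with the OLE2 signature.
    static boolean isOle2(byte[] header) {
        if (header == null || header.length < OLE2_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Reads only the first 8 bytes of the file and checks the signature.
    static boolean isOle2(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return isOle2(in.readNBytes(OLE2_MAGIC.length));
        }
    }

    // Returns the assumed MPP media type when the name ends in ".mpp" and the
    // content is OLE2; returns null otherwise. This is a heuristic only.
    static String guessMppType(String fileName, byte[] header) {
        boolean mppName = fileName != null
                && fileName.toLowerCase(Locale.ROOT).endsWith(".mpp");
        return (mppName && isOle2(header)) ? MPP_TYPE : null;
    }
}
```

Combining the extension with the signature avoids labelling a renamed
non-OLE2 file as MPP, but it cannot tell an MPP file from another OLE2
document that happens to carry the .mpp extension.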

For fuller support (such as text content), we'd probably want a parser that
uses MPXJ. However, as MPXJ is LGPL licensed, it would need to be an external
third-party parser. (MPXJ is built on top of POI but is under a more copyleft
license; POI itself doesn't have MPP support.)


        
