[ 
https://issues.apache.org/jira/browse/TIKA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jukka Zitting reopened TIKA-402:
--------------------------------


Reopening for a minor test failure on Java 5 (see revision 960892). In some 
cases the parser appears to lose whitespace between words. This is probably 
related to how the XML parser in the underlying Java version behaves, perhaps 
a difference in whether inter-element whitespace is reported through 
characters() or ignorableWhitespace() calls.
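
To illustrate the suspected cause, here is a minimal sketch (not the actual 
Tika parser code) of a SAX handler that collects text from both callbacks, so 
the extracted text does not depend on which one a given XML parser uses for 
whitespace. The class name WhitespaceAwareHandler and the extract() helper are 
hypothetical and exist only for this example; the SAX and JAXP calls are the 
standard org.xml.sax and javax.xml.parsers APIs.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration only; not part of Tika.
public class WhitespaceAwareHandler extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Regular character data, including whitespace on most parsers.
        text.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // DefaultHandler does nothing here, so whitespace reported through
        // this callback would otherwise be dropped. Keep it as text.
        text.append(ch, start, length);
    }

    public String getText() {
        return text.toString();
    }

    // Convenience helper (an assumption of this sketch): parse an XML
    // string and return all collected text.
    public static String extract(String xml) {
        WhitespaceAwareHandler handler = new WhitespaceAwareHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getText();
    }
}
```

With this handler, extract("<p><b>Hello</b> <i>world</i></p>") keeps the space 
between the two words whichever callback the parser uses to deliver it. 
Whether this is the right fix for the failing test is still an assumption 
until the Java 5 behaviour is confirmed.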

> Support for iWork documents
> ---------------------------
>
>                 Key: TIKA-402
>                 URL: https://issues.apache.org/jira/browse/TIKA-402
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 0.8
>
>         Attachments: iwork.patch, iwork.patch, iwork.patch, iwork.patch, 
> iwork.patch, iwork.patch, testKeynote.key, testKeynote.key, 
> testNumbers.numbers, testPages.pages
>
>
> It would be nice to have support for documents created by Apple's Keynote and 
> Pages applications. Both file formats are described in 
> http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html.
>  I'm not sure if there already are open source parser libraries for these 
> formats or if we'd need to directly process the XML content.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.