[ https://issues.apache.org/jira/browse/TIKA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885862#action_12885862 ]
Jukka Zitting commented on TIKA-402:
------------------------------------

BTW, the XHTMLContentHandler is an extended ContentHandler, so you can also use lower level methods like characters(char[],int,int). I did that in revision 961266 to avoid having to instantiate extra String objects in the iwork content handlers.

> Support for iWork documents
> ---------------------------
>
>                 Key: TIKA-402
>                 URL: https://issues.apache.org/jira/browse/TIKA-402
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 0.8
>
>         Attachments: iwork.patch, iwork.patch, iwork.patch, iwork.patch,
> iwork.patch, iwork.patch, iwork.patch, testKeynote.key, testKeynote.key,
> testNumbers.numbers, testPages.pages
>
>
> It would be nice to have support for documents created by Apple's Keynote and
> Pages applications. Both file formats are described in
> http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html.
> I'm not sure if there already are open source parser libraries for these
> formats or if we'd need to directly process the XML content.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
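
For readers following the comment about characters(char[],int,int): below is a minimal sketch of the idea, not the code from revision 961266. It uses only the JDK's org.xml.sax types, so it runs without Tika. The class names CollectingHandler, ForwardingHandler and CharactersDemo are hypothetical; CollectingHandler merely stands in for a downstream handler such as XHTMLContentHandler.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the downstream handler (e.g. an XHTMLContentHandler):
// it simply collects whatever character data it is given.
class CollectingHandler extends DefaultHandler {
    final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }
}

// Hypothetical parser-side handler, loosely in the spirit of the iWork handlers.
class ForwardingHandler extends DefaultHandler {
    private final ContentHandler downstream;

    ForwardingHandler(ContentHandler downstream) {
        this.downstream = downstream;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Pass the slice of the parser's buffer straight through.
        // No intermediate new String(ch, start, length) is created here.
        downstream.characters(ch, start, length);
    }
}

class CharactersDemo {
    // Feeds a slice of a larger buffer through the forwarding handler
    // and returns what the downstream handler collected.
    static String run() {
        CollectingHandler sink = new CollectingHandler();
        ForwardingHandler handler = new ForwardingHandler(sink);
        char[] buffer = "xxHello iWorkxx".toCharArray();
        try {
            handler.characters(buffer, 2, 11); // the 11 chars starting at index 2
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return sink.text.toString();
    }
}
```

Calling CharactersDemo.run() returns "Hello iWork". The point of the design is that the (buffer, offset, length) triple is handed on unchanged, so the forwarding step itself allocates nothing per SAX event.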