[ https://issues.apache.org/jira/browse/TIKA-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15295532#comment-15295532 ]
Stefan Bodewig commented on TIKA-1966:
--------------------------------------

Ah, looks as if I misunderstood the format. The numbers/pages/key files are ZIPs that contain IWA files which again use Snappy.

I'll stick with the files created by Nick. Please ignore the noise.

> Issue in parsing iWorksDocument with Apache Tika
> ------------------------------------------------
>
> Key: TIKA-1966
> URL: https://issues.apache.org/jira/browse/TIKA-1966
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Affects Versions: 1.12
> Environment: Ubuntu 15
> Reporter: Sachin Shaju
> Attachments: budget.numbers, connors_20040127.key, pages.pages, sample code
>
>
> I was trying to parse iWorksDoc with Apache Tika. But am not getting parsed
> content as it is instead getting some other output from the content handler.
> Code snippet that I've used is attached with this.
> Output :-
> Contents of the file :
> Index/Document.iwa
> Index/ViewState.iwa
> Index/CalculationEngine.iwa
> Index/Tables/HeaderStorageBucket-2.iwa
> Index/Tables/Tile.iwa
> Index/Metadata.iwa
> Metadata/Properties.plist
> I'm able to detect the file type using Detector api correctly. But am not
> getting the useful content out of the document.
> I'm attaching the iWorks docs that I've tested with (made with latest version
> of iOS). I got it working when testing with older versions. Thanks

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
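For anyone who wants to check the container layout described in the comment (a ZIP whose Index/ entries are .iwa files), here is a minimal sketch using only the Python standard library. The function names `list_iwa_entries` and `build_sample_container` are made up for illustration, and the sample archive is synthetic: it reuses a few entry names from the issue's output with placeholder payloads, not real iWork data. The sketch only lists the .iwa members and does not attempt to decode their Snappy-compressed contents.

```python
import io
import zipfile


def list_iwa_entries(data: bytes) -> list[str]:
    """Return the names of all .iwa members in a ZIP container given as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [name for name in archive.namelist() if name.endswith(".iwa")]


def build_sample_container() -> bytes:
    """Build a synthetic ZIP that mimics the entry layout shown in the issue.

    Hypothetical stand-in for a real .numbers/.pages/.key file: the entry
    names come from the issue's output, the payloads are placeholders.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in (
            "Index/Document.iwa",
            "Index/Metadata.iwa",
            "Metadata/Properties.plist",
        ):
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()


if __name__ == "__main__":
    # Print only the members that would need IWA (Snappy) decoding.
    for entry in list_iwa_entries(build_sample_container()):
        print(entry)
```

To try it against a real document, read the file's bytes (for example the attached budget.numbers) and pass them to `list_iwa_entries` instead of the synthetic container.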