[ 
https://issues.apache.org/jira/browse/TIKA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873735#action_12873735
 ] 

Martijn van Groningen commented on TIKA-402:
--------------------------------------------

I'm also looking forward to having this feature in 0.8! I'm not sure what you 
mean by 'to use our existing generic XML root element detection mechanism 
for these file formats'. Which class(es) do you mean specifically? 
Also, how would the PackageParser be able to parse the contents of a directory? 
(The class only works for archives; correct me if I'm wrong.) I've only seen 
this happen with Keynote presentations made with iWork '06 (I'm not sure if 
I had the compression disabled for Keynote at that time). 

I also agree with Chris. There are probably more formats out there that use the 
same mechanism as iWork, but maybe we can wait to refactor until it's necessary 
(Agile... to throw in a buzzword).

BTW, it was a good idea to use the ContentHandlerDecorator. Too bad I missed that 
one!



> Support for iWork documents
> ---------------------------
>
>                 Key: TIKA-402
>                 URL: https://issues.apache.org/jira/browse/TIKA-402
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>         Attachments: iwork.patch, iwork.patch, iwork.patch, iwork.patch, 
> iwork.patch, testKeynote.key, testKeynote.key, testNumbers.numbers, 
> testPages.pages
>
>
> It would be nice to have support for documents created by Apple's Keynote and 
> Pages applications. Both file formats are described in 
> http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html.
>  I'm not sure if there already are open source parser libraries for these 
> formats or if we'd need to directly process the XML content.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
