non-binding +1 from me.

On 18.03.2007 10:51:37 Jukka Zitting wrote:
<snip/>
> [ ] +1 Accept Tika as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
<snip/>
> Instead of implementing its own document parsers, Tika will use existing
> parser libraries like Jakarta POI [1] and PDFBox [2].

I would like to make the Tika people aware that we've recently started a
little XMP framework as part of the XML Graphics Project. XMP is used
with a number of document formats, PDF being the most prominent. It
could be interesting to work together on this. I've also been in contact
with Ben Litchfield, author of PDFBox, about possibly joining forces on
the topic. However, not much has happened so far.

At the moment, the XMP code only covers what is necessary to implement
the very basics of the PDF/A-1b specification. But I'm sure it can
easily be enhanced to serve a wider audience. I already see the need to
take the code a step further in order to cover extension schemas, as
mandated by the PDF/A-1 standard. Finally, the code doesn't absolutely
have to stay within XML Graphics, I guess, but that's only me speaking.

Links:
http://xmlgraphics.apache.org/commons/
http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/xmp/

<snip/>

Jeremias Maerki
(watching with interest)