Adding document format libraries as subprojects of Tika still "hides" them somewhat. So this wouldn't really solve the problem of easily finding such libraries. If new libraries should be developed, I would think that a lab or Commons is better suited.
There were many talks over the years about creating an image library inside the ASF but it has never developed into a real effort. It's a lot of work and with ImageIO built into the JDK only exotic wishes are still open.

If we had a Tika Wiki we could at least list potential existing libraries and libraries that we'd like but don't exist. We could list licenses, candidates for incubation, quality/maturity indicators...

Inside the XML Graphics project, we have the following available (if anyone is interested to know):
* XMP metadata framework in XML Graphics Commons, read/write, work in progress
* PostScript DSC in XML Graphics Commons, read/write (no PS interpreter!)
* PNG and TIFF codecs in XML Graphics Commons, read/write
* PDF in FOP, write only
* RTF in FOP, write only
* SVG in Batik, read/write

Others:
* PDF (PDFBox @SourceForge), read/write, signalled interest for incubation

personal wishlist:
* ODF, read/write
* Mars, read/write

On 10.07.2007 09:18:33 Carsten Ziegeler wrote:
> Afaik there is currently no central place at Apache where
> libraries/frameworks for handling of specific document formats are
> developed. We have single projects like poi of course.
>
> If you are searching for java libraries which support a specific format,
> like some image formats, you'll find many libraries of varying quality
> and it's really hard (if not impossible) to choose a correct one.
>
> I'm wondering if something could be done about it by starting a project
> at Apache which supports various file formats (like images, mp3 etc.) -
> perhaps by incubating some existing stuff.
>
> Although Tika is more the framework for plugin in such stuff, it perhaps
> makes sense to try to start something like that as sub projects of Tika?
>
> WDYT?
>
> Carsten
> --
> Carsten Ziegeler
> [EMAIL PROTECTED]
>

Jeremias Maerki
