Hi, See http://wiki.apache.org/incubator/October2008
Find a draft below - I'll be offline next week, feel free to finalize and post.

-Bertrand

Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.

Tika entered incubation on March 22nd, 2007.

Community

Dave Meikle was just voted in as a new committer.
Paolo Mottadelli will present Tika at ApacheCon US.

Development

Tika 0.2 should be released soon.
Usage documentation has been added to the website.

Issues before graduation:

The current plan is to graduate as a Lucene subproject, which could happen soon as the incubation criteria seem to be met.