Bertrand, I would suggest replacing "various documents" with "various document formats". Other than that, it looks fine to me.
--Thilo

Bertrand Delacretaz wrote:
> I'll add this to http://wiki.apache.org/incubator/July2007 this
> afternoon (in about 6 hours), please yell if something's wrong or
> missing.
>
> <report>
> Tika is a toolkit for detecting and extracting metadata and structured
> text content from various documents using existing parser
> libraries. Tika entered incubation on March 22nd, 2007.
>
> Community
>
> The Tika mailing list has seen increased activity in the last weeks,
> with some new people showing interest for Tika's goals.
>
> Grant Ingersoll brought the Aperture framework to our attention
> (http://aperture.sourceforge.net/), which has similar goals to Tika.
> We will look at possible synergies.
>
> Development
>
> No code has been committed since our last report, but some initial
> code is ready in JIRA and should be committed soon.
>
> Issues before graduation
>
> No changes since our last report: the Tika project is still at an
> early stage of incubation. We need to continue bringing in the initial
> codebases and probably target for an initial incubating release later
> this year. We also need to work on growing the community and figuring
> out how to best interact with external parser projects.
> </report>
