Hi,

You could actually use Lius as a text extraction API: for each Indexer I have implemented a method that lets you get the String content of the document. Lius could serve as a starting point for the Tika project, if the Tika committers are interested in it. As Mark said, we can also decouple Lius's parser logic from its indexing logic. Taking the project into the Apache Incubator could also be interesting, to get more people involved in it.
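To make the decoupling idea a bit more concrete, here is a rough sketch of how extraction could sit behind its own small interface, with the indexing side depending only on that interface. This is only an illustration under my own assumptions: the names TextExtractor, PlainTextExtractor and InMemoryIndexer are hypothetical and are not part of Lius, Tika or Lucene.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: text extraction only, with no knowledge of indexing.
interface TextExtractor {
    String extract(InputStream in) throws IOException;
}

// Hypothetical extractor for plain UTF-8 text; a real one would exist per format.
class PlainTextExtractor implements TextExtractor {
    @Override
    public String extract(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
}

// Hypothetical stand-in for the indexing side. It only sees the interface,
// so the extractor can be reused by applications that never index anything.
class InMemoryIndexer {
    private final TextExtractor extractor;
    private final List<String> indexed = new ArrayList<>();

    InMemoryIndexer(TextExtractor extractor) {
        this.extractor = extractor;
    }

    void index(InputStream in) throws IOException {
        indexed.add(extractor.extract(in));
    }

    List<String> contents() {
        return indexed;
    }
}

class ExtractionDemo {
    public static void main(String[] args) throws IOException {
        TextExtractor extractor = new PlainTextExtractor();
        InMemoryIndexer indexer = new InMemoryIndexer(extractor);
        byte[] doc = "some document text".getBytes(StandardCharsets.UTF_8);
        indexer.index(new ByteArrayInputStream(doc));
        System.out.println(indexer.contents());
    }
}
```

With this kind of split, an application that only needs the text can call the extractor directly and ignore the indexer entirely.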
My goal is to join our efforts to build a framework for text extraction. Here is an example of text extraction with Lius:

    LiusConfig lc = LiusConfigBuilder.getSingletonInstance().getLiusConfig(liusConfigPathString);
    Indexer indexer = IndexerFactory.getIndexer(documentToIndex, lc);
    String text = indexer.getContent();

On 3/1/07, Jukka Zitting <[EMAIL PROTECTED]> wrote:
Hi,

I am interested in a Lius/Tika project that could be used not only with Lucene. As mentioned by Mark, there are a number of related efforts, which leads me to believe an application-independent content analysis/parsing tool would be very helpful for many users.

I'd like to propose taking the project to the Apache Incubator to better attract interest also from outside Lucene.

BR,
Jukka Zitting