Hi,
You could actually use Lius as a text extraction API: I have implemented,
for each Indexer, a method that lets you get the String content of the
document.
Lius could be used as a starting point for the Tika project, if the Tika
committers are interested in it. We can also, as Mark said, decouple Lius's
parser logic from its indexing logic.
Taking the project into the Apache Incubator could also be interesting, to
get more people involved in it.

My goal is to join our efforts to build a framework for text extraction.
Here is an example of text extraction with Lius:

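// Load the Lius configuration from its path
LiusConfig lc =
    LiusConfigBuilder.getSingletonInstance().getLiusConfig(liusConfigPathString);

// Pick the Indexer matching the document, then read its text content
Indexer indexer = IndexerFactory.getIndexer(documentToIndex, lc);
String text = indexer.getContent();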
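
To illustrate the decoupling of parser logic from indexing logic mentioned above, here is a rough sketch of what a standalone extraction interface might look like. All the names in it (ExtractionSketch, TextExtractor, PlainTextExtractor, extractFromBytes) are made up for this example and are not part of Lius; it only handles plain UTF-8 text and is meant as a starting point for discussion, not a proposed design.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these types exist in Lius.
public class ExtractionSketch {

    /** Parser side only: turns a document stream into text, with no knowledge of indexing. */
    public interface TextExtractor {
        String extract(InputStream in) throws IOException;
    }

    /** Simplest possible extractor: reads the whole stream and decodes it as UTF-8. */
    public static class PlainTextExtractor implements TextExtractor {
        @Override
        public String extract(InputStream in) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Convenience helper: runs an extractor over an in-memory document. */
    public static String extractFromBytes(TextExtractor extractor, byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return extractor.extract(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "Hello Tika".getBytes(StandardCharsets.UTF_8);
        // An indexer (Lucene or anything else) would consume this String.
        String text = extractFromBytes(new PlainTextExtractor(), data);
        System.out.println(text);
    }
}
```

With an interface like this, an indexer would only depend on the extracted String, so the same extractors could be reused outside Lucene.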


On 3/1/07, Jukka Zitting <[EMAIL PROTECTED]> wrote:


Hi,

I am interested in a Lius/Tika project that could be used not only with
Lucene. As mentioned by Mark, there are a number of related efforts, which
leads me to believe an application-independent content analysis/parsing
tool would be very helpful for many users.

I'd like to propose taking the project to the Apache Incubator to better
attract interest also from outside Lucene.

BR,

Jukka Zitting

--
View this message in context:
http://www.nabble.com/Lius-into-apache-incubator-tf3145937.html#a9247508
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

