Hi all,
I am working on integrating Apache UIMA as an UpdateRequestProcessor for
Apache Solr, and I now have a first working snapshot.
I have put the code on Google Code [1], and you can take a look at the
tutorial [2].

I would be glad to donate it to the Apache Solr project, as I think it could
be a useful module to trigger automatic content extraction while indexing
documents.

At the moment the base UIMAUpdateRequestProcessor implementation can
automatically extract a document's sentences, language, keywords, concepts and
named entities using Apache UIMA's HMMTagger, OpenCalaisAnnotator and
AlchemyAPIAnnotator components (and it can easily be extended with others).

Any feedback is welcome.
Have a nice day.
Tommaso

[1] : http://code.google.com/p/solr-uima/
[2] : http://code.google.com/p/solr-uima/wiki/5MinutesTutorial
