Re: Patch for Lucene/Solr

Svetoslav Marinov Fri, 01 Jun 2012 04:19:14 -0700

At Findwise we actively use a number of OpenNLP components with both Hydra
and OpenPipeline when indexing with Solr.


I look forward to seeing the result of the patch!

Best,
Svetoslav

On 2012-05-31 23:10, "Lance Norskog" <[email protected]> wrote:

>Thanks. I have looked at UIMA several times and it seemed very
>complex. It has a lot of features, is mature, has an Eclipse app
>builder, etc. I could not keep it all in my head at once. The
>Solr/Lucene document pipeline features give little space for NLP
>features. Hydra or OpenPipeline give UIMA and OpenNLP "room to
>breathe".
>
>Are there free annotated text databases for UIMA? OpenNLP does not use
>any with open licences. It has binary models made from copyrighted
>annotations and so they cannot be checked into Apache.
>
>On Wed, May 30, 2012 at 6:11 PM, Christian Moen <[email protected]> wrote:
>> Hello Lance,
>>
>> This is very cool!  I'm looking forward to having a look at this.
>>
>>
>> Christian Moen
>> http://atilika.com
>>
>> On May 31, 2012, at 9:54 AM, Lance Norskog wrote:
>>
>>> I'm creating a patch to integrate OpenNLP into the Lucene/Solr
>>> project. The SentenceDetector, Tokenizer, POS tagger, Chunker, and NER
>>> tools are included. The SentenceDetector and Tokenizer are a Lucene
>>> Tokenizer, and a Lucene TokenFilter takes this stream and runs
>>> POS/Chunking/NER on it, saving the tags as upper-case payloads. The
>>> patch includes a couple of handy combinations. For example, make a
>>> more focused search index by only indexing the nouns & verbs.
>>>
>>> Do you have any hints on how to package it? The documentation should
>>> include how to download and install the models.
>>>
>>> --
>>> Lance Norskog
>>> [email protected]
>>
>
>
>
>-- 
>Lance Norskog
>[email protected]
>
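
The "index only the nouns & verbs" combination mentioned in the quoted
message can be illustrated with a small sketch. This is not code from the
patch: it uses neither Lucene nor OpenNLP, the token/tag pairs are
hand-written sample data, and the function name is hypothetical. It also
assumes the tags follow the Penn Treebank convention (noun tags start
with "NN", verb tags with "VB"), which the thread does not state.

```python
# Hypothetical (token, tag) pairs, standing in for the output of a POS tagger.
# Assumption: upper-case Penn Treebank tags; the sample data is made up.
TAGGED = [
    ("The", "DT"),
    ("patch", "NN"),
    ("integrates", "VBZ"),
    ("OpenNLP", "NNP"),
    ("into", "IN"),
    ("Solr", "NNP"),
]


def keep_nouns_and_verbs(tagged_tokens):
    """Return only the tokens tagged as a noun (NN*) or a verb (VB*)."""
    return [tok for tok, tag in tagged_tokens if tag.startswith(("NN", "VB"))]


print(keep_nouns_and_verbs(TAGGED))
```

In a real analysis chain this filtering would presumably happen inside a
token filter that reads the tag stored with each token, rather than on a
plain list as shown here.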
