RE: Indexing nouns only - UIMA vs. OpenNLP

2013-02-01 Thread Kai Gülzau
Hi Lance,
> About removing non-nouns: the OpenNLP patch includes two simple
> TokenFilters for manipulating terms with payloads. The
> FilterPayloadFilter lets you keep or remove terms with given payloads.
Yes, I used this already in the schema.xml
> payloadList="NN,NNS,NNP,NNPS,FM" keepPayloa

Re: Indexing nouns only - UIMA vs. OpenNLP

2013-01-31 Thread Lance Norskog
Thanks, Kai! About removing non-nouns: the OpenNLP patch includes two simple TokenFilters for manipulating terms with payloads. The FilterPayloadFilter lets you keep or remove terms with given payloads. In the demo schema.xml, there is an example type that keeps only nouns and verbs. There is a
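
For readers skimming the thread, the keep-or-remove behaviour described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the Lucene/Solr filter from the OpenNLP patch: the `filter_by_payload` function, its `keep` flag, and the `(term, tag)` tuples standing in for terms with payloads are all hypothetical names invented for this sketch. The tag set is the `payloadList` value quoted in the thread.

```python
from typing import FrozenSet, Iterable, List, Tuple

# A (term, tag) pair stands in for an analyzed term and its
# part-of-speech payload. Hypothetical representation, not Lucene's.
Token = Tuple[str, str]

# Tag list taken from the payloadList value quoted in the thread.
NOUN_TAGS: FrozenSet[str] = frozenset({"NN", "NNS", "NNP", "NNPS", "FM"})


def filter_by_payload(
    tokens: Iterable[Token],
    payloads: FrozenSet[str] = NOUN_TAGS,
    keep: bool = True,
) -> List[Token]:
    """Keep (keep=True) or remove (keep=False) tokens whose tag is in payloads."""
    return [(term, tag) for term, tag in tokens if (tag in payloads) == keep]


if __name__ == "__main__":
    tagged = [("Solr", "NNP"), ("indexes", "VBZ"), ("nouns", "NNS"), ("only", "RB")]
    print(filter_by_payload(tagged))              # tokens tagged as nouns
    print(filter_by_payload(tagged, keep=False))  # everything else
```

Flipping `keep` covers both cases mentioned in the message: keeping only terms with the listed payloads, or dropping exactly those terms.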

RE: Indexing nouns only - UIMA vs. OpenNLP

2013-01-31 Thread Kai Gülzau
UIMA: I just found this issue: https://issues.apache.org/jira/browse/SOLR-3013
Now I am able to use this analyzer for English texts and filter (un)wanted token types :-)
Open issue -> How do I set the ModelFile for the Tagger to "german/TuebaModel.dat"?
OpenNLP: And a mod