Hi Lance,

> About removing non-nouns: the OpenNLP patch includes two simple
> TokenFilters for manipulating terms with payloads. The
> FilterPayloadFilter lets you keep or remove terms with given payloads.

Yes, I already use this in schema.xml.

> <filter class="solr.FilterPayloadsFilterFactory"
> payloadList="NN,NNS,NNP,NNPS,FM" keepPayloads="true"/>
> <filter class="solr.StripPayloadsFilterFactory"/>

Works fine :-)

But as Robert Muir stated in LUCENE-4345, I also think that using types (and optionally storing them as payloads) would be a better approach.

> http://code.google.com/p/universal-pos-tags/

Thanks for the pointer, I used it to improve my English (Brown) whitelist for UIMA :-)

Regards,

Kai Gülzau
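
For readers following the thread, here is a rough sketch of what the quoted filter chain does conceptually: keep only tokens whose POS payload is on the whitelist, then drop the payloads. It is plain Python, not the Lucene/Solr API. The function names `filter_payloads` and `strip_payloads` and the sample tokens are made up for illustration, and the behaviour for `keep=False` is an assumption based on the "keep or remove" description above.

```python
from typing import Iterable, List, Set, Tuple

# A token is modelled as (term, POS tag); the tag stands in for the payload.
Token = Tuple[str, str]

# Tag whitelist taken from the payloadList attribute in the quoted config.
KEEP_TAGS: Set[str] = {"NN", "NNS", "NNP", "NNPS", "FM"}


def filter_payloads(tokens: Iterable[Token], payloads: Set[str],
                    keep: bool = True) -> List[Token]:
    """Hypothetical stand-in for the payload filter.

    keep=True  -> keep only tokens whose tag is in `payloads`
    keep=False -> drop tokens whose tag is in `payloads` (assumed behaviour)
    """
    return [(term, tag) for term, tag in tokens if (tag in payloads) == keep]


def strip_payloads(tokens: Iterable[Token]) -> List[str]:
    """Hypothetical stand-in for the strip step: discard tags, keep terms."""
    return [term for term, _ in tokens]


# Invented sample input: terms already tagged by some POS tagger.
tagged = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("dogs", "NNS")]

nouns = strip_payloads(filter_payloads(tagged, KEEP_TAGS, keep=True))
print(nouns)  # ['fox', 'dogs']
```

With types instead of payloads, the same whitelist would apply to the token type and the strip step would not be needed.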