ShingleFilter include words
---------------------------
Key: LUCENE-1917
URL: https://issues.apache.org/jira/browse/LUCENE-1917
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/analyzers
Affects Versions: 2.9
Reporter: Jason Rutherglen
Priority: Minor
Fix For: 3.0
By default, ShingleFilter creates shingles (i.e. combines adjacent
tokens into a single token) from all tokens. To index, for
example, only stop words as shingles rather than creating
shingles out of every word, we can supply an include-words
CharArraySet to ShingleFilter that determines which tokens to
shingle.
This is similar to Nutch CommonGrams and SOLR-908. SOLR-908
does not use the new token attribute API, and I think this
functionality is better suited to being part of Lucene.
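
To make the proposal concrete, here is a rough plain-Java sketch of
the include-words idea. It does not use the Lucene API: the class
and method names (IncludeWordShingles, shingle) are hypothetical, a
java.util.Set stands in for CharArraySet, and the rule that a pair
is shingled when either token is in the set is an assumption
borrowed from the CommonGrams approach, not something this issue
specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; this is NOT the ShingleFilter API.
// All names here are hypothetical.
class IncludeWordShingles {

    // Emits every token unchanged, plus a two-token shingle for each
    // adjacent pair in which at least one token is in includeWords.
    // The "either token" rule is an assumption (CommonGrams-style).
    static List<String> shingle(List<String> tokens, Set<String> includeWords) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            out.add(current);
            if (i + 1 < tokens.size()) {
                String next = tokens.get(i + 1);
                if (includeWords.contains(current) || includeWords.contains(next)) {
                    out.add(current + " " + next);
                }
            }
        }
        return out;
    }
}
```

With the tokens "the quick fox of doom" and include words {the, of},
this yields: the, "the quick", quick, fox, "fox of", of, "of doom",
doom. No shingle is created for "quick fox" because neither word is
in the set.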