KeepWordFilter can be slow at query time if wordlist is large
-------------------------------------------------------------
Key: SOLR-1850
URL: https://issues.apache.org/jira/browse/SOLR-1850
Project: Solr
Issue Type: Improvement
Components: Schema and Analysis
Affects Versions: 1.4
Reporter: John Wang
When "Set<String> words" is large, constructing a KeepWordFilter at query time
is very costly, because the constructor copies the whole set:
    this.words = new CharArraySet(words, ignoreCase);
This call does an addAll on the new set. It runs for every query and repeats
exactly the same work each time.
Suggestion: overload the constructor and expose the CharArraySet, e.g.:
    public KeepWordFilter(TokenStream in, CharArraySet words) {
      super(in);
      this.words = words;
      this.termAtt = (TermAttribute) addAttribute(TermAttribute.class);
    }
This allows the CharArraySet to be constructed once, statically, for the
application instead of at query time.
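To illustrate the difference between the two construction paths, here is a
minimal, self-contained sketch. It does not use Lucene: KeepWordSketch,
SharedWordSetDemo, KEEP_WORDS and the "copies" counter are hypothetical names
made up for this example, and a plain java.util.HashSet stands in for
CharArraySet. It only shows the pattern of building the word set once and
sharing it, assuming the shared set is never mutated afterwards.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a keep-word filter; NOT Lucene's KeepWordFilter.
final class KeepWordSketch {
    // Counts how often the word list was copied (illustration only).
    static int copies = 0;

    private final Set<String> words;

    private KeepWordSketch(Set<String> words) {
        this.words = words;
    }

    // Path described in the issue: every construction copies the whole set.
    static KeepWordSketch copying(Set<String> rawWords) {
        copies++;
        return new KeepWordSketch(new HashSet<>(rawWords));
    }

    // Proposed path: keep a reference to a set that was built elsewhere.
    static KeepWordSketch sharing(Set<String> prebuilt) {
        return new KeepWordSketch(prebuilt);
    }

    boolean keep(String token) {
        return words.contains(token);
    }
}

final class SharedWordSetDemo {
    // Built once for the whole application and never modified afterwards.
    static final Set<String> KEEP_WORDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("solr", "lucene", "filter")));

    public static void main(String[] args) {
        // Simulate many queries; each one gets a filter, but no set is copied.
        for (int query = 0; query < 1000; query++) {
            KeepWordSketch filter = KeepWordSketch.sharing(KEEP_WORDS);
            if (!filter.keep("solr")) {
                throw new IllegalStateException("expected 'solr' to be kept");
            }
        }
        System.out.println("copies made: " + KeepWordSketch.copies);
    }
}
```

With the sharing path, the per-query cost no longer depends on the size of the
word list. The trade-off is that every filter instance sees the same set, so
it has to be treated as read-only once it is published.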