Profimedia created SOLR-13861:
---------------------------------

             Summary: SynonymGraphFilterFactory - with pattern tokenizer - not able to start
                 Key: SOLR-13861
                 URL: https://issues.apache.org/jira/browse/SOLR-13861
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: search
    Affects Versions: 7.7.2
            Reporter: Profimedia


Hi,

We are facing a problem with the definition of SynonymGraphFilterFactory when 
we use SimplePatternTokenizerFactory. It seems that Solr loses the 
tokenizerFactory.pattern attribute while processing the schema.

 
{code:xml}
<fieldType name="text_synonym" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.SimplePatternTokenizerFactory" pattern="[^,]+"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.SimplePatternTokenizerFactory" pattern="[^,]+"/>
    <filter class="solr.SynonymGraphFilterFactory"
            synonyms="synonyms.txt"
            expand="false"
            tokenizerFactory="solr.SimplePatternTokenizerFactory"
            tokenizerFactory.pattern="[^,]+"/>
  </analyzer>
</fieldType>
{code}
We get an exception like this:
{code:java}
Caused by: java.lang.IllegalArgumentException: Configuration Error: missing parameter 'pattern'
        at org.apache.lucene.analysis.util.AbstractAnalysisFactory.require(AbstractAnalysisFactory.java:97)
        at org.apache.lucene.analysis.pattern.SimplePatternTokenizerFactory.<init>(SimplePatternTokenizerFactory.java:68)
        ... 58 more
{code}
We debugged this issue and found that the problem is in this method, which is 
called more than once:
{code:java}
// (there are no tests for this functionality)
private TokenizerFactory loadTokenizerFactory(ResourceLoader loader, String cname) throws IOException {
  Class<? extends TokenizerFactory> clazz = loader.findClass(cname, TokenizerFactory.class);
  try {
    TokenizerFactory tokFactory = clazz.getConstructor(Map.class).newInstance(tokArgs);
    if (tokFactory instanceof ResourceLoaderAware) {
      ((ResourceLoaderAware) tokFactory).inform(loader);
    }
    return tokFactory;
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
}
{code}
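On the first call, the tokArgs map is emptied, apparently because the factory 
constructor removes the arguments it consumes. On the second call, Solr 
therefore reports that the 'pattern' parameter is missing.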

We applied a workaround like this:
{code:java}
TokenizerFactory tokFactory = clazz.getConstructor(Map.class).newInstance(new HashMap<>(tokArgs));
{code}
This creates a new map from tokArgs for each call, so only the copy can be 
cleared and the original stays intact. However, I think there may be a better 
solution for this issue than copying the tokArgs map.

With this change, the filter shown above runs without problems.
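
For illustration, the behaviour can be reproduced outside Solr with a minimal 
sketch. PatternFactoryStub, loadShared and loadCopied are hypothetical names 
used only for this example; the stub merely assumes that the real factory 
removes each argument it reads from the map it is given, which is what the 
stack trace and our debugging suggest.

```java
import java.util.HashMap;
import java.util.Map;

public class TokArgsDemo {

  /** Hypothetical stand-in for a tokenizer factory; this is not the Lucene class. */
  static class PatternFactoryStub {
    final String pattern;

    PatternFactoryStub(Map<String, String> args) {
      // Assumption: the factory removes each argument it consumes from the caller's map.
      String value = args.remove("pattern");
      if (value == null) {
        throw new IllegalArgumentException("Configuration Error: missing parameter 'pattern'");
      }
      this.pattern = value;
    }
  }

  /** Passes the shared map directly, so the first call empties it. */
  static PatternFactoryStub loadShared(Map<String, String> tokArgs) {
    return new PatternFactoryStub(tokArgs);
  }

  /** Passes a fresh copy, so the shared map survives repeated calls. */
  static PatternFactoryStub loadCopied(Map<String, String> tokArgs) {
    return new PatternFactoryStub(new HashMap<>(tokArgs));
  }

  public static void main(String[] args) {
    Map<String, String> tokArgs = new HashMap<>();
    tokArgs.put("pattern", "[^,]+");

    // With the defensive copy, repeated calls succeed and tokArgs is untouched.
    loadCopied(tokArgs);
    loadCopied(tokArgs);

    // Without the copy, the first call removes "pattern" from tokArgs ...
    loadShared(tokArgs);
    try {
      // ... so the second call no longer finds it.
      loadShared(tokArgs);
    } catch (IllegalArgumentException e) {
      System.out.println("second call failed: " + e.getMessage());
    }
  }
}
```

The copy costs one small map allocation per call. Whether a different fix fits 
better in the real loadTokenizerFactory is a separate question that this 
sketch does not try to answer.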



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
