[ https://issues.apache.org/jira/browse/SOLR-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Trey Grainger updated SOLR-14434:
---------------------------------
    Affects Version/s:     (was: 8.6)

> Add documentation for adding multiterm analyzers in Schema API
> --------------------------------------------------------------
>
>                 Key: SOLR-14434
>                 URL: https://issues.apache.org/jira/browse/SOLR-14434
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Trey Grainger
>            Priority: Major
>
> Originally this was filed as a bug report, but upon further inspection I
> realized the usage was simply undocumented, and the behavior was the result
> of an inconsistent property name (casing) between the XML and JSON. Changing
> this to a Jira to add documentation so others don't run into this issue in
> the future.
> We also need to document that the "analysis/field" API ignores {{multiterm}}
> analysis and thus doesn't reflect the full nature of incoming queries. This
> has been an annoying quirk for years and I think it would be worth fixing,
> but for now we should at least document it.
> --------------
> In addition to "{{index}}" and "{{query}}" analyzers, Solr supports adding an
> explicit "{{multiterm}}" analyzer to schema {{fieldType}} definitions. This
> allows for specific control over analysis for things like wildcard terms,
> prefix queries, range queries, etc. For example, the following would cause
> the wildcard query for "{{hats*}}" to get stemmed to "{{hat*}}" instead of
> staying as "{{hats*}}", and thus match on the indexed version of "{{hat}}".
> {code:xml}
> <fieldType class="solr.TextField" multiValued="true" name="multiterm_test"
>            positionIncrementGap="100" termOffsets="true" termVectors="true">
>   <analyzer type="index">
>     <tokenizer class="solr.ClassicTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishMinimalStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.ClassicTokenizerFactory"/>
>     <filter class="solr.SynonymGraphFilterFactory" expand="true"
>             ignoreCase="true" synonyms="synonyms.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishMinimalStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="multiterm">
>     <tokenizer class="solr.ClassicTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishMinimalStemFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}
> In the XML version this analyzer is called "{{multiterm}}", whereas it's
> "{{multiTerm}}" in the JSON API. This isn't in the documentation anywhere, and
> it cost me a bunch of time debugging through the code until I finally found
> what was going on. Using this ticket to add better documentation around usage
> and gotchas for this feature.
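
To make the casing difference concrete, here is a minimal sketch of what the equivalent Schema API request body might look like for the field type above. The ticket only says the JSON casing is "multiTerm"; the exact key `multiTermAnalyzer` (by analogy with the Schema API's `indexAnalyzer` and `queryAnalyzer`) and the helper name `build_add_field_type` are assumptions, so verify the key against your Solr version before relying on it.

```python
import json

TOKENIZER = "solr.ClassicTokenizerFactory"

# Lowercase + minimal English stemming, shared by all three analyzers.
STEM_CHAIN = [
    {"class": "solr.LowerCaseFilterFactory"},
    {"class": "solr.EnglishMinimalStemFilterFactory"},
]

# Synonym expansion is applied only at query time, as in the XML example.
SYNONYMS = {
    "class": "solr.SynonymGraphFilterFactory",
    "expand": "true",
    "ignoreCase": "true",
    "synonyms": "synonyms.txt",
}


def analyzer(filters):
    """Build one analyzer definition in the Schema API's JSON shape."""
    return {"tokenizer": {"class": TOKENIZER}, "filters": list(filters)}


def build_add_field_type(multiterm_key="multiTermAnalyzer"):
    """Return an add-field-type command for the multiterm_test field type.

    multiterm_key is an ASSUMPTION: the ticket only states that the JSON
    casing is "multiTerm" (camelCase), unlike type="multiterm" in XML.
    """
    return {
        "add-field-type": {
            "name": "multiterm_test",
            "class": "solr.TextField",
            "multiValued": True,
            "positionIncrementGap": "100",
            "termOffsets": True,
            "termVectors": True,
            "indexAnalyzer": analyzer(STEM_CHAIN),
            "queryAnalyzer": analyzer([SYNONYMS] + STEM_CHAIN),
            multiterm_key: analyzer(STEM_CHAIN),
        }
    }


if __name__ == "__main__":
    # Print the body; it would be POSTed as JSON to the collection's /schema endpoint.
    print(json.dumps(build_add_field_type(), indent=2))
```

The sketch only builds and prints the payload and sends nothing. The key is passed as a parameter so the casing can be adjusted in one place if it differs in a given release.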