Chris M. Hostetter created SOLR-16930:
-----------------------------------------
             Summary: schema short class name support can use factories w/ different names than the specified name
                 Key: SOLR-16930
                 URL: https://issues.apache.org/jira/browse/SOLR-16930
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Chris M. Hostetter

I recently encountered a schema "in the wild" that had a fieldType that looked roughly like this:

{noformat}
<fieldType autoGeneratePhraseQueries="true" class="solr.TextField" name="edgengram" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramTokenizerFactory" maxGramSize="25" minGramSize="4"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
{noformat}

I was going to explain to the user that this wouldn't work, because they were trying to configure {{solr.EdgeNGramTokenizerFactory}} as a token _filter_, but it's a _tokenizer_, and that they needed to use {{solr.EdgeNGramFilterFactory}} instead.

But then I realized their schema loaded just fine, and did exactly what they expected. Which made no sense to me.

Experimentation using the {{/analysis/field}} request handler confirmed that, somehow, they were getting an {{org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter}} instance.

----

I have not dug into this code, but I _suspect_ what's happening is that the logic for resolving {{solr.FooClassName}} "short" classnames is finding the class with the name {{FooClassName}} and then checking what its SPI name is _without checking if it implements the expected API_, and then using that SPI name to actually create an instance of the factory.
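To make the suspected lookup path concrete, here is a small toy model in Python. It is not Solr's or Lucene's actual code: the registries, function names, and the "checked" variant are all hypothetical, and the only facts taken from this report are the two factory class names and their shared SPI name {{edgeNGram}}.

```python
# Toy model of the *suspected* resolution logic; nothing here is real Solr code.

# One hypothetical SPI registry per analysis API: SPI name -> factory class name.
TOKENIZER_FACTORIES = {"edgeNGram": "EdgeNGramTokenizerFactory"}
TOKEN_FILTER_FACTORIES = {"edgeNGram": "EdgeNGramFilterFactory"}
REGISTRIES = {"tokenizer": TOKENIZER_FACTORIES, "filter": TOKEN_FILTER_FACTORIES}


def simple_name_of(short_name):
    """Strip the 'solr.' prefix from a short classname."""
    return short_name.split(".", 1)[-1]


def spi_name_of(simple_name):
    """Find a class by simple name in ANY registry and return its SPI name."""
    for registry in REGISTRIES.values():
        for spi_name, class_name in registry.items():
            if class_name == simple_name:
                return spi_name
    raise LookupError(f"no factory named {simple_name}")


def resolve_suspected(short_name, expected_api):
    """Suspected behaviour: class -> SPI name -> lookup in the expected API,
    with no check that the original class belongs to that API."""
    spi_name = spi_name_of(simple_name_of(short_name))
    return REGISTRIES[expected_api][spi_name]


def resolve_checked(short_name, expected_api):
    """Hypothetical stricter variant: reject classes from the wrong API."""
    simple_name = simple_name_of(short_name)
    if simple_name not in REGISTRIES[expected_api].values():
        raise TypeError(f"{short_name} is not a {expected_api} factory")
    return simple_name


if __name__ == "__main__":
    # A tokenizer factory configured as a <filter/> silently becomes a filter.
    print(resolve_suspected("solr.EdgeNGramTokenizerFactory", "filter"))
```

Under this model, asking for {{solr.EdgeNGramTokenizerFactory}} in a filter context yields {{EdgeNGramFilterFactory}}, which matches the observed behaviour, while the stricter variant would reject the configuration outright.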
So the resolution of {{solr.EdgeNGramTokenizerFactory}} finds {{org.apache.lucene.analysis.ngram.EdgeNGramTokenizerFactory}}, which has an SPI name of {{edgeNGram}}, which, when resolved _in the context of looking for a TokenFilterFactory_, returns {{org.apache.lucene.analysis.ngram.EdgeNGramFilterFactory}}, because both classes have the *SAME* SPI name (but for different APIs).

----

I know we've moved away from suggesting the {{solr.FooClassName}} short classname syntax (and will probably remove it completely at some point) in favor of using the SPI registration names, so maybe this isn't worth worrying about. But it sure confused the hell out of me, and will likely confuse the hell out of someone else at some point as well (hence I'm creating a jira in case it helps anyone else confused about this).