Re: Using stored value of a field to build suggester index
Thanks Erick, This makes things clearer. Thanks, Faisal On Sun, Nov 23, 2014 at 2:17 PM, Erick Erickson erickerick...@gmail.com wrote: You can't build the suggester from the stored values, it's constructed from indexed terms only. You probably want to create a copyField to a less-analyzed (indexed) field and suggest from _that_. You'll probably want to do things like remove punctuation, perhaps lowercase and the like but not stem etc. Best, Erick On Sun, Nov 23, 2014 at 12:25 PM, Faisal Mansoor faisal.mans...@gmail.com wrote: Hi, I am trying to build a suggester for a field which is both index and stored. The field is whitespace tokenized, lowercased, stemmed etc while indexing. It looks like that the indexed terms are used as a source for building the suggester index. Which is what the following line in the suggester documentation also mentions. https://wiki.apache.org/solr/Suggester - field - if sourceLocation is empty then terms from this field in the index will be used when building the trie. I want to display the suggested value in UI, is it possible to use the stored value of the field rather than the indexed terms to build the index. Here are the relevant definitions from solrconfig.xml and schema.xml. Thanks. Faisal solrconfig.xml searchComponent class=solr.SpellCheckComponent name=infix_suggest_analyzing lst name=spellchecker str name=nameinfix_suggest_analyzing/str str name=classnameorg.apache.solr.spelling.suggest.Suggester/str str name=lookupImplorg.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory/str str name=buildOnCommitfalse/str !-- Suggester properties -- str name=suggestAnalyzerFieldTypeautosuggest_fieldType/str str name=dictionaryImplorg.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory/str str name=fieldDisplayName/str /lst !-- specify a fieldtype using keywordtokenizer + lowercase + cleanup -- str name=queryAnalyzerFieldTypephrase_suggest/str /searchComponent requestHandler name=/suggest class=org.apache.solr.handler.component.SearchHandler lst name=defaults str name=echoParamsexplicit/str str name=spellchecktrue/str str name=spellcheck.dictionaryinfix_suggest_analyzing/str str name=spellcheck.onlyMorePopulartrue/str str name=spellcheck.count200/str str name=spellcheck.collatetrue/str str name=spellcheck.maxCollations10/str /lst arr name=components strinfix_suggest_analyzing/str /arr /requestHandler schema.xml fieldType name=autosuggest_fieldType class=solr.TextField positionIncrementGap=100 analyzer tokenizer class=solr.StandardTokenizerFactory/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.ASCIIFoldingFilterFactory/ /analyzer /fieldType fieldtype name=phrase_suggest class=solr.TextField analyzer tokenizer class=solr.KeywordTokenizerFactory/ filter class=solr.PatternReplaceFilterFactory pattern=([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.TrimFilterFactory/ /analyzer /fieldtype fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.KeywordMarkerFilterFactory protected=protwords.txt/ filter class=solr.PorterStemFilterFactory/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=0 catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.KeywordMarkerFilterFactory protected=protwords.txt/ filter class=solr.PorterStemFilterFactory/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType field name=DisplayName type=text indexed=true stored=true required=true multiValued=false /
Using stored value of a field to build suggester index
Hi, I am trying to build a suggester for a field which is both index and stored. The field is whitespace tokenized, lowercased, stemmed etc while indexing. It looks like that the indexed terms are used as a source for building the suggester index. Which is what the following line in the suggester documentation also mentions. https://wiki.apache.org/solr/Suggester - field - if sourceLocation is empty then terms from this field in the index will be used when building the trie. I want to display the suggested value in UI, is it possible to use the stored value of the field rather than the indexed terms to build the index. Here are the relevant definitions from solrconfig.xml and schema.xml. Thanks. Faisal solrconfig.xml searchComponent class=solr.SpellCheckComponent name=infix_suggest_analyzing lst name=spellchecker str name=nameinfix_suggest_analyzing/str str name=classnameorg.apache.solr.spelling.suggest.Suggester/str str name=lookupImplorg.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory/str str name=buildOnCommitfalse/str !-- Suggester properties -- str name=suggestAnalyzerFieldTypeautosuggest_fieldType/str str name=dictionaryImplorg.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory/str str name=fieldDisplayName/str /lst !-- specify a fieldtype using keywordtokenizer + lowercase + cleanup -- str name=queryAnalyzerFieldTypephrase_suggest/str /searchComponent requestHandler name=/suggest class=org.apache.solr.handler.component.SearchHandler lst name=defaults str name=echoParamsexplicit/str str name=spellchecktrue/str str name=spellcheck.dictionaryinfix_suggest_analyzing/str str name=spellcheck.onlyMorePopulartrue/str str name=spellcheck.count200/str str name=spellcheck.collatetrue/str str name=spellcheck.maxCollations10/str /lst arr name=components strinfix_suggest_analyzing/str /arr /requestHandler schema.xml fieldType name=autosuggest_fieldType class=solr.TextField positionIncrementGap=100 analyzer tokenizer class=solr.StandardTokenizerFactory/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.ASCIIFoldingFilterFactory/ /analyzer /fieldType fieldtype name=phrase_suggest class=solr.TextField analyzer tokenizer class=solr.KeywordTokenizerFactory/ filter class=solr.PatternReplaceFilterFactory pattern=([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.TrimFilterFactory/ /analyzer /fieldtype fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.KeywordMarkerFilterFactory protected=protwords.txt/ filter class=solr.PorterStemFilterFactory/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=0 catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.KeywordMarkerFilterFactory protected=protwords.txt/ filter class=solr.PorterStemFilterFactory/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType field name=DisplayName type=text indexed=true stored=true required=true multiValued=false /
Re: Using stored value of a field to build suggester index
You can't build the suggester from the stored values, it's constructed from indexed terms only. You probably want to create a copyField to a less-analyzed (indexed) field and suggest from _that_. You'll probably want to do things like remove punctuation, perhaps lowercase and the like but not stem etc. Best, Erick On Sun, Nov 23, 2014 at 12:25 PM, Faisal Mansoor faisal.mans...@gmail.com wrote: Hi, I am trying to build a suggester for a field which is both index and stored. The field is whitespace tokenized, lowercased, stemmed etc while indexing. It looks like that the indexed terms are used as a source for building the suggester index. Which is what the following line in the suggester documentation also mentions. https://wiki.apache.org/solr/Suggester - field - if sourceLocation is empty then terms from this field in the index will be used when building the trie. I want to display the suggested value in UI, is it possible to use the stored value of the field rather than the indexed terms to build the index. Here are the relevant definitions from solrconfig.xml and schema.xml. Thanks. Faisal solrconfig.xml searchComponent class=solr.SpellCheckComponent name=infix_suggest_analyzing lst name=spellchecker str name=nameinfix_suggest_analyzing/str str name=classnameorg.apache.solr.spelling.suggest.Suggester/str str name=lookupImplorg.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory/str str name=buildOnCommitfalse/str !-- Suggester properties -- str name=suggestAnalyzerFieldTypeautosuggest_fieldType/str str name=dictionaryImplorg.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory/str str name=fieldDisplayName/str /lst !-- specify a fieldtype using keywordtokenizer + lowercase + cleanup -- str name=queryAnalyzerFieldTypephrase_suggest/str /searchComponent requestHandler name=/suggest class=org.apache.solr.handler.component.SearchHandler lst name=defaults str name=echoParamsexplicit/str str name=spellchecktrue/str str name=spellcheck.dictionaryinfix_suggest_analyzing/str str name=spellcheck.onlyMorePopulartrue/str str name=spellcheck.count200/str str name=spellcheck.collatetrue/str str name=spellcheck.maxCollations10/str /lst arr name=components strinfix_suggest_analyzing/str /arr /requestHandler schema.xml fieldType name=autosuggest_fieldType class=solr.TextField positionIncrementGap=100 analyzer tokenizer class=solr.StandardTokenizerFactory/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.ASCIIFoldingFilterFactory/ /analyzer /fieldType fieldtype name=phrase_suggest class=solr.TextField analyzer tokenizer class=solr.KeywordTokenizerFactory/ filter class=solr.PatternReplaceFilterFactory pattern=([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.TrimFilterFactory/ /analyzer /fieldtype fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.KeywordMarkerFilterFactory protected=protwords.txt/ filter class=solr.PorterStemFilterFactory/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=0 catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.KeywordMarkerFilterFactory protected=protwords.txt/ filter class=solr.PorterStemFilterFactory/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType field name=DisplayName type=text indexed=true stored=true required=true multiValued=false /