Re: Using stored value of a field to build suggester index

2014-11-24 Thread Faisal Mansoor
Thanks Erick, this makes things clearer.

Thanks,
Faisal




Using stored value of a field to build suggester index

2014-11-23 Thread Faisal Mansoor
Hi,

I am trying to build a suggester for a field which is both indexed and
stored. The field is whitespace tokenized, lowercased, stemmed, etc. while
indexing.

It looks like the indexed terms are used as the source for building the
suggester index, which is what the following line in the suggester
documentation also says.

https://wiki.apache.org/solr/Suggester


   - field - if sourceLocation is empty then terms from this field in the
     index will be used when building the trie.

I want to display the suggested value in the UI. Is it possible to use the
stored value of the field, rather than the indexed terms, to build the
suggester index?

Here are the relevant definitions from solrconfig.xml and schema.xml.

Thanks.
Faisal

solrconfig.xml

  <searchComponent class="solr.SpellCheckComponent"
                   name="infix_suggest_analyzing">
    <lst name="spellchecker">
      <str name="name">infix_suggest_analyzing</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory</str>
      <str name="buildOnCommit">false</str>

      <!-- Suggester properties -->
      <str name="suggestAnalyzerFieldType">autosuggest_fieldType</str>
      <str name="dictionaryImpl">org.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory</str>
      <str name="field">DisplayName</str>
    </lst>

    <!-- specify a fieldtype using keywordtokenizer + lowercase + cleanup -->
    <str name="queryAnalyzerFieldType">phrase_suggest</str>
  </searchComponent>


  <requestHandler name="/suggest"
                  class="org.apache.solr.handler.component.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <str name="spellcheck">true</str>
      <str name="spellcheck.dictionary">infix_suggest_analyzing</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.count">200</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.maxCollations">10</str>
    </lst>
    <arr name="components">
      <str>infix_suggest_analyzing</str>
    </arr>
  </requestHandler>


schema.xml


  <fieldType name="autosuggest_fieldType" class="solr.TextField"
             positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
    </analyzer>
  </fieldType>

  <fieldtype name="phrase_suggest" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.PatternReplaceFilterFactory"
              pattern="([^\p{L}\p{M}\p{N}\p{Cs}]*[\p{L}\p{M}\p{N}\p{Cs}\_]+:)|([^\p{L}\p{M}\p{N}\p{Cs}])+"
              replacement=" " replace="all"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
  </fieldtype>


  <fieldType name="text" class="solr.TextField"
             positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" generateNumberParts="1" catenateWords="1"
              catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory"
              protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt"/>
      <filter class="solr.WordDelimiterFilterFactory"
              generateWordParts="1" generateNumberParts="1" catenateWords="0"
              catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory"
              protected="protwords.txt"/>
      <filter class="solr.PorterStemFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>


  <field name="DisplayName" type="text" indexed="true" stored="true"
         required="true" multiValued="false" />


Re: Using stored value of a field to build suggester index

2014-11-23 Thread Erick Erickson
You can't build the suggester from the stored
values; it's constructed from indexed terms only.

You probably want to create a copyField to
a less-analyzed (indexed) field and suggest
from _that_. You'll probably want to do things
like remove punctuation, perhaps lowercase,
and the like, but not stem, etc.
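
A rough sketch of what that could look like, using the DisplayName field
from the original post. The names text_suggest and DisplayName_suggest are
made up for illustration and do not appear anywhere in the thread, and the
analyzer chain is only one plausible choice:

```xml
<!-- schema.xml: a lightly analyzed type (hypothetical name "text_suggest") -->
<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- StandardTokenizer splits on word boundaries and discards most punctuation -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase only; no stemming, so the indexed terms stay readable -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- schema.xml: indexed-only target field (hypothetical name), filled by copyField -->
<field name="DisplayName_suggest" type="text_suggest" indexed="true" stored="false"/>
<copyField source="DisplayName" dest="DisplayName_suggest"/>

<!-- solrconfig.xml: inside the existing spellchecker <lst>, build from the new field -->
<str name="field">DisplayName_suggest</str>
```

copyField copies the raw input value, so the new field gets its own
analysis rather than the stemmed terms of DisplayName. Documents need to
be reindexed before the new field has any terms. With a tokenizing
analyzer the indexed terms are individual words; if whole values are
wanted as suggestions, a KeywordTokenizerFactory-based type such as
phrase_suggest from the original post may be closer to the goal.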

Best,
Erick
