Re: DirectSpellChecker not returning expected suggestions.
If you have access to the Solr admin screen, you can see how the field was analyzed through the analysis page. You have to hover over the little abbreviations to see the class at each step of the analysis chain. Likewise, the admin screen should give you access to the raw schema.xml file, which _also_ has the analysis chain definition.

And if you really don't have access to either of those, and you can't find out how the fields were analyzed, then there's not much that you can do...

Best,
Erick

On Mon, Jun 2, 2014 at 12:25 PM, S.L simpleliving...@gmail.com wrote:

James, I get no results back and no suggestions for wrangle; however, I do get suggestions for wranglr, and wrangle is not present in my index. I am just searching for wrangle in a field that is created by copying other fields. As to how it is analyzed, I don't have access to that right now. Thanks.

On Mon, Jun 2, 2014 at 2:48 PM, Dyer, James james.d...@ingramcontent.com wrote:

If wrangle is not in your index, and if wrangler is within the max # of edits, then it should suggest wrangler. Are you getting anything back from spellcheck at all? What is the exact query you are using? How is the spellcheck field analyzed? If you're using stemming, then wrangle and wrangler might be stemmed to the same word. (By the way, you shouldn't spellcheck against a stemmed or otherwise heavily-analyzed field.)

James Dyer
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: S.L [mailto:simpleliving...@gmail.com]
Sent: Monday, June 02, 2014 1:06 PM
To: solr-user@lucene.apache.org
Subject: Re: DirectSpellChecker not returning expected suggestions.

OK, I just realized that wrangle is a proper English word; that's probably why I don't get a suggestion for wrangler in this case. However, in my test index there is no wrangle present, so even though it is a proper English word, since there is no occurrence of it in the index, shouldn't Solr suggest wrangler?

On Mon, Jun 2, 2014 at 2:00 PM, S.L simpleliving...@gmail.com wrote:

I do not get any suggestion when I search for wrangle; however, I correctly get the suggestion wrangler when I search for wranglr. I am using the Direct and WordBreak spellcheckers in combination and have not tried anything else. Is Solr's distance calculation different from the Levenshtein distance calculation? I have set maxEdits to 1, assuming that this corresponds to the maximum edit distance. Thanks for your help!

On Mon, Jun 2, 2014 at 1:54 PM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote:

What do you get then? Suggestions, but not the one you’re looking for, or is it deemed correctly spelled? Have you tried another spellChecker impl, for troubleshooting purposes?

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley

On Sat, May 31, 2014 at 12:33 AM, S.L simpleliving...@gmail.com wrote:

Hi All, I have a small test index of 400 documents that happens to have an entry for wrangler. When I search for wranglr, I correctly get the collation suggestion wrangler; however, when I search for wrangle, I do not get a suggestion for wrangler. The Levenshtein distance between wrangle and wrangler is the same as the Levenshtein distance between wranglr and wrangler, so I am wondering why I do not get a suggestion for wrangle. Below is my Direct spell checker configuration.

<lst name="spellchecker">
  <str name="name">direct</str>
  <str name="field">suggestAggregate</str>
  <str name="classname">solr.DirectSolrSpellChecker</str>
  <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
  <str name="distanceMeasure">internal</str>
  <str name="comparatorClass">score</str>
  <!-- minimum accuracy needed to be considered a valid spellcheck suggestion -->
  <float name="accuracy">0.7</float>
  <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 -->
  <int name="maxEdits">1</int>
  <!-- the minimum shared prefix when enumerating terms -->
  <int name="minPrefix">3</int>
  <!-- maximum number of inspections per result. -->
  <int name="maxInspections">5</int>
  <!-- minimum length of a query term to be considered for correction -->
  <int name="minQueryLength">4</int>
  <!-- maximum threshold of documents a query term can appear to be considered for correction -->
  <float name="maxQueryFrequency">0.01</float>
  <!-- uncomment this to require suggestions to occur in 1% of the documents -->
  <!-- <float name="thresholdTokenFrequency">.01</float> -->
</lst>
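For anyone following Erick's advice to read the analysis chain out of schema.xml, here is a rough sketch of what a stemmed definition might look like. The field type name `text_en_stemmed` and the exact filter chain are assumptions for illustration only; the poster's actual schema is not shown anywhere in this thread. Only the field name `suggestAggregate` comes from the posted configuration.

```xml
<!-- Hypothetical schema.xml excerpt: the type name and this chain are assumptions,
     not the poster's real schema. -->
<fieldType name="text_en_stemmed" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- A stemming filter like this one is the thing to look for. -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The spellchecker above reads its terms from this field. -->
<field name="suggestAggregate" type="text_en_stemmed" indexed="true" stored="false" multiValued="true"/>
```

If the type behind the spellcheck field contains a stemming filter, the terms the spellchecker compares against are stems rather than the words as typed, which would be consistent with the behavior discussed in this thread.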
Re: DirectSpellChecker not returning expected suggestions.
Anyone?
Re: DirectSpellChecker not returning expected suggestions.
What do you get then? Suggestions, but not the one you’re looking for, or is it deemed correctly spelled? Have you tried another spellChecker impl, for troubleshooting purposes?

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley
Re: DirectSpellChecker not returning expected suggestions.
I do not get any suggestion when I search for wrangle; however, I correctly get the suggestion wrangler when I search for wranglr. I am using the Direct and WordBreak spellcheckers in combination and have not tried anything else. Is Solr's distance calculation different from the Levenshtein distance calculation? I have set maxEdits to 1, assuming that this corresponds to the maximum edit distance. Thanks for your help!
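Since the exact request setup was never posted, here is a minimal sketch of how the Direct and WordBreak spellcheckers are commonly combined through request handler defaults in solrconfig.xml. The dictionary name `wordbreak` and the handler name `/select` are assumptions (only `direct` appears in the posted configuration), so treat this as an illustration rather than the poster's real setup.

```xml
<!-- Hypothetical solrconfig.xml excerpt; the "wordbreak" dictionary name and the
     handler name are assumptions for illustration. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">on</str>
    <!-- Naming more than one dictionary asks Solr to combine their suggestions. -->
    <str name="spellcheck.dictionary">direct</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.extendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

With `spellcheck.extendedResults` enabled, the response also reports term frequencies and whether Solr considered the query correctly spelled, which helps distinguish "no suggestion found" from "term deemed correct".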
Re: DirectSpellChecker not returning expected suggestions.
OK, I just realized that wrangle is a proper English word; that's probably why I don't get a suggestion for wrangler in this case. However, in my test index there is no wrangle present, so even though it is a proper English word, since there is no occurrence of it in the index, shouldn't Solr suggest wrangler?
Re: DirectSpellChecker not returning expected suggestions.
It appears to be stemmed.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley
Re: DirectSpellChecker not returning expected suggestions.
Thanks. You mean wrangler has been stemmed to wrangle? If that's the case, then why does it not return any results for wrangle?
RE: DirectSpellChecker not returning expected suggestions.
If wrangle is not in your index, and if wrangler is within the max # of edits, then it should suggest wrangler. Are you getting anything back from spellcheck at all? What is the exact query you are using? How is the spellcheck field analyzed? If you're using stemming, then wrangle and wrangler might be stemmed to the same word. (By the way, you shouldn't spellcheck against a stemmed or otherwise heavily-analyzed field.)

James Dyer
Ingram Content Group
(615) 213-4311
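James's advice about not spellchecking against a stemmed field is usually followed by adding a separate, lightly analyzed field for the spellchecker. The sketch below shows one way that might look in schema.xml. The names `text_spell`, `spell`, `title`, and `description` are all hypothetical; the thread does not say which source fields feed `suggestAggregate`.

```xml
<!-- Hypothetical schema.xml excerpt: every name here is an assumption for illustration. -->
<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Tokenize and lowercase only; no stemming, so indexed terms stay real words. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="spell" type="text_spell" indexed="true" stored="false" multiValued="true"/>

<!-- Copy the same source fields that feed the search field. -->
<copyField source="title" dest="spell"/>
<copyField source="description" dest="spell"/>
```

The spellchecker's `field` setting in solrconfig.xml would then point at `spell` instead of `suggestAggregate`, and the documents would need to be reindexed before the new field holds any terms.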
Re: DirectSpellChecker not returning expected suggestions.
James, I get no results back and no suggestions for wrangle; however, I do get suggestions for wranglr, and wrangle is not present in my index. I am just searching for wrangle in a field that is created by copying other fields. As to how it is analyzed, I don't have access to that right now. Thanks.