Hi James,

A couple more follow-up questions:

1. Do changes to the response format have to be backwards compatible at
this point? It seems that always returning the origFreq, even when there
are no suggestions, could break existing clients. Is that right?
2. For our purposes, we need to be able to order suggestions from multiple
Solr cores, so we were thinking of changing the format to also include the
score that is calculated for each suggestion (which isn't exposed right
now). Are these scores from different dictionary fields comparable
(assuming we use the default INTERNAL_LEVENSHTEIN_DISTANCE metric)? And do
you think this would be of general use, i.e. could it be contributed back
to Solr?
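
To make question 2 concrete, here is a rough sketch of the client-side
merge we have in mind. It assumes the scores turn out to be comparable
across cores and that a higher score means a closer match. The names
SuggestionMerger, ScoredSuggestion and merge are made up for illustration
and are not part of Solr; the score field is the value we would like the
response to expose.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not a Solr class: merges spellcheck suggestions
// collected from several cores into one ranked list.
final class SuggestionMerger {

    /** One suggestion plus the score we would like Solr to expose (assumed field). */
    static final class ScoredSuggestion {
        final String core;  // name of the core that produced the suggestion
        final String word;  // the suggested correction
        final float score;  // similarity score; assumed comparable across cores
        final int freq;     // frequency of the suggestion in that core

        ScoredSuggestion(String core, String word, float score, int freq) {
            this.core = core;
            this.word = word;
            this.score = score;
            this.freq = freq;
        }
    }

    /**
     * Flattens the per-core lists and returns at most 'limit' suggestions,
     * ordered by score (highest first), with frequency as the tie-breaker.
     * The same word coming from two cores is not de-duplicated here.
     */
    static List<ScoredSuggestion> merge(List<List<ScoredSuggestion>> perCore, int limit) {
        List<ScoredSuggestion> all = new ArrayList<>();
        for (List<ScoredSuggestion> suggestions : perCore) {
            all.addAll(suggestions);
        }
        all.sort(Comparator
                .comparingDouble((ScoredSuggestion s) -> s.score).reversed()
                .thenComparing(Comparator.comparingInt((ScoredSuggestion s) -> s.freq).reversed()));
        return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
    }
}
```

If the scores are not comparable between dictionary fields, this ordering
would not be meaningful and we would need some other way to rank across
cores.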

Thanks,
Nalini


On Fri, Dec 7, 2012 at 2:20 PM, Nalini Kartha <nalinikar...@gmail.com> wrote:

> Ah I see what you mean. Will probably try to change the response to look
> like the internal shard one then.
>
> Thanks for the detailed explanation!
>
> - Nalini
>
>
> On Fri, Dec 7, 2012 at 1:38 PM, Dyer, James
> <james.d...@ingramcontent.com> wrote:
>
>> The response from the shards is different from the final spellcheck
>> response in that it does include the term even if there are no suggestions
>> for it.  So to get the behavior you want, we'd probably just have to make
>> it so you could get the "shard-to-shard-internal" version.
>>
>> See
>> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java
>>
>> ...and method "toNamedList(...)"
>>
>> ...and this line:
>>
>> if (theSuggestions != null && (theSuggestions.size() > 0 ||
>> shardRequest)) {
>> ...
>> }
>>
>> ...the "shardRequest" boolean is passed as "true" here if it's the first
>> stage of a distributed request (from #process).  The various shards send
>> their responses to the main shard, which then integrates them (in
>> #finishStage).  Note that #finishStage always passes "shardRequest=false" to
>> #toNamedList so that the end user gets a "normal" response back, omitting
>> terms for which there are no suggestions.
>>
>> James Dyer
>> E-Commerce Systems
>> Ingram Content Group
>> (615) 213-4311
>>
>>
>> -----Original Message-----
>> From: Nalini Kartha [mailto:nalinikar...@gmail.com]
>> Sent: Friday, December 07, 2012 9:54 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Differentiate between correctly spelled term and mis-spelled
>> term with no corrections
>>
>> Hi James,
>>
>> Thanks for the response, will open a JIRA for this.
>>
>> Had one follow-up question - how does the Distributed SpellCheckComponent
>> handle this? I tried looking at the code but it's not obvious to me how it
>> is able to differentiate between these 2 cases. I see that it only
>> considers a term to be wrongly spelt if all shards return a suggestion for
>> it but isn't it possible that a suggestion is not returned because nothing
>> close enough could be found in some shard? Or is the response from shards
>> different than the final spellcheck response we get from Solr in some way?
>>
>> Thanks,
>> Nalini
>>
>>
>> On Fri, Dec 7, 2012 at 10:26 AM, Dyer, James
>> <james.d...@ingramcontent.com> wrote:
>>
>> > You might want to open a jira issue for this to request that the feature
>> > be added.  If you haven't used it before, you need to create an account.
>> >
>> > https://issues.apache.org/jira/browse/SOLR
>> >
>> > In the meantime, if you need to get the document frequency of the query
>> > terms, see http://wiki.apache.org/solr/TermsComponent , which might
>> > provide a viable workaround.
>> >
>> > James Dyer
>> > E-Commerce Systems
>> > Ingram Content Group
>> > (615) 213-4311
>> >
>> >
>> > -----Original Message-----
>> > From: Nalini Kartha [mailto:nalinikar...@gmail.com]
>> > Sent: Thursday, December 06, 2012 2:44 PM
>> > To: solr-user@lucene.apache.org
>> > Subject: Differentiate between correctly spelled term and mis-spelled
>> > term with no corrections
>> >
>> > Hi,
>> >
>> > When using the SolrSpellChecker, is there currently any way to
>> > differentiate between a term that exists in the dictionary and a
>> > mis-spelled term for which no corrections were found when looking at the
>> > spellcheck response?
>> >
>> > From reading the doc and trying out some simple test cases, it seems like
>> > there isn't: in both cases it looks like the response doesn't include
>> > the term.
>> >
>> > Could the extended results format be changed to include the original
>> > term frequency even if there are no suggestions? This would allow us to
>> > make this differentiation.
>> >
>> > Thanks,
>> > Nalini
>> >
>> >
>>
>>
>
