Shalin Shekhar Mangar wrote:
On Sun, Feb 15, 2009 at 8:56 AM, Mark Miller <markrmil...@gmail.com> wrote:
I think that's the problem with it. People do think of it this way, and it
ends up being very confusing.
If you don't use onlyMorePopular, and you ask for suggestions for a word
that happens to be in the index, you get the word back.
So if I ask for corrections to Lucene, and it's in the index, it suggests
Lucene. This is nice for multi term suggestions, because for "mrk lucene" it
might suggest "mark lucene".
Now say I want to toggle onlyMorePopular to add frequency into the mix - my
expectation is that, perhaps now I will get the suggestion "mork lucene" if
mork has a higher freq than mark.
But I will get maybe "mork luke" instead, because I am guaranteed not to
get Lucene as a suggestion if onlyMorePopular is on.
onlyMorePopular=true considers tokens with a frequency greater than or equal
to the frequency of the original token. So you may still get Lucene as a
suggestion.
Is that the only difference? When I look at the code (I'm new to this
area of the code, so I certainly could be wrong; it wouldn't be the first
time, or probably even the 100,000th), I see:
    // if the word exists in the real index and we don't care for word
    // frequency, return the word itself
    if (!morePopular && freq > 0) {
      return new String[] { word };
    }
So if you have onlyMorePopular=false, asking about Lucene returns Lucene
if it's in the index. But if we make it past that line
(onlyMorePopular=true), later there is:
    // don't suggest a word for itself, that would be silly
    if (sugWord.string.equals(word)) {
      continue;
    }
So you end up only getting all of the suggestions *but* Lucene, right?
You had to already know the word was misspelled, and now you're asking for
a better one. With the onlyMorePopular=false, you only get a correction
if the word is misspelled.
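
To make the two paths concrete, here is a minimal, self-contained sketch of
the control flow as I read it. It is not the real SpellChecker code: the
class name, the toy index, and its frequencies are made up for illustration,
every other indexed term is treated as a candidate (the real code scores
candidates by similarity), and the frequency filter follows the "greater
than or equal to" description above, which is an assumption on my part.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SuggestSketch {
    // Hypothetical toy index: term -> frequency. Values are invented.
    static final Map<String, Integer> INDEX = new LinkedHashMap<>();
    static {
        INDEX.put("lucene", 5);
        INDEX.put("luke", 9);
        INDEX.put("lucent", 2);
    }

    static List<String> suggest(String word, boolean morePopular) {
        int freq = INDEX.getOrDefault(word, 0);
        List<String> out = new ArrayList<>();

        // Early return: the word is in the index and we don't care
        // about frequency, so the word itself comes back.
        if (!morePopular && freq > 0) {
            out.add(word);
            return out;
        }

        for (Map.Entry<String, Integer> e : INDEX.entrySet()) {
            String candidate = e.getKey();
            // Never suggest the word for itself.
            if (candidate.equals(word)) {
                continue;
            }
            // Assumed filter: with morePopular on, drop candidates that
            // are less frequent than the original word.
            if (morePopular && e.getValue() < freq) {
                continue;
            }
            out.add(candidate);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(suggest("lucene", false)); // word itself
        System.out.println(suggest("lucene", true));  // never the word itself
        System.out.println(suggest("lucen", false));  // misspelled: all candidates
    }
}
```

With onlyMorePopular=false, an indexed word never reaches the loop, and with
onlyMorePopular=true the self-check in the loop means the original word can
never come back, whatever its frequency.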
It seems to me, if you are trying to use the suggested query that's built
up, you change the behavior beyond just:
onlyMorePopular=true considers tokens with a frequency greater than or equal
to the frequency of the original token.
- Mark