LOL - I sure wish it was! :)
Sadly, that was a typo (Luke, for all its beauties, does not seem to grasp
the concept of a clipboard so the sample was a manual transcription).
A few more details - don't know if this will help or not.
Same query as before, when I do a rewrite of the query in Luke I get back a
set of 44 matching tokens for the given "Austell" ranging between a boost of
0.1429 for "turrel" up to 1.0 for "austell". That the token "austell" gets
a 1.0 is rather obvious since it's a perfect match...
BUT - when I rewrite the query in my local code, I get 2 matches, austell
(at 0.14285707) and (austerlitz at 0.20000005). This is disturbing on two
fronts - first of all where are the other 42? And secondly - why is the
exact match evaluating to such a low boost?
Does that point at all to where I'm going astray?
Thanks!
Casey
On 5/23/08 5:00 AM, "Ian Lea" <[EMAIL PROTECTED]> wrote:
>> ...
>> And expect to match document 156297 (search_text=="Austell GA", type==1).
>> ...
>> System.out.println(searcher.explain(query, 156296));
>
> 156297 != 156296
>
> Could that be it?
>
>
> --
> Ian.
>
>
> On Thu, May 22, 2008 at 11:21 PM, Casey Dement <[EMAIL PROTECTED]> wrote:
>> Hi - trying to execute a search in Lucene and getting results I don't
>> understand :(
>>
>> The index contains fields search_text and type - both indexed tokenized.
>> I'm attempting to execute the query:
>>
>> +(search_text:austell~0.9 search_text:ga~0.9) +(type:1 type:4)
>>
>> And expect to match document 156297 (search_text=="Austell GA", type==1).
>>
>> I am executing this query both directly in code and via the tool Luke - but
>> getting WILDLY different answers. In Luke, the expected document is found
>> no problem, but in my own code I find no results. Obviously I suspect my
>> code of being crap ;)
>>
>> Oh, FYI, in both my local code and Luke I am using a StandardAnalyzer and
>> the default column is "search_text".
>>
>> Here's what I'm doing:
>>
>> /******************************************************************/
>> File location = new File("/the/correct/path");
>> IndexReader index = IndexReader.open(location);
>> Searcher searcher = new IndexSearcher(index);
>> QueryParser parser = new QueryParser("search_text", new
>> StandardAnalyzer());
>> Query query = parser.parse("+(search_text:austell~0.9 search_text:ga~0.9)
>> +(type:1 type:4)");
>> System.out.println(searcher.explain(query, 156296));
>> /******************************************************************/
>>
>> When I run this, I get:
>> | 0.0000 = (NON-MATCH) Failure to meet condition(s) of required/prohibited
>> clause(s)
>> | 0.0000 = no match on required clause (() ())
>> | 0.0000 = (NON-MATCH) product of:
>> | 0.0000 = (NON-MATCH) sum of:
>> | 0.0000 = coord(0/2)
>> | 0.2133 = (MATCH) product of:
>> | 0.4267 = (MATCH) sum of:
>> | 0.4267 = (MATCH) weight(type:1 in 156296), product of:
>> | 0.3672 = queryWeight(type:1), product of:
>> | 1.1618 = idf(docFreq=315734, numDocs=371197)
>> | 0.3161 = queryNorm
>> | 1.1618 = (MATCH) fieldWeight(type:1 in 156296), product of:
>> | 1.0000 = tf(termFreq(type:1)=1)
>> | 1.1618 = idf(docFreq=315734, numDocs=371197)
>> | 1.0000 = fieldNorm(field=type, doc=156296)
>> | 0.5000 = coord(1/2)
>>
>> So obviously I'm loading the index (since it did match the "type") - but it
>> seems to be COMPLETELY ignoring the criteria on "search_text".
>>
>> When I run this exact same string in Luke, I get:
>> | 8.0079 = (MATCH) sum of:
>> | 7.9578 = (MATCH) sum of:
>> | 5.4904 = (MATCH) weight(search_text:austell in 156297), product of:
>> | 0.8074 = queryWeight(search_text:austell), product of:
>> | 10.8800 = idf(docFreq=18, numDocs=371197)
>> | 0.0742 = queryNorm
>> | 6.8000 = (MATCH) fieldWeight(search_text:austell in 156297),
>> product of:
>> | 1.0000 = tf(termFreq(search_text:austell)=1)
>> | 10.8800 = idf(docFreq=18, numDocs=371197)
>> | 0.6250 = fieldNorm(field=search_text, doc=156297)
>> | 2.4673 = (MATCH) weight(search_text:ga in 156297), product of:
>> | 0.5413 = queryWeight(search_text:ga), product of:
>> | 7.2936 = idf(docFreq=685, numDocs=371197)
>> | 0.0742 = queryNorm
>> | 4.5585 = (MATCH) fieldWeight(search_text:ga in 156297), product of:
>> | 1.0000 = tf(termFreq(search_text:ga)=1)
>> | 7.2936 = idf(docFreq=685, numDocs=371197)
>> | 0.6250 = fieldNorm(field=search_text, doc=156297)
>> | 0.0501 = (MATCH) product of:
>> | 0.1002 = (MATCH) sum of:
>> | 0.1002 = (MATCH) weight(type:1 in 156296), product of:
>> | 0.0862 = queryWeight(type:1), product of:
>> | 1.1618 = idf(docFreq=315734, numDocs=371197)
>> | 0.0742 = queryNorm
>> | 1.1618 = (MATCH) fieldWeight(type:1 in 156296), product of:
>> | 1.0000 = tf(termFreq(type:1)=1)
>> | 1.1618 = idf(docFreq=315734, numDocs=371197)
>> | 1.0000 = fieldNorm(field=type, doc=156296)
>> | 0.5000 = coord(1/2)
>>
>> Which while clearly looking at the same document ID in the same index is
>> conversely working perfectly!
>>
>> Does anybody have any idea where I am screwing up? Thanks!
>>
>> Casey
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]