To explain a little: if you allow post-filtering, positive searches
return the same results with or without word positions. You only delay
verification of distance until after index resolution. Index resolution
returns too many candidates; the extras are the so-called false
positives. Filtering is usually slower, because it typically involves
accessing the data itself.
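
One way to see the false positives is to compare what index resolution
alone reports with what survives filtering. This is a minimal sketch,
assuming it is run in Query Console against a database that does not
have word positions enabled; the words and the distance are borrowed from
the query quoted further down, and searching fn:doc() is only an
illustration:

```xquery
xquery version "1.0-ml";

(: near-query from the thread: "bad" within 2 words of "word" :)
let $q := cts:near-query((cts:word-query("bad"), cts:word-query("word")), 2)
return (
  (: index resolution only: counts every candidate fragment :)
  xdmp:estimate(cts:search(fn:doc(), $q)),
  (: filtered search: each candidate is verified against the content :)
  fn:count(cts:search(fn:doc(), $q, "filtered"))
)
```

If the first number is larger than the second, the difference is the
set of false positives that filtering removed.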

With negative searches that still applies, but the false positives become
false negatives (which means good results are excluded mistakenly),
because the not-query is applied before discriminating the good from the
bad word distances. With word positions enabled, you allow MarkLogic to
derive distance from the index, so the near-query becomes accurate. No
false positives, so no false negatives either.
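
For reference, word positions can also be switched on through the Admin
API instead of the admin UI. This is a minimal sketch, assuming it runs
against the content database in question with a user that has admin
privileges; existing content typically has to be reindexed before the
setting affects results:

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: assumption: the current database is the one holding the stored queries :)
let $config := admin:get-configuration()
let $config := admin:database-set-word-positions(
                 $config, xdmp:database(), fn:true())
return admin:save-configuration($config)
```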

Cheers,
Geert

On 12/24/15, 10:39 AM, "neil bradley" <[email protected]> wrote:

>Thanks all for the answers. I have switched on word positions and it is
>now behaving as expected. I was a bit surprised by this problem, because
>in the admin UI it simply says that enabling word positions will result
>in "Index word positions for faster phrase and near searches (slower
>document loads and larger database files)." It does not say that the
>actual results will be different. I would no doubt have enabled that
>index when I was chasing performance improvements later, but did not
>consider it to be a vital thing to do immediately for functionality to
>work.
>
>Thanks again.
>
>Neil.
>
>
>
>
>on 24/12/15 5:54 AM, Geert Josten <[email protected]> wrote:
>
>> Hi Neil,
>>
>> Did you enable word positions in your database? If not, that means
>> distance is only considered in the filtering phase, and a near-query
>> effectively behaves like an and-query when run unfiltered. That likely
>> includes false positives, which can turn into false negatives when
>> wrapped in a not-query.
>>
>> Cheers,
>> Geert
>>
>> On 12/23/15, 4:50 PM, "[email protected] on behalf
>> of "neil bradley"" <[email protected] on behalf of
>> [email protected]> wrote:
>>
>>>I am having a problem with using near queries within not queries when
>>>the queries are stored and used in a reverse query.
>>>
>>>This fails, no matter how far apart "bad" and "word" are. The distance
>>>value seems to be ignored. The query will not return the stored query
>>>even if the words "bad" and "word" are hundreds of words apart.
>>>
>>>let $XMLQuery :=
>>><cts:and-query xmlns:cts="http://marklogic.com/cts">
>>>  <cts:not-query>
>>>    <cts:near-query distance="2">
>>>      <cts:word-query><cts:text xml:lang="en">bad</cts:text></cts:word-query>
>>>      <cts:word-query><cts:text xml:lang="en">word</cts:text></cts:word-query>
>>>    </cts:near-query>
>>>  </cts:not-query>
>>>  <cts:or-query>
>>>    <cts:word-query><cts:text xml:lang="en">red</cts:text></cts:word-query>
>>>    <cts:word-query><cts:text xml:lang="en">yellow</cts:text></cts:word-query>
>>>  </cts:or-query>
>>></cts:and-query>
>>>let $File := xdmp:document-insert("test.xml", <Query>{$XMLQuery}</Query>)
>>>return
>>>  ( )
>>>
>>> ;
>>>
>>>let $XML := <X>this contains yellow, with a bad a a a a a a word.</X>
>>>return
>>>  cts:search( /Query, cts:reverse-query($XML))
>>>
>>>
>>>But strangely, when I do something similar in memory, it works fine.
>>>So this returns "true" as I would expect:
>>>
>>>let $XMLQuery := ...
>>>let $XML := <X>this contains yellow, with a bad a a a a a a word.</X>
>>>return
>>>  cts:contains(cts:query($XMLQuery), cts:reverse-query($XML))
>>>
>>>And this returns "false", as I would also expect because the words are
>>>close together, so should prevent the match...
>>>
>>>  let $XML := <X>this contains yellow, with a bad a word.</X>
>>>
>>>
>>>Neil.
>>>_______________________________________________
>>>General mailing list
>>>[email protected]
>>>Manage your subscription at:
>>>http://developer.marklogic.com/mailman/listinfo/general
>>
>>
>>
>>
