If I am searching for matches in a doc using OR'd word queries, and one query
(A) happens to contain the text of another query (B), cts:highlight splits the
match for query A into separate pieces instead of highlighting it as a whole.
Here's a simplified example:
let $p := <p>From the desk of a Top Secret Government Agency.</p>
return
  cts:highlight($p,
    cts:or-query((
      cts:word-query("Top Secret Government Agency"),
      cts:word-query("Secret Government Agency"))),
    <m>{$cts:text}</m>)
The output is:
<p>From the desk of a <m>Top </m><m>Secret Government Agency</m>.</p>
Ideally, I would expect:
<p>From the desk of a <m>Top <m>Secret Government Agency</m></m>.</p>
but even this would be fine:
<p>From the desk of a <m>Top Secret Government Agency</m>.</p>
I tried to intervene inside the replacement expression of cts:highlight, using
xdmp:set to keep track of matches and skipping the current match if it was
already in the set of earlier matches. But the output is not an artifact of one
replacement being followed by another and mangling the result: cts:highlight
really does report "Top " and "Secret Government Agency" as the matches.
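A minimal sketch of that kind of bookkeeping might look like the following.
The variable name $seen and the fn:contains test are illustrative assumptions,
not the exact code that was tried. Because the matches arrive already split as
described above, the "already matched" branch never gets a chance to help.

```xquery
xquery version "1.0-ml";

let $p := <p>From the desk of a Top Secret Government Agency.</p>
(: $seen is a hypothetical accumulator for match text handled so far :)
let $seen := ()
return
  cts:highlight($p,
    cts:or-query((
      cts:word-query("Top Secret Government Agency"),
      cts:word-query("Secret Government Agency"))),
    (: emit plain text if an earlier match already contains this one :)
    if (some $s in $seen satisfies fn:contains($s, $cts:text))
    then $cts:text
    else (
      (: record the match text, then wrap it as usual :)
      xdmp:set($seen, ($seen, $cts:text)),
      <m>{$cts:text}</m>
    ))
```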
Any suggestions on how to prevent this from happening?
Thanks,
Will