Work-around:
I discovered that if I enclose the first part of the expression in
parentheses, it works.
E.g.  replace($s, "^(a*?)b+", "x") returns "xcc"
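
For anyone who wants to compare the three forms side by side, here is a minimal sketch using only the standard fn:replace function and the same test string as the script quoted below. Per the XPath regex rules I would expect all three items to be "xcc"; on MarkLogic, the second is the one reported as coming back "xbcc". I am not including an expected-output listing, since the result depends on the engine and version.

```xquery
xquery version "1.0";

(: Same test string as in the original report. :)
let $s := "aabbcc"
return (
  (: 1. Greedy "a*": reported to work correctly. :)
  replace($s, "^a*b+", "x"),

  (: 2. Reluctant "a*?": the form reported to give "xbcc" on MarkLogic. :)
  replace($s, "^a*?b+", "x"),

  (: 3. Work-around: reluctant quantifier wrapped in parentheses. :)
  replace($s, "^(a*?)b+", "x")
)
```

Note that the parentheses in the third form create a capture group, so any "$1" in the replacement string would refer to the matched "a"s. With a literal "x" as the replacement that makes no difference.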

On Fri, May 21, 2010 at 2:56 PM, Maloney, Christopher (NIH/NLM/NCBI) [C] <malon...@ncbi.nlm.nih.gov> wrote:

>  xquery version "1.0";
>
>
>
> let $s := "aabbcc"
>
> return
>
>   <html xmlns="http://www.w3.org/1999/xhtml">
>
>     <head>
>
>       <title>regex-anomaly.xqy</title>
>
>     </head>
>
>     <body>
>
>       <h1>regex-anomaly.xqy</h1>
>
>       <p>
>
>         Demonstrate a MarkLogic regular expression bug.
>
>       </p>
>
>       <p>
>
>         Test string is "{$s}"
>
>       </p>
>
>       <p>
>
>         <b>Works correctly:</b>  match as many "a"s as you can
>
>         (greedily),
>
>         then one or more "b", and replace with an "x". We expect
>
>         "xcc":
>
>       </p>
>
>       <blockquote>
>
>         replace($s, "^a*b+", "x"):
>
>         "{replace($s, "^a*b+", "x")}"
>
>       </blockquote>
>
>       <p>
>
>         <b>Fails:</b>  the question mark should only make the "a*"
>
>         non-greedy, but it also seems to turn off greedy matching
>
>         for the "+" sign. The result should be the same "xcc",
>
>         but I am seeing "xbcc":
>
>       </p>
>
>       <blockquote>
>
>         replace($s, "^a*?b+", "x"):
>
>         "{replace($s, "^a*?b+", "x")}"
>
>       </blockquote>
>
>       <p>
>
>         I tested this same expression using Perl and using
>
>         Saxon, and they both give the expected results "xcc"
>
>         in both cases.
>
>       </p>
>
>     </body>
>
>   </html>
>
> Chris Maloney
>
> NIH/NLM/NCBI (Contractor)
>
> Building 45, 5AN.36D-17
>
> 301-443-6461
>
>
>
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general
>
>