Niu Danny via austin-group-l at The Open Group wrote in
 <[email protected]>:
 |Now I'm completely lost.

Me too.

 |So how about the informative text from "precedence of construct" perspective?
 |I discussed it a few days earlier with Steffen:

I seem to have lost track of which discussions are going on here.
But i do recall Geoff Clare's mail quoted below, and i found it
logical.  Before that he had already said

  What I meant, when I said it is not recursive, is something like:

  ([0-9]+[a-z]*)+?

  where the inner + and * are individually greedy; they don't inherit
  the outer repetition's non-greediness.

which i also found logical.
Mike Haertel's MinRX (cool thing, my old 1998 C++ knowledge feels
a bit lost with all the "auto &X =" things), with the patch

  diff --git a/tryit.c b/tryit.c
  index 67bfbfb9ec..82123254d2 100644
  --- a/tryit.c
  +++ b/tryit.c
  @@ -26,6 +26,8 @@ main(int argc, char *argv[])
                        eflags |= MINRX_REG_FIRSTSUB;
                else if (strcmp(argv[1], "-i") == 0)
                        cflags |= MINRX_REG_ICASE;
  +             else if (strcmp(argv[1], "-m") == 0)
  +                     cflags |= MINRX_REG_MINIMAL;
                else if (strcmp(argv[1], "-n") == 0)
                        cflags |= MINRX_REG_NEWLINE;
                else if (strcmp(argv[1], "-r") == 0)
  @@ -66,7 +68,9 @@ main(int argc, char *argv[])
                                        break;
                        for (int i = 0; i <= j; ++i)
                                if (rm[i].rm_so != -1)
  -                                     printf("(%d,%d)", (int) rm[i].rm_so, (int) rm[i].rm_eo);
  +                                     printf("(%d,%d=%.*s)",
  +                                             (int) rm[i].rm_so, (int) rm[i].rm_eo,
  +                                             (int) (rm[i].rm_eo-rm[i].rm_so), &argv[2][rm[i].rm_so]);
                                else
                                        printf("(?,?)");
                        putchar('\n');

spits out

  #?0|kent:minrx.git$ ./tryit 'X(([0-9a-zA-Z]+)([a-zA-Z]*))+Y' 'X000aaaYbbbY'
  (0,12=X000aaaYbbbY)(1,11=000aaaYbbb)(1,11=000aaaYbbb)(11,11=)
  #?0|kent:minrx.git$ ./tryit 'X(([0-9a-zA-Z]+)([a-zA-Z]*))+?Y' 'X000aaaYbbbY'
  (0,8=X000aaaY)(1,7=000aaa)(1,7=000aaa)(7,7=)
  #?0|kent:minrx.git$ ./tryit 'X(([0-9a-zA-Z]+?)([a-zA-Z]*))+?Y' 'X000aaaYbbbY'
  (0,8=X000aaaY)(1,7=000aaa)(1,4=000)(4,7=aaa)
  #?0|kent:minrx.git$ ./tryit 'X(([0-9a-zA-Z]+?)([a-zA-Z]*))+Y' 'X000aaaYbbbY'
  (0,12=X000aaaYbbbY)(1,11=000aaaYbbb)(1,4=000)(4,11=aaaYbbb)
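
For comparing against the regcomp()/regexec() of whatever libc is at hand, here is a minimal sketch that produces the same "(so,eo=text)" format as the patched tryit above. It is not taken from MinRX or from the bug note: the names fmt_spans and NSUB are made up for illustration, and it uses only the plain POSIX <regex.h> calls with REG_EXTENDED. An implementation without the non-greedy quantifiers may treat `+?' differently or reject it, so only the all-greedy pattern is a safe input.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define NSUB 8	/* hypothetical limit, enough for the patterns in this thread */

/*
 * Match the ERE `pat' against `subj' and write the spans of all reported
 * subexpressions into `buf' as "(so,eo=text)", or "(?,?)" for unset ones.
 * Returns 0 on a match, 1 if regexec() reported no match, and -1 if
 * regcomp() rejected the pattern.
 */
int
fmt_spans(const char *pat, const char *subj, char *buf, size_t size)
{
	regex_t re;
	regmatch_t rm[NSUB];
	size_t used = 0;
	int rv;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	if (regcomp(&re, pat, REG_EXTENDED) != 0)
		return -1;

	rv = regexec(&re, subj, NSUB, rm, 0);
	if (rv == 0) {
		/* Entry 0 is the overall match, 1..re_nsub are the groups */
		for (size_t i = 0; i <= re.re_nsub && i < NSUB && used < size; ++i) {
			int n;

			if (rm[i].rm_so != -1)
				n = snprintf(buf + used, size - used, "(%d,%d=%.*s)",
				    (int)rm[i].rm_so, (int)rm[i].rm_eo,
				    (int)(rm[i].rm_eo - rm[i].rm_so),
				    subj + rm[i].rm_so);
			else
				n = snprintf(buf + used, size - used, "(?,?)");
			if (n < 0)
				break;
			used += (size_t)n;	/* may exceed size on truncation */
		}
	}

	regfree(&re);
	return rv == 0 ? 0 : 1;
}
```

Called as fmt_spans("X(([0-9a-zA-Z]+)([a-zA-Z]*))+Y", "X000aaaYbbbY", buf, sizeof buf) followed by puts(buf), the first span should be the overall match (0,12=X000aaaYbbbY). The subexpression spans that follow are exactly the part worth comparing between implementations.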

And perl says:

  #?0|kent:nail.git$ perl -e '$i="X000aaaYbbbY"; if($i =~ "X(([0-9a-zA-Z]+)([a-zA-Z]*))+Y"){print ",$&,$1,$2,$3,\n"}'
  ,X000aaaYbbbY,000aaaYbbb,000aaaYbbb,,
  #?0|kent:nail.git$ perl -e '$i="X000aaaYbbbY"; if($i =~ "X(([0-9a-zA-Z]+)([a-zA-Z]*))+?Y"){print ",$&,$1,$2,$3,\n"}'
  ,X000aaaYbbbY,000aaaYbbb,000aaaYbbb,,
  #?0|kent:nail.git$ perl -e '$i="X000aaaYbbbY"; if($i =~ "X(([0-9a-zA-Z]+?)([a-zA-Z]*))+?Y"){print ",$&,$1,$2,$3,\n"}'
  ,X000aaaYbbbY,0aaaYbbb,0,aaaYbbb,
  #?0|kent:nail.git$ perl -e '$i="X000aaaYbbbY"; if($i =~ "X(([0-9a-zA-Z]+?)([a-zA-Z]*))+Y"){print ",$&,$1,$2,$3,\n"}'
  ,X000aaaYbbbY,0aaaYbbb,0,aaaYbbb,

Now i need to think about that.

Ciao already here,

 |> I was thinking about something like this:
 |> 
 |> The precedence of quantifiers is as follows (from highest to lowest):
 |> 
 |> 1. The length of any minimal-quantifier-modified subexpression
 |>    shall be such that it matches the shortest substring of the
 |>    subject string, in descending priority from left to right.
 |> 
 |> 2. Consistent with rule 1, the length of the overall match shall
 |>    be the longest possible.
 |> 
 |> 3. The length of any greedy-quantifier-modified subexpression
 |>    shall be such that it matches the longest substring of the
 |>    subject string, in descending priority from left to right.
 |
 |> On 12 Mar 2025, at 00:51, Geoff Clare via austin-group-l at The Open Group
 |> <[email protected]> wrote:
 |> 
 |> Niu Danny wrote, on 07 Mar 2025:
 |>> 
 |>>> On 6 Mar 2025, at 23:06, Geoff Clare via austin-group-l at The Open Group
 |>>> <[email protected]> wrote:
 |>>>> 
 |>>>> 
 |>>>> 
 |>>>> e.g. `([0-9]+)+?`
 |>>> 
 |>>> This is a pathological case because you are simultaneously asking for
 |>>> both the longest and shortest match for the SAME part of the string.
 |>>> Such cases ought not to occur in real-world use.
 |>>> 
 |>>> What I meant, when I said it is not recursive, is something like:
 |>>> 
 |>>> ([0-9]+[a-z]*)+?
 |>>> 
 |>>> where the inner + and * are individually greedy; they don't inherit
 |>>> the outer repetition's non-greediness.
 |>> 
 |>> But greedy quantifiers are still nested in a non-greedy one.
 |>> Would you say that this is also pathological, or do you 
 |>> have something else on your mind?
 |> 
 |> Here's a better example.  This uses the code from bug note 7094
 |> modified to take the ERE and string as arguments:
 |> 
 |> $ ./a.out 'X(([0-9a-zA-Z]+)([a-zA-Z]*))+?Y' 'X000aaaYbbbY' 
 |> regcomp returned: 0
 |> regexec returned: 0
 |> 0       8
 |> 1       7
 |> 1       7
 |> 7       7
 |> -1      -1
 |> 
 |> Because [0-9a-zA-Z]+ is greedy it matched 000aaa and [a-zA-Z]* was
 |> left matching the empty string.  If the ? modifier was recursive then
 |> [0-9a-zA-Z]+ would have matched just the first 0 and [a-zA-Z]* would
 |> have matched 00aaa.
 |> 
 |> -- 
 |> Geoff Clare <[email protected]>
 |> The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
 --End of <[email protected]>

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
