A NOTE has been added to this issue.
======================================================================
https://austingroupbugs.net/view.php?id=1857
======================================================================
Reported By:                dannyniu
Assigned To:
======================================================================
Project:                    1003.1(2024)/Issue8
Issue ID:                   1857
Category:                   Base Definitions and Headers
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       DannyNiu/NJF
Organization:               Individual
User Reference:
Section:                    9.1 Regular Expression Definitions # and others.
Page Number:                179-180 and others
Line Number:                6366-6368 and others.
Interp Status:              ---
Final Accepted Text:
======================================================================
Date Submitted:             2024-09-14 12:54 UTC
Last Modified:              2024-09-23 08:56 UTC
======================================================================
Summary:                    Several problems with the new "lazy" regex quantifier.
======================================================================
----------------------------------------------------------------------
 (0006880) geoffclare (manager) - 2024-09-23 08:56
 https://austingroupbugs.net/view.php?id=1857#c6880
----------------------------------------------------------------------
> For quantifiers without the `?` lazy quantifier, the most number of possible repetition is the fittest in terms of length; likewise, for quantifiers with the `?` lazy quantifier, the least number of possible repetition is the fittest in terms of length.

This would change the established-for-decades "longest" requirement to "most repetitions", which is not the same thing.

And it turns out that on macOS the '?' modifier does not change to matching the least repetitions, it is shortest match; the re_format(7) man page is wrong.

Tested using the program at the end of https://posix.rhansen.org/p/2020-11-09 with REG_MINIMAL removed:

<pre>$ ./a.out '([ab]{6}|a)*?b' aaaabbbb
regexec() returned 0
rm_so 0, rm_eo 5</pre>

(Least repetitions would give rm_eo 7.)

Same test with grep, using -o to see what matched:

<pre>$ echo aaaabbbb | grep -E -o '([ab]{6}|a)*?b'
aaaab
b
b
b</pre>

This behaviour makes sense as the whole point of REG_MINIMAL and the '?' modifier is to change to the opposite greediness, and the opposite of longest is shortest. Having the default as longest and REG_MINIMAL/'?' as least repetitions would produce the same output in the above tests with and without the '?', making them pointless in such cases.

Issue History
Date Modified    Username       Field                    Change
======================================================================
2024-09-14 12:54 dannyniu       New Issue
2024-09-14 12:54 dannyniu       Name                      => DannyNiu/NJF
2024-09-14 12:54 dannyniu       Organization              => Individual
2024-09-14 12:54 dannyniu       Section                   => 9.1 Regular Expression Definitions # and others.
2024-09-14 12:54 dannyniu       Page Number               => 179-180 and others
2024-09-14 12:54 dannyniu       Line Number               => 6366-6368 and others.
2024-09-20 08:05 dannyniu       Note Added: 0006879
2024-09-20 08:07 dannyniu       Note Edited: 0006879
2024-09-20 08:13 dannyniu       Note Edited: 0006879
2024-09-23 08:56 geoffclare     Note Added: 0006880
======================================================================
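
Editorial illustration (not part of the original note): the difference between ranking candidate matches by length and ranking them by repetition count in note 0006880 can be checked by brute force. The sketch below is an assumption-laden aid, not the test program referenced in the note. It hard-codes the pattern `([ab]{6}|a)*b` (the `?` modifier dropped, since the code only enumerates candidates), and the helper name `candidate_matches` is hypothetical. It lists every way the pattern can match at offset 0 of `aaaabbbb`, then picks the end offset under each selection rule.

```python
def candidate_matches(text, start=0):
    """Enumerate every way ([ab]{6}|a)*b can match beginning at `start`.

    Returns a sorted list of (repetitions, end_offset) pairs, where
    end_offset corresponds to rm_eo in the regexec() output.
    """
    found = set()

    def walk(pos, reps):
        # Try to finish the match here with the trailing 'b'.
        if pos < len(text) and text[pos] == "b":
            found.add((reps, pos + 1))
        # Alternative 1: one more repetition consuming [ab]{6}.
        chunk = text[pos:pos + 6]
        if len(chunk) == 6 and set(chunk) <= {"a", "b"}:
            walk(pos + 6, reps + 1)
        # Alternative 2: one more repetition consuming a single 'a'.
        if pos < len(text) and text[pos] == "a":
            walk(pos + 1, reps + 1)

    walk(start, 0)
    return sorted(found)


matches = candidate_matches("aaaabbbb")

longest_end = max(end for _, end in matches)    # rule: longest match
shortest_end = min(end for _, end in matches)   # rule: shortest match
fewest_reps_end = min(matches)[1]               # rule: least repetitions
most_reps_end = max(matches)[1]                 # rule: most repetitions

print("candidates (repetitions, rm_eo):", matches)
print("longest:", longest_end, "most repetitions:", most_reps_end)
print("shortest:", shortest_end, "least repetitions:", fewest_reps_end)
```

For this input the candidates are one repetition ending at 7, two repetitions ending at 8, and four repetitions ending at 5. Shortest match therefore gives rm_eo 5 and least repetitions gives rm_eo 7, as stated in the note, while longest (8) and most repetitions (5) pick different candidates.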