[it seems to me Mantis should set Reply-To: and/or Mail-Followup-To: to [email protected], as it did before?]
Austin Group Issue Tracker wrote in
<l6pptcinzl3kuidmhbp0tlfuiskjkmdpkkmroi...@www.austingroupbugs.net>:
...
|https://www.austingroupbugs.net/view.php?id=1857
...
| (0007090) dannyniu (reporter) - 2025-03-04 14:56
| https://www.austingroupbugs.net/view.php?id=1857#c7090
|----------------------------------------------------------------------
|For the sake of public record, I'm duplicating mailing list message \
|to note here
|that Geoff's step-by-step analysis of my torture testing case (at
|https://www.austingroupbugs.net/view.php?id=1857#c6898 ) is inconsistent \
|with
|macOS `grep`. Here's my terminal output:
...
|Whether this is indeed a bug in software with no change to the standard \
|text
|needed, or that the standard text itself is in error is arguable.
...

I argue in favour of what is the resolution of this bug, and which
reads (in parts):

  If the pattern permits a variable number of matching characters and
  thus there is more than one such sequence starting at that point,
  the matched sequence shall be the longest such sequence for which
  any minimal repetitions (see [xref to 9.4.6]) used in the match have
  the shortest possible match. For example, the BRE "bb*" matches the
  second to fourth characters of the string "abbbc", and the ERE
  "(wee|week)(knights|night)" matches all ten characters of the string
  "weeknights". However, the ERE "(aaa??)*" matches only the first
  four characters of the string "aaaaa", not all five, because in
  order to match all five, "a??" would match with length one instead
  of zero; the ERE "(aaa??)*|(aaa?)*" matches all five because the
  longest match is one which does not use any minimal repetitions.

  Consistent with the match for the entire regular expression being
  the leftmost and longest for which any minimal repetitions used in
  the match have the shortest possible match, each BRE or ERE in a
  concatenated set, from left to right, shall match the longest
  possible string for which any minimal repetitions used in the match
  for that BRE or ERE have the shortest possible match.

and

  Note that the repetition modifier '?' (<question-mark>) is specified
  as changing the matching behavior for the modified repetition from
  the leftmost longest possible match to the leftmost shortest
  possible match. This does not necessarily give the same result as
  matching with the least repetitions. For example, the ERE
  "([ab]{6}|a)*?b" matches the first five characters of the string
  "aaaabbbb" as this is the shortest for the minimal repetition "*?".
  Matching with the least repetitions would match the first seven
  characters by using one repetition of "[ab]{6}" instead of four
  repetitions of "a". This distinction is only possible because the
  alternatives in an ERE alternation are chosen according to which
  gives the longest (or shortest) match. Other types of regular
  expression exist (notably in perl, php, and python) where the
  alternatives are tried in order; for those there is no difference
  between longest and most repetitions or between shortest and least
  repetitions.

I, btw, also posted that (and a bit more) to Mike Haertel of the GNU
project, who is developing the new minrx regular expression library
(and the widely used GNU awk of Aharon Robbins is hooked to it), after
he had written

+[.] Nowhere else in the standard
+is the word subpattern defined or used. Are the subpatterns of ABC:
+AB and C, A and BC, or A, B, and C? The standard doesn't say. If the
+subpatterns are A, B, and C, then the standard is saying to maximize A at
+the expense of B and C. However the only example in corresponding
+paragraph of the standard has just 2 subpatterns.

, and also quoting parts of what Geoff Clare said just recently
(Thu, 27 Feb 2025 10:13:14 +0000):

I also apologise for not having intellectually penetrated the notable
differences between POSIX EREs and perl etc regular expressions back
in 2013, when I opened

  https://www.austingroupbugs.net/view.php?id=793

to include "shortest possible match"es. I was coming from programming
solutions, and certain problems simply cannot be solved with regular
expressions except by matching against shortest possible matches.

Ciao,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
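
As a rough illustration of the quoted note on the ERE "([ab]{6}|a)*?b"
and the string "aaaabbbb", the following Python sketch contrasts the two
readings. It is not a POSIX ERE implementation. Python's re module is
used only as an example of an engine that tries alternatives in order,
and the helper names are hypothetical. The second helper is a brute-force
check hard-wired to this one expression: it looks for the shortest text
the repeated group can cover such that a "b" follows.

```python
import re

ERE = r"([ab]{6}|a)*?b"
TEXT = "aaaabbbb"


def ordered_alternation_length(pattern: str, text: str) -> int:
    """Length matched at the start of text by Python's re, which tries
    the alternatives of an alternation in the order they are written."""
    m = re.match(pattern, text)
    return m.end() if m else -1


def shortest_repetition_length(text: str) -> int:
    """Brute force for this one ERE only (an assumption-laden sketch):
    find the shortest prefix that the group '([ab]{6}|a)*' can cover
    completely and that is directly followed by 'b'."""
    for k in range(len(text) + 1):
        covered = re.fullmatch(r"(?:[ab]{6}|a)*", text[:k])
        if covered and text[k:k + 1] == "b":
            return k + 1  # the k covered characters plus the final 'b'
    return -1


print(ordered_alternation_length(ERE, TEXT))  # 7: one "[ab]{6}", then "b"
print(shortest_repetition_length(TEXT))       # 5: four times "a", then "b"
```

The first result corresponds to "matching with the least repetitions",
the second to the "shortest possible match" for the minimal repetition
as described in the quoted text.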
