A NOTE has been added to this issue. 
====================================================================== 
https://austingroupbugs.net/view.php?id=1857 
====================================================================== 
Reported By:                dannyniu
Assigned To:                
====================================================================== 
Project:                    1003.1(2024)/Issue8
Issue ID:                   1857
Category:                   Base Definitions and Headers
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       DannyNiu/NJF 
Organization:               Individual 
User Reference:              
Section:                    9.1 Regular Expression Definitions # and others. 
Page Number:                179-180 and others 
Line Number:                6366-6368 and others. 
Interp Status:              --- 
Final Accepted Text:         
====================================================================== 
Date Submitted:             2024-09-14 12:54 UTC
Last Modified:              2024-09-24 10:46 UTC
====================================================================== 
Summary:                    Several problems with the new "lazy" regex
quantifier.
====================================================================== 

---------------------------------------------------------------------- 
 (0006881) geoffclare (manager) - 2024-09-24 10:46
 https://austingroupbugs.net/view.php?id=1857#c6881 
---------------------------------------------------------------------- 
Suggested changes ...

On page 179 line 6348 section 9.1, add a sentence:<blockquote>The matching
process is described in [xref to 9.2].</blockquote>
and move the remaining paragraphs of the "matching" definition to after
page 181 line 6413 section 9.2.

On page 179 line 6357 section 9.1, change:<blockquote>If the pattern
permits a variable number of matching characters and thus there is more
than one such sequence starting at that point, the longest such sequence is
matched. For example, the BRE "bb*" matches the second to fourth characters
of the string "abbbc", and the ERE "(wee|week)(knights|night)" matches all
ten characters of the string "weeknights".

Consistent with the whole match being the longest of the leftmost matches,
each subpattern, from left to right, shall match the longest possible
string.</blockquote>to:<blockquote>If the pattern permits a variable number
of matching characters and thus there is more than one such sequence
starting at that point, the matched sequence shall be the longest such
sequence for which any minimal repetitions (see [xref to 9.4.6]) used in
the match have the shortest possible match. For example, the BRE "bb*"
matches the second to fourth characters of the string "abbbc", and the ERE
"(wee|week)(knights|night)" matches all ten characters of the string
"weeknights". However, the ERE "(aaa??)*" matches only the first four
characters of the string "aaaaa", not all five, because in order to match
all five, "a??" would match with length one instead of zero; the ERE
"(aaa??)*|(aaa?)*" matches all five because the longest match is one which
does not use any minimal repetitions.

Consistent with the match for the entire regular expression being the
leftmost and longest for which any minimal repetitions used in the match
have the shortest possible match, each BRE or ERE in a concatenated set,
from left to right, shall match the longest possible string for which any
minimal repetitions used in the match for that BRE or ERE have the shortest
possible match.</blockquote>
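
The "(aaa??)*" example in the proposed text above can be checked by brute-force enumeration. The following is only a sketch of the proposed rule, not an implementation of a POSIX matcher: it assumes that "minimal repetitions have the shortest possible match" means every "a??" is zero-length for this particular ERE and string, and the names `ways_to_match`, `longest_any` and `longest_lazy_shortest` are hypothetical.

```python
from itertools import product

SUBJECT = "aaaaa"


def ways_to_match(subject):
    """Yield (length, lazy_lengths) for each way "(aaa??)*" matches a prefix.

    Each iteration of the group consumes "aa" plus zero or one further "a"
    for the minimal repetition "a??".
    """
    for iterations in range(len(subject) // 2 + 1):
        for lazy_lengths in product((0, 1), repeat=iterations):
            length = sum(2 + extra for extra in lazy_lengths)
            if subject.startswith("a" * length):
                yield length, lazy_lengths


matches = list(ways_to_match(SUBJECT))

# Longest match if the minimal repetitions are ignored (plain leftmost-longest).
longest_any = max(length for length, _ in matches)

# Longest match in which every "a??" has its shortest (zero-length) match.
# Assumption: zero length is always achievable here, so it is "the shortest".
longest_lazy_shortest = max(
    length for length, lazy in matches if not any(lazy)
)

print(longest_any, longest_lazy_shortest)
```

Under these assumptions the enumeration gives five for the unrestricted longest match and four once every "a??" is required to be zero-length, which agrees with the figures in the proposed text.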
On page 180 line 6367 section 9.1, change:<blockquote>the subpattern
"(.*?)" matches the empty string, since that is the longest possible match
for the ERE ".*?"</blockquote>to:<blockquote>the subexpression "(.*?)"
matches the empty string, since that is the longest possible match for
which the minimal repetition ".*?" has the shortest possible match (zero
length).</blockquote>
On page 179 line 6370 section 9.1, change:<blockquote>the longest sequence
shall be measured</blockquote>to:<blockquote>the sequence length shall be
measured</blockquote>
After page 191 line 6814 section 9.5, add a
note:<blockquote><small><b>Note:</b> The grammar defines syntax only and
places no requirements on implementations as to how the parsed BRE or ERE
is used for matching. The matching process is described in [xref to
9.2].</small></blockquote>
After XRAT page 3716 line 127617 section A.9.4.6, add a
paragraph:<blockquote>Note that the repetition modifier '?'
(<question-mark>) is specified as changing the matching behavior for the
modified repetition from the leftmost longest possible match to the
leftmost shortest possible match. This does not necessarily give the same
result as matching with the least repetitions. For example, the ERE
"([ab]{6}|a)*?b" matches the first five characters of the string "aaaabbbb"
as this is the shortest for the minimal repetition "*?". Matching with the
least repetitions would match the first seven characters by using one
repetition of "[ab]{6}" instead of four repetitions of "a".</blockquote> 
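
The difference between "shortest match" and "least repetitions" in the proposed XRAT paragraph can be illustrated with a small enumeration. This is a sketch under stated assumptions, not a POSIX matcher: it considers only matches starting at the first character, it uses Python's `re` solely to test a single repetition of the group against an exact substring, and the helper names `fewest_repetitions` and `candidates` are hypothetical.

```python
import re

SUBJECT = "aaaabbbb"
ITEM = re.compile(r"[ab]{6}|a")  # one repetition of the group "([ab]{6}|a)"


def fewest_repetitions(end):
    """Fewest ITEM repetitions exactly covering SUBJECT[:end], or None."""
    best = {0: 0}  # prefix length -> fewest repetitions reaching it
    for stop in range(1, end + 1):
        for start in range(stop):
            if start in best and ITEM.fullmatch(SUBJECT, start, stop):
                best[stop] = min(best.get(stop, end + 1), best[start] + 1)
    return best.get(end)


def candidates():
    """Map each possible length of the "*?" part to its fewest repetitions.

    Only matches starting at offset 0 are considered, and the "*?" part must
    be followed by the literal "b" that ends the ERE.
    """
    found = {}
    for end in range(len(SUBJECT)):
        reps = fewest_repetitions(end)
        if SUBJECT[end] == "b" and reps is not None:
            found[end] = reps
    return found


options = candidates()
shortest = min(options)                 # shortest match for the "*?" part
fewest = min(options, key=options.get)  # match using the fewest repetitions

print("shortest-match length:", shortest + 1)
print("fewest-repetitions length:", fewest + 1)
```

With the string from the note, the shortest "*?" part is four characters (four repetitions of "a"), giving a five-character match, while the fewest-repetitions choice is one repetition of "[ab]{6}", giving a seven-character match.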

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2024-09-14 12:54 dannyniu       New Issue                                    
2024-09-14 12:54 dannyniu       Name                      => DannyNiu/NJF    
2024-09-14 12:54 dannyniu       Organization              => Individual      
2024-09-14 12:54 dannyniu       Section                   => 9.1 Regular
Expression Definitions # and others.
2024-09-14 12:54 dannyniu       Page Number               => 179-180 and others
2024-09-14 12:54 dannyniu       Line Number               => 6366-6368 and
others.
2024-09-20 08:05 dannyniu       Note Added: 0006879                          
2024-09-20 08:07 dannyniu       Note Edited: 0006879                         
2024-09-20 08:13 dannyniu       Note Edited: 0006879                         
2024-09-23 08:56 geoffclare     Note Added: 0006880                          
2024-09-24 10:46 geoffclare     Note Added: 0006881                          
======================================================================

