A NOTE has been added to this issue. 
====================================================================== 
https://www.austingroupbugs.net/view.php?id=1857 
====================================================================== 
Reported By:                dannyniu
Assigned To:                
====================================================================== 
Project:                    1003.1(2024)/Issue8
Issue ID:                   1857
Category:                   Base Definitions and Headers
Tags:                       tc1-2024
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     Interpretation Required
Name:                       DannyNiu/NJF 
Organization:               Individual 
User Reference:              
Section:                    9.1 Regular Expression Definitions # and others. 
Page Number:                179-180 and others 
Line Number:                6366-6368 and others. 
Interp Status:              Approved 
Final Accepted Text:       
https://www.austingroupbugs.net/view.php?id=1857#c6919 
Resolution:                 Accepted As Marked
Fixed in Version:           
====================================================================== 
Date Submitted:             2024-09-14 12:54 UTC
Last Modified:              2025-03-20 15:57 UTC
====================================================================== 
Summary:                    Several problems with the new "lazy" regex
quantifier.
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          0001877 ISO editors Issue 8 comment 068
====================================================================== 

---------------------------------------------------------------------- 
 (0007130) geoffclare (manager) - 2025-03-20 15:57
 https://www.austingroupbugs.net/view.php?id=1857#c7130 
---------------------------------------------------------------------- 
On page 179 line 6348 section 9.1, add a sentence:
<blockquote>The matching process is described in [xref to 9.2].</blockquote>
and move the remaining paragraphs of the "matching" definition to after page 181
line 6413 section 9.2.

On page 179 line 6357 section 9.1, change:
<blockquote>If the pattern permits a variable number of matching characters and
thus there is more than one such sequence starting at that point, the longest
such sequence is matched. For example, the BRE "bb*" matches the second to
fourth characters of the string "abbbc", and the ERE "(wee|week)(knights|night)"
matches all ten characters of the string "weeknights".

Consistent with the whole match being the longest of the leftmost matches, each
subpattern, from left to right, shall match the longest possible
string.</blockquote>
to:
<blockquote>If the pattern permits a variable number of matching characters and
thus there is more than one such sequence starting at that point, the match
shall be made according to the following rules:

1. For a BRE, or an ERE that does not use the repetition modifier '?', the match
shall be the leftmost longest.

2. If an ERE contains repetitions with and without the repetition modifier '?',
the precedence between the repetitions shall be:

a. Each leftmost shortest match shall match the leftmost shortest sequence in
the string, in descending priority from left to right.

b. Consistent with rule 2a, the length matched by the entire regular expression
shall be the leftmost longest.

c. Consistent with rules 2a and 2b, each leftmost longest match shall match the
leftmost longest sequence in the string, in descending priority from left to
right.

d. If an attempt is made to match the same sequence of the string using
repetitions both with and without the repetition modifier '?', the behavior is
unspecified. For example, the ERE ([0-9]+)+? has unspecified behavior.

According to these rules, the BRE "bb*" matches the second to fourth characters
of the string "abbbc", and the ERE "(wee|week)(knights|night)" matches all ten
characters of the string "weeknights". However, the ERE "(aaa??)*" matches only
the first four characters of the string "aaaaa", not all five, because in order
to match all five, "a??" would match with length one instead of zero; the ERE
"(aaa??)*|(aaa?)*" matches all five because the longest match is one which does
not use any minimal repetitions.

Consistent with the match for the entire regular expression being made according
to the above rules, each BRE or ERE in a concatenated set, from left to right,
shall match according to the above rules, applied to that BRE or
ERE.</blockquote>
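
As an illustrative aside (not part of the proposed standard text), the "(aaa??)*" examples above can be compared against an engine that tries alternatives in order. The sketch below uses Python's re module, which is not a POSIX ERE implementation; the variable names are hypothetical. It only shows where an ordered-alternation engine diverges from the results described above.

```python
import re

# Assumption: Python's re is a backtracking engine that tries alternatives
# in order. It is NOT a POSIX ERE matcher and is used only for contrast.
text = "aaaaa"

# BRE example from the text: "bb*" against "abbbc" (same syntax in Python).
bre_like = re.search(r"bb*", "abbbc")
print(bre_like.span())        # (1, 4): the second to fourth characters

# "(aaa??)*": the minimal "a??" matches zero characters in each repetition,
# so only four characters are matched, as in the POSIX description.
lazy_only = re.match(r"(aaa??)*", text)
print(lazy_only.end())        # 4

# "(aaa??)*|(aaa?)*": the rules above say a POSIX ERE matches all five
# characters. Python stops at the first alternative that succeeds.
lazy_first = re.match(r"(aaa??)*|(aaa?)*", text)
print(lazy_first.end())       # 4

# Swapping the alternatives changes Python's answer, which shows that the
# result depends on alternative order rather than on overall length.
greedy_first = re.match(r"(aaa?)*|(aaa??)*", text)
print(greedy_first.end())     # 5
```

The order dependence in the last two calls is the behavior that the leftmost-longest rule for the entire regular expression is meant to rule out for EREs.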

On page 180 line 6367 section 9.1, change:
<blockquote>the subpattern "(.*?)" matches the empty string, since that is the
longest possible match for the ERE ".*?"</blockquote>
to:
<blockquote>the subexpression "(.*?)" matches the empty string, since the
minimal repetition ".*?" has priority and the empty string is the shortest
possible match (zero length) for that repetition.</blockquote>

On page 179 line 6370 section 9.1, change:
<blockquote>the longest sequence shall be measured</blockquote>
to:
<blockquote>the sequence length shall be measured</blockquote>

After page 191 line 6814 section 9.5, add a note:
<blockquote><small><b>Note:</b> The grammar defines syntax only and places no
requirements on implementations as to how the parsed BRE or ERE is used for
matching. The matching process is described in [xref to
9.2].</small></blockquote>

After XRAT page 3716 line 127617 section A.9.4.6, add a paragraph:
<blockquote>Note that the repetition modifier '?' (<question-mark>) is specified
as changing the matching behavior for the modified repetition from the leftmost
longest possible match to the leftmost shortest possible match. This does not
necessarily give the same result as matching with the least repetitions. For
example, the ERE "([ab]{6}|a)*?b" matches the first five characters of the
string "aaaabbbb" as this is the shortest for the minimal repetition "*?".
Matching with the least repetitions would match the first seven characters by
using one repetition of "[ab]{6}" instead of four repetitions of "a". This
distinction is only possible because the alternatives in an ERE alternation are
chosen according to which gives the longest (or shortest) match. Other types of
regular expression exist (notably in <i>perl</i>, <i>php</i>, and <i>python</i>)
where the alternatives are tried in order; for those there is no difference
between longest and most repetitions or between shortest and least
repetitions.</blockquote> 
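
The difference between "shortest match" and "least repetitions" in the example above can be checked by hand. The following is a minimal sketch using Python's re module, which tries alternatives in order and is therefore one of the "other types of regular expression" mentioned in the paragraph. The brute-force enumeration of candidate lengths is a hypothetical helper for illustration only; it is not an implementation of the matching rules.

```python
import re

pattern = r"([ab]{6}|a)*?b"
text = "aaaabbbb"

# Ordered-alternation engine: the first successful path uses one repetition
# of "[ab]{6}" followed by "b", i.e. the least repetitions.
ordered = re.match(pattern, text)
print(ordered.group(0))       # aaaabbb (seven characters)

# Illustration only: list every prefix length that the pattern can match in
# full, by asking the backtracking engine to match each prefix exactly.
candidate_lengths = [
    n for n in range(len(text) + 1)
    if re.fullmatch(pattern, text[:n])
]
print(candidate_lengths)      # [5, 7, 8]

# The shortest candidate is the five-character match "aaaab" (four
# repetitions of "a" and then "b"), which the paragraph above gives as the
# result for the minimal repetition "*?".
print(min(candidate_lengths)) # 5
```

Here the shortest candidate (five characters) needs four repetitions, while the least-repetitions path (one repetition) yields seven characters, which is the distinction the paragraph draws.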

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2024-09-14 12:54 dannyniu       New Issue                                    
2024-09-14 12:54 dannyniu       Name                      => DannyNiu/NJF    
2024-09-14 12:54 dannyniu       Organization              => Individual      
2024-09-14 12:54 dannyniu       Section                   => 9.1 Regular
Expression Definitions # and others.
2024-09-14 12:54 dannyniu       Page Number               => 179-180 and others
2024-09-14 12:54 dannyniu       Line Number               => 6366-6368 and
others.
2024-09-20 08:05 dannyniu       Note Added: 0006879                          
2024-09-20 08:07 dannyniu       Note Edited: 0006879                         
2024-09-20 08:13 dannyniu       Note Edited: 0006879                         
2024-09-23 08:56 geoffclare     Note Added: 0006880                          
2024-09-24 10:46 geoffclare     Note Added: 0006881                          
2024-09-24 10:46 geoffclare     Note Edited: 0006881                         
2024-09-24 11:54 dannyniu       Note Added: 0006882                          
2024-09-24 12:08 dannyniu       Note Edited: 0006882                         
2024-09-24 12:09 dannyniu       Note Edited: 0006882                         
2024-09-24 12:11 dannyniu       Note Edited: 0006882                         
2024-09-24 12:12 dannyniu       Note Edited: 0006882                         
2024-09-24 14:04 geoffclare     Note Added: 0006883                          
2024-09-25 08:28 dannyniu       Note Added: 0006884                          
2024-09-25 08:30 dannyniu       Note Edited: 0006884                         
2024-09-25 08:33 dannyniu       Note Edited: 0006884                         
2024-09-25 08:42 dannyniu       Note Edited: 0006884                         
2024-09-25 08:43 dannyniu       Note Edited: 0006884                         
2024-09-25 11:36 dannyniu       Note Edited: 0006884                         
2024-09-25 13:17 geoffclare     Note Added: 0006885                          
2024-09-25 15:08 dannyniu       Note Added: 0006886                          
2024-09-25 15:17 dannyniu       Note Edited: 0006886                         
2024-09-25 15:23 dannyniu       Note Edited: 0006886                         
2024-09-25 15:27 dannyniu       Note Edited: 0006886                         
2024-09-25 22:10 steffen        Note Added: 0006887                          
2024-09-25 22:33 steffen        Note Added: 0006888                          
2024-09-25 22:36 steffen        Note Added: 0006889                          
2024-09-26 04:02 dannyniu       Note Edited: 0006886                         
2024-09-26 06:50 dannyniu       Note Edited: 0006886                         
2024-09-26 08:41 geoffclare     Note Added: 0006890                          
2024-09-26 11:43 dannyniu       Note Added: 0006891                          
2024-09-26 11:50 dannyniu       Note Edited: 0006891                         
2024-09-26 12:16 geoffclare     Note Added: 0006892                          
2024-09-26 12:17 geoffclare     Note Edited: 0006892                         
2024-09-26 13:27 geoffclare     Note Edited: 0006881                         
2024-09-26 13:28 geoffclare     Note Edited: 0006881                         
2024-09-26 13:30 geoffclare     Note Edited: 0006892                         
2024-09-27 07:09 geoffclare     Note Edited: 0006885                         
2024-09-27 11:34 dannyniu       Note Added: 0006896                          
2024-09-27 11:37 dannyniu       Note Edited: 0006896                         
2024-09-27 15:51 steffen        Note Added: 0006897                          
2024-09-30 09:26 geoffclare     Note Added: 0006898                          
2024-10-02 02:07 dannyniu       Note Added: 0006899                          
2024-10-03 09:12 geoffclare     Note Added: 0006900                          
2024-10-03 09:14 geoffclare     Note Edited: 0006900                         
2024-10-03 09:15 geoffclare     Note Edited: 0006900                         
2024-10-17 15:24 geoffclare     Note Added: 0006919                          
2024-10-17 15:25 geoffclare     Interp Status             => Pending         
2024-10-17 15:25 geoffclare     Final Accepted Text       =>
https://www.austingroupbugs.net/view.php?id=1857#c6919    
2024-10-17 15:25 geoffclare     Status                   New => Interpretation
Required
2024-10-17 15:25 geoffclare     Resolution               Open => Accepted As
Marked
2024-10-17 15:26 geoffclare     Tag Attached: tc1-2024                       
2024-10-17 16:22 agadmin        Interp Status            Pending => Proposed 
2024-10-17 16:22 agadmin        Note Added: 0006920                          
2024-11-19 11:53 agadmin        Interp Status            Proposed => Approved
2024-11-19 11:53 agadmin        Note Added: 0006963                          
2024-11-19 12:11 geoffclare     Relationship added       related to 0001877  
2024-12-01 12:43 dannyniu       Note Added: 0006979                          
2024-12-03 15:11 geoffclare     Note Added: 0006982                          
2024-12-25 14:40 dannyniu       Note Added: 0007033                          
2025-02-27 05:18 dannyniu       Note Added: 0007087                          
2025-03-04 14:56 dannyniu       Note Added: 0007090                          
2025-03-05 11:43 dannyniu       Note Added: 0007091                          
2025-03-06 14:26 geoffclare     Note Added: 0007094                          
2025-03-20 15:57 geoffclare     Note Added: 0007130                          
======================================================================

