The following issue has been SUBMITTED. 
====================================================================== 
https://www.austingroupbugs.net/view.php?id=1857 
====================================================================== 
Reported By:                dannyniu
Assigned To:                
====================================================================== 
Project:                    1003.1(2024)/Issue8
Issue ID:                   1857
Category:                   Base Definitions and Headers
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     New
Name:                       DannyNiu/NJF 
Organization:               Individual 
User Reference:              
Section:                    9.1 Regular Expression Definitions # and others. 
Page Number:                179-180 and others 
Line Number:                6366-6368 and others. 
Interp Status:              --- 
Final Accepted Text:         
====================================================================== 
Date Submitted:             2024-09-14 12:54 UTC
Last Modified:              2024-09-14 12:54 UTC
====================================================================== 
Summary:                    Several problems with the new "lazy" regex
quantifier.
Description: 
1. Newly added text describes a "shortest" match with the word "longest":
====

lines 6366-6368 on page 180:

> However, matching the ERE "(.*?).*" against "abcdef", the subpattern
"(.*?)" matches the empty string, since that is the **longest** possible
match for the ERE ".*?".

The "?" qualifier modifies a quantifier to make it "lazy", so the word
"longest" obscures what the standard's authors intended. Should it be
"shortest" instead?
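
As an illustration of the quoted example, here is a minimal sketch using
Python's `re` module. This is an assumption made for demonstration only:
Python's engine is a backtracking matcher, not a POSIX leftmost-longest
one, but for this particular pattern it gives the result the standard
describes. The variable names are illustrative.

```python
import re

# Lazy first subpattern: "(.*?)" takes as little as possible,
# so the trailing ".*" consumes the whole string.
lazy_match = re.fullmatch(r"(.*?).*", "abcdef")
lazy_group = lazy_match.group(1)

# Greedy first subpattern for comparison: "(.*)" takes everything.
greedy_match = re.fullmatch(r"(.*).*", "abcdef")
greedy_group = greedy_match.group(1)

print(repr(lazy_group))    # ''
print(repr(greedy_group))  # 'abcdef'
```

The empty string captured by the lazy subpattern is the shortest possible
match for ".*?", not the longest.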

2. The supposed length of subpatterns.
====

Lines 6362-6363 on page 180:

> Consistent with the whole match being the longest of the leftmost
matches, each subpattern, from left to right, shall match the longest
possible string

This is fine for longest matches without the "lazy" qualifier. In a
conceptual implementation of ERE, a backtracking recursive-descent
matcher greedily matches each subpattern from left to right, so that
each is as long as possible before the final match is settled.

For each new match that is longer than the previous one, the right-most
subpatterns are contracted first, before those to their left. Thus, after
all iterations, the result naturally conforms to the requirement laid out
for subpatterns.
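
The conceptual matcher described above might be sketched as follows. This
is a simplified model, not an actual ERE implementation: it assumes the
pattern is a plain concatenation of subpatterns, that the whole text must
be consumed, and it uses Python's `re.fullmatch` as a black box to test
each piece. The function name `split_leftmost_longest` is hypothetical.

```python
import re

def split_leftmost_longest(parts, text):
    """Split `text` among `parts`, preferring longer pieces on the left.

    Each left-hand subpattern is tried at its longest first; the pieces
    to its right are adjusted before the left-hand piece is shortened.
    Returns a list of pieces, or None if no split matches.
    """
    if not parts:
        return [] if text == "" else None
    head, rest = parts[0], parts[1:]
    for end in range(len(text), -1, -1):  # longest candidate first
        if re.fullmatch("(?:" + head + ")", text[:end]):
            tail = split_leftmost_longest(rest, text[end:])
            if tail is not None:
                return [text[:end]] + tail
    return None

greedy_pieces = split_leftmost_longest([".*", ".*"], "abcdef")
alt_pieces = split_leftmost_longest(["wee|week", "knights|night"], "weeknights")
print(greedy_pieces)  # ['abcdef', '']
print(alt_pieces)     # ['wee', 'knights']
```

In the second call the first subpattern cannot keep "week", because no
alternative of the second subpattern matches "nights", so it falls back to
"wee".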

However, when the "lazy" qualifier is applied to a component of a
(parenthesized) subpattern, and no "greedy" quantifier is applied to any
other component of that subpattern, the subpattern can only be
"shortest". Therefore, the rules on the matched lengths of subpatterns
containing "lazy" qualifiers need to be updated.

Next, when `REG_MINIMAL` is applied to the whole regex, quantifiers
become "lazy" by default; therefore, absent any qualifier, the subpatterns
matched can only be "shortest". This reiterates the need to update the
rules for matching subpatterns when "lazy" qualifiers/specifiers are used.
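
To make the `REG_MINIMAL` case concrete, here is a rough sketch in Python.
Python's `re` module has no equivalent of `REG_MINIMAL`, so the flag is
approximated here (an assumption) by writing every quantifier in its lazy
form by hand. The engine is also a backtracking one rather than a POSIX
matcher, so this only illustrates the expected subpattern lengths.

```python
import re

text = "aabbcc"

# Default (greedy) quantifiers: each subpattern takes as much as it can.
greedy_groups = re.fullmatch(r"(a*)(b*)(.*)", text).groups()

# Every quantifier written lazily, standing in for REG_MINIMAL:
# the left-hand subpatterns shrink to the shortest possible match and
# the last one absorbs whatever is needed to complete the match.
lazy_groups = re.fullmatch(r"(a*?)(b*?)(.*?)", text).groups()

print(greedy_groups)  # ('aa', 'bb', 'cc')
print(lazy_groups)    # ('', '', 'aabbcc')
```

Neither result for the lazy pattern fits a rule that says each subpattern
"shall match the longest possible string".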
Desired Action: 
Various. The rules need to be worked out carefully, and I have no
definitive desired action at this moment. Also, this issue needs to be
related to #793 and #1329, unless it's against procedure to relate new
bugs to closed ones.
====================================================================== 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2024-09-14 12:54 dannyniu       New Issue                                    
2024-09-14 12:54 dannyniu       Name                      => DannyNiu/NJF    
2024-09-14 12:54 dannyniu       Organization              => Individual      
2024-09-14 12:54 dannyniu       Section                   => 9.1 Regular
Expression Definitions # and others.
2024-09-14 12:54 dannyniu       Page Number               => 179-180 and others
2024-09-14 12:54 dannyniu       Line Number               => 6366-6368 and
others.
======================================================================

