A NOTE has been added to this issue. 
====================================================================== 
https://www.austingroupbugs.net/view.php?id=1941 
====================================================================== 
Reported By:                dwheeler
Assigned To:                ajosey
====================================================================== 
Project:                    1003.1(2024)/Issue8
Issue ID:                   1941
Category:                   Shell and Utilities
Type:                       Enhancement Request
Severity:                   Objection
Priority:                   normal
Status:                     Under Review
Name:                       David A. Wheeler 
Organization:                
User Reference:              
Section:                    grep 
Page Number:                1 
Line Number:                1 
Interp Status:              --- 
Final Accepted Text:         
====================================================================== 
Date Submitted:             2025-08-30 21:51 UTC
Last Modified:              2025-09-12 16:28 UTC
====================================================================== 
Summary:                    Add widely-implemented options to grep
====================================================================== 

---------------------------------------------------------------------- 
 (0007258) stephane (reporter) - 2025-09-12 16:28
 https://www.austingroupbugs.net/view.php?id=1941#c7258 
---------------------------------------------------------------------- 
> All implement a "whole word" match with -w. However, that
> raises complications on defining word boundaries, especially
> since POSIX doesn't define the underlying construct. This may
> be quite doable, but since that discussion is complicated,
> maybe that's for another day.

Actually POSIX does already specify the \< and \> regexp
operators for the ex utility:
https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/utilities/ex.html#tag_20_40_13_58

> \<
>     Match the beginning of a word. (See the definition of word
>     at the beginning of Command Descriptions in ex.)
> \>
>     Match the end of a word.

That's the wrong reference, btw; it looks like it should be a
reference to "Input Editing in ex" (I'll raise a bug about that):

> word
> 
>   In the POSIX locale, a word consists of a maximal sequence of
>   letters, digits, and underscores, delimited at both ends by
>   characters other than letters, digits, or underscores, or by
>   the beginning or end of a line or the edit buffer.
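
To make that definition concrete, here is a minimal sketch in Python
(used only as a neutral stand-in; word_spans is a name made up for this
note, and ASCII is assumed for the POSIX locale). It lists the words of
a line with the offsets where \< and \> would match:

```python
import re

# POSIX-locale word: a maximal run of ASCII letters, digits and underscores.
WORD = re.compile(r"[A-Za-z0-9_]+")

def word_spans(line):
    # For each word: (offset where \< matches, offset where \> matches, word)
    return [(m.start(), m.end(), m.group()) for m in WORD.finditer(line)]

print(word_spans("a -b- c"))
# [(0, 1, 'a'), (3, 4, 'b'), (6, 7, 'c')]
```

Note that "-" is never part of a word, so no \< position is immediately
followed by "-" and no \> position is immediately preceded by one.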

And the initial implementation of grep -w (AFAIK from BSD in
the late 70s, ex also being a BSD utility) worked by
adding \<...\> around the regex to match.

https://github.com/dspinellis/unix-history-repo/blob/BSD-2/src/grep.c#L105-L106

That's however not necessarily the best approach and not what
all implementations do these days.

For example, with GNU grep (and its clones):

<pre>
$ echo 'a -b- c' | grep '\<-b-\>'
$ echo 'a -b- c' | grep -we -b-
a -b- c
</pre>

That is, grep -w word behaves more like grep -P '(?<!\w)word(?!\w)',
regardless of whether "word" itself starts and/or ends with \w
or not.

Sounds like a better approach.

<pre>
$ echo 'a--b--c' | grep -we -b-
a--b--c
</pre>

That one may be more debatable.
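
The two behaviours above can be compared with a rough Python sketch.
Python's re has no \< or \>, so they are emulated with lookarounds over
ASCII word characters. wrap_style and lookaround_style are hypothetical
names, and the second is only an approximation of what GNU grep -w
does, not its actual algorithm:

```python
import re

W = "[A-Za-z0-9_]"                    # word character, POSIX locale assumed
BOW = f"(?<!{W})(?={W})"              # stand-in for \<
EOW = f"(?<={W})(?!{W})"              # stand-in for \>

def wrap_style(regex):
    # Historical approach: surround the regex with \< and \>.
    return re.compile(BOW + "(?:" + regex + ")" + EOW)

def lookaround_style(regex):
    # Approximation of GNU grep -w: no word character on either side.
    return re.compile(f"(?<!{W})(?:" + regex + f")(?!{W})")

for line in ("a -b- c", "a--b--c"):
    print(repr(line),
          bool(wrap_style("-b-").search(line)),
          bool(lookaround_style("-b-").search(line)))
# 'a -b- c' False True
# 'a--b--c' False True
```

The wrapped form can never match a pattern that starts or ends with a
non-word character, whereas the lookaround form only constrains the
characters next to the match.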

The fact that there's no agreement in practice between grep
implementations may mean it's best to leave it out for now.

Another issue with \<, \> if they were to be specified is that
we'd likely want to also specify the REG_STARTEND BSD flag for
regexec() and have sed/grep -o use it, or we'd get into issues
such as:

<pre>
$ echo aaa | sed 's/\<a/<a/g'
<a<a<a
$ echo aaa | grep -o '\<a'
a
a
a
</pre>

That's because each "a" ends up being at the start of the subject on
each successive match.
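
A minimal Python sketch of that failure mode, under the assumption that
the implementation restarts matching on the unmatched tail of the line
(\< is again emulated with lookarounds, and both function names are made
up for illustration):

```python
import re

W = "[A-Za-z0-9_]"
BOW_A = re.compile(f"(?<!{W})(?={W})a")   # stand-in for the regex \<a

def sub_restarting(pattern, repl, subject):
    # Each search sees only the unmatched tail, so the matcher believes
    # every tail begins at the start of the subject.
    out, rest = [], subject
    while rest:
        m = pattern.search(rest)
        if m is None or m.end() == 0:     # no match, or empty match: stop
            break
        out += [rest[:m.start()], repl]
        rest = rest[m.end():]
    return "".join(out) + rest

def sub_with_context(pattern, repl, subject):
    # The matcher keeps the whole subject in view between matches,
    # which is what a REG_STARTEND-like facility would make possible.
    return pattern.sub(repl, subject)

print(sub_restarting(BOW_A, "<a", "aaa"))    # <a<a<a
print(sub_with_context(BOW_A, "<a", "aaa"))  # <aaa
```

Only the second result is consistent with the definition of a word,
since "aaa" contains a single beginning-of-word position.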

For the record, and for what it's worth, I otherwise support
your proposal. 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
2025-08-30 21:51 dwheeler       New Issue                                    
2025-08-30 21:51 dwheeler       Status                   New => Under Review 
2025-08-30 21:51 dwheeler       Assigned To               => ajosey          
2025-08-30 21:56 dwheeler       Note Added: 0007240                          
2025-08-30 21:59 dwheeler       Note Added: 0007241                          
2025-08-31 00:07 mirabilos      Note Added: 0007242                          
2025-08-31 00:10 mirabilos      Note Added: 0007243                          
2025-08-31 21:52 dwheeler       Note Added: 0007244                          
2025-08-31 22:01 dwheeler       Note Added: 0007245                          
2025-09-01 05:57 stephane       Note Added: 0007246                          
2025-09-01 06:05 stephane       Note Added: 0007247                          
2025-09-01 15:36 dwheeler       Note Added: 0007249                          
2025-09-01 17:10 dwheeler       Note Added: 0007250                          
2025-09-01 17:18 dwheeler       Note Added: 0007251                          
2025-09-11 15:31 lanodan        Note Added: 0007253                          
2025-09-11 15:36 lanodan        Note Edited: 0007253                         
2025-09-11 15:37 lanodan        Note Edited: 0007253                         
2025-09-11 15:37 lanodan        Note Edited: 0007253                         
2025-09-11 15:50 geoffclare     Project                  1003.1(2008)/Issue 7 => 1003.1(2024)/Issue8
2025-09-11 18:01 dwheeler       Note Added: 0007256                          
2025-09-12 16:28 stephane       Note Added: 0007258                          
======================================================================

