Re: [Issue 8 drafts 0001564]: clariy on what (character/byte) strings pattern matching notation should work

2022-05-19 Thread Harald van Dijk via austin-group-l at The Open Group

On 20/05/2022 01:11, Christoph Anton Mitterer wrote:

On Thu, 2022-05-19 at 09:05 +0100, Harald van Dijk wrote:


The above, AFAIU, means that any shell/fnmatch matches a valid multibyte
character... but also a byte that is not a character in the locale.


Correct, though as I wrote later on, the way they go about it is
different.


And I think, for any real standardisation of this (which I'd still love
to see) quite a few things would need to be reasonably defined,
including but most likely not limited to:
- Does * match bytes (by which I mean 1-n which don't form valid
   characters in the current locale).


This is another one of those that seemed obvious enough to me that I did 
not think to check explicitly. As far as I can tell at a quick glance, * 
matches any number (zero or more) of ?, whatever ? means, except in the 
case of a particular shell bug that also breaks scripts already required 
by the standard to work.



- The same for ? ... and if that matches bytes at all - only 1 or n?


It matches a single character or a single byte that is not part of a 
character.



- In which "direction" is the matching done, which AFAIU would be
   important, e.g.
   \303\244\244
   *if* '?' were to match also bytes, is '?\244' meant to be matching
   the character followed by the byte:
 (\303\244)\244
   or could it non-match the byte followed by more bytes:
 (\303)\244\244


In a UTF-8 locale, \303\244\244 is an invalid character string. As the 
test results have shown, in some of the implementations, that causes 
pattern matching to be done as if under the C locale. In those 
implementations, it does not match ?\244, but it does match ??\244. In 
other implementations, only the final invalid byte \244 is given special 
treatment, in which case the whole string does match ?\244.
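A minimal sketch of how the two candidate patterns can be tried against that string. It pins the locale to C, where every byte is its own character, so it only shows the byte-wise outcome described for implementations that fall back to C-locale matching. The helper name try_match is made up for this illustration; running the same lines in a UTF-8 locale instead is what exposes the differences between shells.

```shell
# Hypothetical helper: report whether string $1 matches pattern $2.
# The unquoted $2 is deliberately left unquoted so it acts as a pattern.
try_match() {
    case $1 in
        $2) echo match ;;
        *)  echo "no match" ;;
    esac
}

# Assumption: force byte-wise matching by pinning the locale to C.
LC_ALL=C
export LC_ALL

str=$(printf '\303\244\244')   # UTF-8 a-umlaut followed by a stray \244 byte
one=$(printf '?\244')          # one "?" followed by the byte \244
two=$(printf '??\244')         # two "?" followed by the byte \244

try_match "$str" "$one"        # three bytes against a two-item pattern
try_match "$str" "$two"        # three bytes against a three-item pattern
```

In the C locale the first call reports "no match" and the second "match", which is the ?\244 versus ??\244 split described above.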



- And I guess these questions would also pop up for the ##, #, %% and %
   forms of parameter expansions, especially when one has a locale like
   Big-5.
   In the sense of, can one strip off a character (or byte) that forms
   part of another character.


This does pop up there too but the questions are not new, they are the 
same questions that already pop up for regular pattern matching. 
${var#pat} strips a leading pat of $var if and only if $var matches 
pat*. ${var%pat} strips a trailing pat of $var if and only if $var 
matches *pat. That said, in those cases where shells disagree over 
whether $var matches pat* / *pat, that is those cases where I would 
propose making the result unspecified, the results may also be 
inconsistent with the same shell's pattern matching in other contexts.
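The relationship between ${var%pat} and matching against *pat can be sketched with plain ASCII input, where all shells agree. The function name strip_suffix is hypothetical and exists only for this illustration; it spells out the "if and only if" by testing the match first and stripping second.

```shell
# Hypothetical helper: strip trailing pattern $2 from $1 the long way,
# by first testing whether $1 matches *$2.
strip_suffix() {
    case $1 in
        *$2) printf '%s\n' "${1%$2}" ;;   # matches *pat: remove the suffix
        *)   printf '%s\n' "$1" ;;        # no match: value is unchanged
    esac
}

strip_suffix archive.tar.gz '.gz'     # prints archive.tar
strip_suffix archive.tar.gz '.g?'     # prints archive.tar
strip_suffix archive.tar.gz '.zip'    # prints archive.tar.gz
```

For well-formed input the case test is redundant, since ${1%$2} alone leaves the value unchanged when nothing matches. The two only diverge in the disputed cases discussed here.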



   If shells were required not to decompose such valid characters (that
   contain another valid character, when looked at from right to
   left), then it would also need to be defined how the strings needed
   to be interpreted (most likely of course: as defined by the
   respective char encoding).


This is where the example with β comes in. The current standard, as far 
as I can tell, *already* requires


  var=β
  echo ${var%]}
  case $var in
  *]) echo match
  esac

to print "β", and not print "match", regardless of how that β is 
encoded. There are no invalid bytes here. This can only be done by 
processing the string left-to-right.
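For contrast, a sketch of the same script with the locale pinned to C. It assumes the Big5 encoding of β used in this thread, the byte \243 followed by ]. In the C locale those are two separate single-byte characters, so the byte-wise result is the opposite of what is required in a Big5 locale. The variable names stripped and result are introduced here for illustration only.

```shell
# Assumption: byte-wise matching, forced by pinning the locale to C.
LC_ALL=C
export LC_ALL

var=$(printf '\243]')          # the two bytes assumed to encode β in Big5
stripped=${var%]}              # C locale: the trailing ] byte is removed
printf '%s' "$stripped" | od -An -to1   # dump what is left, in octal

case $var in
    *]) result=match ;;        # C locale: the last byte is a literal ]
    *)  result="no match" ;;
esac
echo "$result"
```

A shell that decodes the string left-to-right in a Big5 locale would instead leave var untouched and take the second branch.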



   So for all these cases it might additionally be required to check how
   the different shells behave when trying to ##, #, %%, % ...
   And AFAIU, some actually allow to "decompose" a character.


Yes, this is expected and consistent with regular pattern matching.


   And even if the standard were to say that it must check whether the
   matched part is part of a bigger multibyte character (like in the
   BIG5 case) and then not allow decomposing it, would it still
   be allowed to do so when the pattern contains bytes that are
   themselves not valid characters?


Yes, they should be allowed to do so. As we have seen, bash and GNU 
fnmatch() simply fall back to single-byte-character-set matching if the 
string or pattern is not valid in the current locale, and what you 
describe would be the natural result of that.



- Are there any undesired side effects? Bash, for example, has the
   nocasematch shell option... which IIRC affects patterns... would we
   break anything in such fields?


How could we? What bash does if a non-standard shell option is set is 
not covered by POSIX, nor should it be.



- I think it already is defined (more or less) which locale is actually
   used for the matching, i.e. the current one as set by LC_CTYPE and
   not e.g. the "lexical" locale defined on the start of the shell.


Agreed.


I tested this now. In that same list of shells, and in glibc fnmatch(),
? only matches a single invalid byte. Tested in a UTF-8 locale with the
string \200\200 and the patterns ? and ??. With ?, they do not match.
With ??, they do.


The next question that would come to my mind:
Do these tests really give us a definite answer on the behaviour... or
may some things be 

Re: [Issue 8 drafts 0001564]: clariy on what (character/byte) strings pattern matching notation should work

2022-05-19 Thread Christoph Anton Mitterer via austin-group-l at The Open Group
On Thu, 2022-05-19 at 09:05 +0100, Harald van Dijk wrote:
> > The above, AFAIU, means that any shell/fnmatch matches a valid
> > multibyte character... but also a byte that is not a character in
> > the locale.
> 
> Correct, though as I wrote later on, the way they go about it is
> different.

And I think, for any real standardisation of this (which I'd still love
to see) quite a few things would need to be reasonably defined,
including but most likely not limited to:
- Does * match bytes (by which I mean 1-n which don't form valid
  characters in the current locale).

- The same for ? ... and if that matches bytes at all - only 1 or n?

- In which "direction" is the matching done, which AFAIU would be
  important, e.g.
  \303\244\244
  *if* '?' were to match also bytes, is '?\244' meant to be matching
  the character followed by the byte:
(\303\244)\244
  or could it non-match the byte followed by more bytes:
(\303)\244\244

- And I guess these questions would also pop up for the ##, #, %% and %
  forms of parameter expansions, especially when one has a locale like
  Big-5.
  In the sense of, can one strip off a character (or byte) that forms
  part of another character.
  If shells were required not to decompose such valid characters (that
  contain another valid character, when looked at from right to
  left), then it would also need to be defined how the strings needed
  to be interpreted (most likely of course: as defined by the
  respective char encoding).

  So for all these cases it might additionally be required to check how
  the different shells behave when trying to ##, #, %%, % ...
  And AFAIU, some actually allow to "decompose" a character.

  And even if the standard were to say that it must check whether the
  matched part is part of a bigger multibyte character (like in the
  BIG5 case) and then not allow decomposing it, would it still
  be allowed to do so when the pattern contains bytes that are
  themselves not valid characters?

- Are there any undesired side effects? Bash, for example, has the
  nocasematch shell option... which IIRC affects patterns... would we
  break anything in such fields?

- I think it already is defined (more or less) which locale is actually
  used for the matching, i.e. the current one as set by LC_CTYPE and
  not e.g. the "lexical" locale defined on the start of the shell. 



> I tested this now. In that same list of shells, and in glibc
> fnmatch(), ? only matches a single invalid byte. Tested in a UTF-8
> locale with the string \200\200 and the patterns ? and ??. With ?,
> they do not match. With ??, they do.

The next question that would come to my mind:
Do these tests really give us a definite answer on the behaviour... or
may some things be dependent on the specific locale? Maybe the above
behaviour is *only* with UTF-8?
Or can this be ruled out?

> > So unlike before, in the above bash/fnmatch do seem to let '?'
> > match a single byte that is not a character... and the remaining
> > ones have quite mixed feelings
> Not quite: all of them always let ? match a single invalid byte, but
> here we have a single byte that is invalid on its own, valid as part
> of a character, and appears in the string as part of that character.
> When processing \303\244, most shells don't process this as the
> single byte \303 followed by the single byte \244, they preprocess
> this so that by the time they actually check whether it matches, they
> just see the character U+00E4, so that if a pattern looks for \303 on
> its own, it will not be found.

Hmm... seems a bit strange to me... I mean above you had:

string  pattern
\303.\303\244   ?.?

And e.g. bash didn't match.. my assumption was, because the first \303
is not a character.

But later on you had:
\303\244        \303*
\303\244        \303?
which bash *did* match.

Sure, the 2 bytes together are already one character, but bash had to
match the single \303 plus * or ? ... and if above ? didn't match the
single invalid \303 it did match the single \244 here (which ain't a
character either).

Now even if one says it's the direction,... there was also:
\303\244.\303   ?.?
with no match in bash... the first ? should be okay, because it's a
char,... the 2nd one would be the lone \303 byte.

> > Seems also a bit strange to me,... all shells match \243 against ?
> > ... i.e. ? matches a single byte that is not a character... but
> > later on it doesn't work again with \243] and ?]
>
> Remember that \243] is a single character β. \243] is not supposed to
> match when given a pattern ?]. The pattern ?] means "any character,
> followed by ]". "β" is a character not followed by ]. This is similar
> to how in UTF-8 environments, ä should not match against the pattern
> ?? even though both of the bytes that make up ä individually do match
> against the pattern ?.

Okay but isn't that then the case where the matching 

Re: [Issue 8 drafts 0001564]: clariy on what (character/byte) strings pattern matching notation should work

2022-05-19 Thread Harald van Dijk via austin-group-l at The Open Group
On 15/05/2022 16:14, Harald van Dijk via austin-group-l at The Open 
Group wrote:
On 19/04/2022 01:52, Harald van Dijk via austin-group-l at The Open 
Group wrote:

On 15/04/2022 04:57, Christoph Anton Mitterer wrote:

On Fri, 2022-04-15 at 00:44 +0100, Harald van Dijk via austin-group-l
at The Open Group wrote:

If there is interest in getting this standardised, I can spend some
more
time on creating some hopefully comprehensive tests for this to
confirm
in what cases shells agree and disagree, and use that as a basis for
proposing wording to cover it.


I'd love to see that and if you'd actually do so, I'd kindly ask
Geoff to defer any changes in the ticket #1564 of mine, until it can be
said whether it might be possible to get that standardised.


Very well, I will post tests and test results as soon I can make the 
time for it.


Please see the tests and results here. Apologies for the HTML mail but 
this is hard to make readable in plain text.


String         Pattern            dash, busybox ash,  glibc     bash      bosh      gwsh      ksh       zsh
                                  mksh, posh, pdksh   fnmatch
\303\244       [\303\244]         no match            match     match     match     match     match     match
\303\244       ?                  no match            match     match     match     match     match     match
\303           [\303]             match               match     match     match     match     match     match
\303           ?                  match               match     match     match     match     match     match
\303.\303\244  [\303].[\303\244]  no match            no match  no match  match     match     match     match
\303.\303\244  ?.?                no match            no match  no match  match     match     match     match
\303\303\244   [\303][\303\244]   no match            no match  no match  match     match     match     match
\303\303\244   ??                 no match            no match  no match  match     match     match     match
\303\244.\303  [\303\244].[\303]  no match            no match  no match  match     match     match     match
\303\244.\303  ?.?                no match            no match  no match  match     match     match     match
\303\244\303   [\303\244][\303]   no match            no match  no match  match     match     match     match
\303\244\303   ??                 no match            no match  no match  match     match     match     match
\303\244       \303*              match               match     match     match     no match  match     no match
\303\244       \303?              match               match     match     no match  no match  match     no match
\303\244       [\303]*            match               match     match     match     no match  match     no match
\303\244       [\303]?            match               match     match     no match  no match  match     no match
\303\244       *\204              match               match     match     no match  no match  no match  match
\303\244       ?\204              match               match     match     no match  no match  no match  no match
\303\244       *[\204]            match               match     match     no match  no match  no match  no match
\303\244       ?[\204]            match               match     match     no match  no match  no match  no match
\243]          [\243]]            match               match     match     match     match     match     match
\243]          ?                  no match            match     match     match     match     match     match
\243           ?                  match               match     match     match     match     match     match
\243           [\243]             match               match     match     match     no match  no match  error
\243           [\243!]            match               match     match     match     match     match     match
\243]          [\243!]]           match               match     no match  no match  no match  match     no match
\243]          ?]                 match               match     no match  no match  no match  no match  no match
\243]          *]                 match               match     no match  no match  no match  no match  match
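A row of the table above can be re-run with a small harness along the following lines. This is only a sketch, not the script used to produce the results: the function name row and the use of case for the match are assumptions made here, and the locale is pinned to C so that the outcome is predictable. Under that assumption it should line up with the first results column, the shells that ignore the locale.

```shell
# Hypothetical harness: $1 and $2 are printf-style escapes for the
# string and the pattern, written the same way as in the table.
row() {
    s=$(printf "$1")
    p=$(printf "$2")
    case $s in
        $p) r=match ;;             # unquoted $p is used as a pattern
        *)  r="no match" ;;
    esac
    printf '%s\t%s\t%s\n' "$1" "$2" "$r"
}

# Assumption: C locale, i.e. every byte is a character of its own.
LC_ALL=C
export LC_ALL

row '\303\244' '?'
row '\303'     '?'
row '\303\244' '\303?'
```

To probe the other columns, the same calls would be run in each shell with LC_ALL set to a UTF-8 or Big5 locale that is actually installed on the system.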

The tests 

Re: tv_nsec

2022-05-19 Thread Fred J. Tydeman via austin-group-l at The Open Group
On Wed, 18 May 2022 09:30:51 +0100 Geoff Clare via austin-group-l at The Open 
Group wrote:
>
>Fred J. Tydeman wrote, on 17 May 2022:
>>
>> The 202x version I have, in , shows tv_nsec tagged as CX.
>> tv_nsec was added to C11, so is not an extension to the C standard.
>
>The current draft (2.1) is from before the changes to align with C17
>were applied.
>
>The relevant change to  can be seen on page 26 lines 928-935
>of C17_alignment_20211019.pdf which is the bug 1302 attachment
>referenced in the "Final Accepted Text" field of that bug.
>
>https://austingroupbugs.net/view.php?id=1302
>

Would it be OK if the C committee changed the type of tv_nsec from 'long'
to an implementation defined type?  That way, an implementation could
make it be a 32-bit int, instead of a 64-bit long.

Or, would that change cause problems for existing applications?

The paper:
  https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2878.pdf
is being discussed at this week's WG14 meeting.


---
Fred J. Tydeman        Tydeman Consulting
tyde...@tybor.com      Testing, numerics, programming
+1 (702) 608-6093      Vice-chair of PL22.11 (ANSI "C")
Sample C99+FPCE tests: http://www.tybor.com
Savers sleep well, investors eat well, spenders work forever.



[1003.1(2016/18)/Issue7+TC2 0001546]: BREs: reserve \? \+ and \|

2022-05-19 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED.
======================================================================
https://austingroupbugs.net/view.php?id=1546
======================================================================
Reported By:                calestyo
Assigned To:
======================================================================
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1546
Category:                   Base Definitions and Headers
Type:                       Enhancement Request
Severity:                   Editorial
Priority:                   normal
Status:                     Applied
Name:                       Christoph Anton Mitterer
Organization:
User Reference:
Section:                    9.3 Basic Regular Expressions
Page Number:                N/A
Line Number:                N/A
Interp Status:              ---
Final Accepted Text:        https://austingroupbugs.net/view.php?id=1546#c5755
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2022-01-08 03:48 UTC
Last Modified:              2022-05-19 09:15 UTC
======================================================================
Summary:                    BREs: reserve \? \+ and \|
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
has duplicate       773     Summary: Add \+, \?, and \| to Basic Re...
======================================================================

Issue History
Date Modified    Username         Field                   Change
======================================================================
2022-01-08 03:48 calestyo         New Issue
2022-01-08 03:48 calestyo         Name                    => Christoph Anton Mitterer
2022-01-08 03:48 calestyo         Section                 => 9.3 Basic Regular Expressions
2022-01-08 03:48 calestyo         Page Number             => N/A
2022-01-08 03:48 calestyo         Line Number             => N/A
2022-01-28 11:21 mirabilos        Note Added: 0005636
2022-01-28 23:10 calestyo         Note Added: 0005639
2022-03-03 06:56 Don Cragun       Note Added: 0005731
2022-03-03 07:03 Don Cragun       Note Edited: 0005731
2022-03-10 17:09 geoffclare       Note Added: 0005738
2022-03-10 17:09 geoffclare       Interp Status           => ---
2022-03-10 17:09 geoffclare       Final Accepted Text     => https://austingroupbugs.net/view.php?id=1546#c5738
2022-03-10 17:09 geoffclare       Status                  New => Resolved
2022-03-10 17:09 geoffclare       Resolution              Open => Accepted As Marked
2022-03-10 17:10 geoffclare       Tag Attached: issue8
2022-03-10 17:44 calestyo         Note Added: 0005740
2022-03-12 21:01 Don Cragun       Relationship added      related to 773
2022-03-12 21:12 calestyo         Note Added: 0005744
2022-03-12 21:13 calestyo         Note Edited: 0005744
2022-03-14 10:08 geoffclare       Note Added: 0005748
2022-03-14 10:08 geoffclare       Status                  Resolved => Under Review
2022-03-14 10:08 geoffclare       Resolution              Accepted As Marked => Reopened
2022-03-14 13:50 calestyo         Note Added: 0005749
2022-03-14 14:31 geoffclare       Note Added: 0005750
2022-03-18 09:32 geoffclare       Note Added: 0005755
2022-03-24 15:29 geoffclare       Final Accepted Text     https://austingroupbugs.net/view.php?id=1546#c5738 =>
                                                          https://austingroupbugs.net/view.php?id=1546#c5755
2022-03-24 15:29 geoffclare       Status                  Under Review => Resolved
2022-03-24 15:29 geoffclare       Resolution              Reopened => Accepted As Marked
2022-03-24 15:34 geoffclare       Relationship replaced   has duplicate 773
2022-03-25 22:41 calestyo         Note Added: 0005765
2022-03-25 22:57 calestyo         Note Added: 0005766
2022-03-31 15:06 geoffclare       Note Edited: 0005755
2022-03-31 15:07 geoffclare       Note Edited: 0005755
2022-03-31 15:08 geoffclare       Note Added: 0005768
2022-04-01 22:22 calestyo         Note Added: 0005770
2022-04-02 08:17 kre              Note Added: 0005773
2022-04-02 08:19 kre              Note Edited: 0005773
2022-04-02 08:22 kre 

[1003.1(2016/18)/Issue7+TC2 0001544]: uudecode: standardise or at least reserve - as another special symbol for decoding to stdout

2022-05-19 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED.
======================================================================
https://austingroupbugs.net/view.php?id=1544
======================================================================
Reported By:                calestyo
Assigned To:
======================================================================
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1544
Category:                   Shell and Utilities
Type:                       Clarification Requested
Severity:                   Editorial
Priority:                   normal
Status:                     Applied
Name:                       Christoph Anton Mitterer
Organization:
User Reference:
Section:                    uudecode
Page Number:                3357
Line Number:                113061-113063
Interp Status:              ---
Final Accepted Text:        https://austingroupbugs.net/view.php?id=1544#c5799
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2022-01-08 03:21 UTC
Last Modified:              2022-05-19 09:05 UTC
======================================================================
Summary:                    uudecode: standardise or at least reserve - as
                            another special symbol for decoding to stdout
======================================================================

Issue History
Date Modified    Username         Field                   Change
======================================================================
2022-01-08 03:21 calestyo         New Issue
2022-01-08 03:21 calestyo         Name                    => Christoph Anton Mitterer
2022-01-08 03:21 calestyo         Section                 => uudecode
2022-01-08 03:21 calestyo         Page Number             => N/A
2022-01-08 03:21 calestyo         Line Number             => N/A
2022-01-10 10:08 geoffclare       Note Added: 0005584
2022-01-10 15:16 calestyo         Note Added: 0005586
2022-01-11 00:19 alanc            Note Added: 0005592
2022-02-24 09:24 Don Cragun       Page Number             N/A => 3357
2022-02-24 09:24 Don Cragun       Line Number             N/A => 113061-113063
2022-02-24 09:24 Don Cragun       Interp Status           => ---
2022-02-24 15:51 shware_systems   Note Added: 0005707
2022-02-24 16:01 calestyo         Note Added: 0005708
2022-02-24 16:26 geoffclare       Note Added: 0005709
2022-02-24 16:39 calestyo         Note Added: 0005710
2022-02-24 17:33 geoffclare       Note Added: 0005711
2022-02-24 17:35 geoffclare       Final Accepted Text     => https://austingroupbugs.net/view.php?id=1544#c5711
2022-02-24 17:35 geoffclare       Status                  New => Resolution Proposed
2022-02-24 21:34 calestyo         Note Added: 0005712
2022-02-24 23:23 kre              Note Added: 0005713
2022-02-25 03:08 calestyo         Note Edited: 0005712
2022-02-25 03:13 calestyo         Note Added: 0005715
2022-02-25 10:07 geoffclare       Note Added: 0005717
2022-02-25 10:08 geoffclare       Note Edited: 0005717
2022-02-25 15:46 kre              Note Added: 0005718
2022-03-01 11:45 geoffclare       Note Added: 0005722
2022-03-02 09:49 quinq            Note Added: 0005724
2022-03-02 12:50 steffen          Note Added: 0005725
2022-03-03 02:51 calestyo         Note Added: 0005726
2022-03-03 02:59 calestyo         Note Edited: 0005726
2022-03-03 03:12 calestyo         Note Added: 0005727
2022-03-03 20:35 eblake           Note Added: 0005732
2022-03-03 20:36 eblake           Note Edited: 0005732
2022-03-03 20:36 eblake           Note Edited: 0005732
2022-03-05 03:18 calestyo         Note Added: 0005734
2022-03-05 03:19 calestyo         Note Edited: 0005727
2022-04-01 23:03 calestyo         Note Edited: 0005727
2022-04-14 15:47 geoffclare       Note Added: 0005799
2022-04-14 15:48 geoffclare       Note Edited: 0005799
2022-04-14 15:48 geoffclare       Final Accepted Text     https://austingroupbugs.net/view.php?id=1544#c5711 =>
                                                          https://austingroupbugs.net/view.php?id=1544#c5799
2022-04-14 15:48 geoffclare 

[1003.1(2013)/Issue7+TC1 0000729]: Integrate posix_devctl() from standalone IEEE Std 1003.26 into the next revision of IEEE Std 1003.1

2022-05-19 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED.
======================================================================
https://austingroupbugs.net/view.php?id=729
======================================================================
Reported By:                Don Cragun
Assigned To:
======================================================================
Project:                    1003.1(2013)/Issue7+TC1
Issue ID:                   729
Category:                   System Interfaces
Type:                       Enhancement Request
Severity:                   Objection
Priority:                   normal
Status:                     Applied
Name:                       Don Cragun
Organization:               IEEE PASC
User Reference:             Integrate 1003.26 into Issue 8
Section:                    posix_devctl()
Page Number:                -
Line Number:                -
Interp Status:              ---
Final Accepted Text:        https://austingroupbugs.net/view.php?id=729#c5723
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2013-08-08 07:20 UTC
Last Modified:              2022-05-19 09:01 UTC
======================================================================
Summary:                    Integrate posix_devctl() from standalone IEEE Std
                            1003.26 into the next revision of IEEE Std 1003.1
======================================================================

Issue History
Date Modified    Username         Field                   Change
======================================================================
2013-08-08 07:20 Don Cragun       New Issue
2013-08-08 07:20 Don Cragun       Name                    => Don Cragun
2013-08-08 07:20 Don Cragun       Organization            => IEEE PASC
2013-08-08 07:20 Don Cragun       User Reference          => Integrate 1003.26 into Issue 8
2013-08-08 07:20 Don Cragun       Section                 => posix_devctl()
2013-08-08 07:20 Don Cragun       Page Number             => -
2013-08-08 07:20 Don Cragun       Line Number             => -
2013-08-08 07:20 Don Cragun       Interp Status           => ---
2013-08-08 07:46 Don Cragun       Tag Attached: issue8
2022-03-01 15:08 geoffclare       Note Added: 0005723
2022-03-17 15:47 geoffclare       Note Edited: 0005723
2022-03-17 15:49 geoffclare       Final Accepted Text     => https://austingroupbugs.net/view.php?id=729#c5723
2022-03-17 15:49 geoffclare       Status                  New => Resolved
2022-03-17 15:49 geoffclare       Resolution              Open => Accepted As Marked
2022-05-19 09:01 geoffclare       Status                  Resolved => Applied
======================================================================




[1003.1(2016/18)/Issue7+TC2 0001554]: find's rationale has odd comment about -name pattern matching

2022-05-19 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED.
======================================================================
https://austingroupbugs.net/view.php?id=1554
======================================================================
Reported By:                andras_farkas
Assigned To:
======================================================================
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1554
Category:                   Shell and Utilities
Type:                       Clarification Requested
Severity:                   Editorial
Priority:                   normal
Status:                     Applied
Name:                       Andras Farkas
Organization:
User Reference:
Section:                    find
Page Number:
Line Number:
Interp Status:              ---
Final Accepted Text:        See https://austingroupbugs.net/view.php?id=1554#c5760.
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2022-01-17 03:15 UTC
Last Modified:              2022-05-19 08:42 UTC
======================================================================
Summary:                    find's rationale has odd comment about -name pattern
                            matching
======================================================================

Issue History
Date Modified    Username         Field                   Change
======================================================================
2022-01-17 03:15 andras_farkas    New Issue
2022-01-17 03:15 andras_farkas    Name                    => Andras Farkas
2022-01-17 03:15 andras_farkas    Section                 => find
2022-03-24 16:18 Don Cragun       Note Added: 0005760
2022-03-24 16:19 Don Cragun       Interp Status           => ---
2022-03-24 16:19 Don Cragun       Final Accepted Text     => See https://austingroupbugs.net/view.php?id=1554#c5760.
2022-03-24 16:19 Don Cragun       Status                  New => Resolved
2022-03-24 16:19 Don Cragun       Resolution              Open => Accepted As Marked
2022-03-24 16:20 Don Cragun       Tag Attached: tc3-2008
2022-05-19 08:42 geoffclare       Status                  Resolved => Applied
======================================================================




[1003.1(2016/18)/Issue7+TC2 0001553]: find has two references to -n option that should be about -name

2022-05-19 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED.
======================================================================
https://austingroupbugs.net/view.php?id=1553
======================================================================
Reported By:                andras_farkas
Assigned To:
======================================================================
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1553
Category:                   Shell and Utilities
Type:                       Error
Severity:                   Editorial
Priority:                   normal
Status:                     Applied
Name:                       Andras Farkas
Organization:
User Reference:
Section:                    find
Page Number:
Line Number:
Interp Status:              ---
Final Accepted Text:        https://austingroupbugs.net/view.php?id=1553#c5759
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2022-01-17 03:05 UTC
Last Modified:              2022-05-19 08:41 UTC
======================================================================
Summary:                    find has two references to -n option that should be
                            about -name
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          0001031 Add -iname (case-insensitive name searc...
======================================================================

Issue History
Date Modified    Username         Field                   Change
======================================================================
2022-01-17 03:05 andras_farkas    New Issue
2022-01-17 03:05 andras_farkas    Name                    => Andras Farkas
2022-01-17 03:05 andras_farkas    Section                 => find
2022-01-17 10:00 geoffclare       Note Added: 0005614
2022-01-17 10:00 geoffclare       Relationship added      related to 0001031
2022-01-17 18:42 andras_farkas    Note Added: 0005618
2022-03-24 16:00 geoffclare       Note Added: 0005759
2022-03-24 16:01 geoffclare       Interp Status           => ---
2022-03-24 16:01 geoffclare       Final Accepted Text     => https://austingroupbugs.net/view.php?id=1553#c5759
2022-03-24 16:01 geoffclare       Status                  New => Resolved
2022-03-24 16:01 geoffclare       Resolution              Open => Accepted As Marked
2022-03-24 16:01 geoffclare       Tag Attached: tc3-2008
2022-05-19 08:41 geoffclare       Status                  Resolved => Applied
======================================================================




[1003.1(2016/18)/Issue7+TC2 0001549]: Escaped newline in macro expansion in command line.

2022-05-19 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED.
======================================================================
https://austingroupbugs.net/view.php?id=1549
======================================================================
Reported By:                dmitry_goncharov
Assigned To:
======================================================================
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1549
Category:                   Shell and Utilities
Type:                       Clarification Requested
Severity:                   Editorial
Priority:                   normal
Status:                     Applied
Name:                       Dmitry Goncharov
Organization:
User Reference:
Section:                    Makefile Syntax
Page Number:                2973
Line Number:                98627
Interp Status:              Approved
Final Accepted Text:        https://austingroupbugs.net/view.php?id=1549#c5754
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2022-01-13 16:18 UTC
Last Modified:              2022-05-19 08:39 UTC
======================================================================
Summary:                    Escaped newline in macro expansion in command line.
======================================================================

Issue History
Date Modified    Username         Field                   Change
======================================================================
2022-01-13 16:18 dmitry_goncharov New Issue
2022-01-13 16:18 dmitry_goncharov Name                    => Dmitry Goncharov
2022-01-13 16:18 dmitry_goncharov URL                     => https://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html
2022-01-13 16:18 dmitry_goncharov Section                 => Makefile Syntax
2022-01-14 09:46 geoffclare       Project                 Online Pubs => 1003.1(2016/18)/Issue7+TC2
2022-01-14 09:47 geoffclare       Page Number             => 2973
2022-01-14 09:47 geoffclare       Line Number             => 98627
2022-01-14 09:47 geoffclare       Interp Status           => ---
2022-01-14 09:47 geoffclare       Note Added: 0005604
2022-01-14 09:53 geoffclare       Note Added: 0005605
2022-01-14 14:14 psmith           Note Added: 0005606
2022-01-14 20:31 dmitry_goncharov Note Added: 0005609
2022-03-17 16:14 geoffclare       Note Added: 0005754
2022-03-17 16:15 geoffclare       Interp Status           --- => Pending
2022-03-17 16:15 geoffclare       Final Accepted Text     => https://austingroupbugs.net/view.php?id=1549#c5754
2022-03-17 16:15 geoffclare       Status                  New => Interpretation Required
2022-03-17 16:15 geoffclare       Resolution              Open => Accepted As Marked
2022-03-17 16:15 geoffclare       Tag Attached: tc3-2008
2022-03-25 17:08 agadmin          Interp Status           Pending => Proposed
2022-03-25 17:08 agadmin          Note Added: 0005763
2022-04-26 12:02 agadmin          Interp Status           Proposed => Approved
2022-04-26 12:02 agadmin          Note Added: 0005822
2022-05-19 08:39 geoffclare       Status                  Interpretation Required => Applied
======================================================================




[1003.1(2016/18)/Issue7+TC2 0001547]: wait* are cancellation points, but the fate of zombie processes in wait after cancellation are unspecified.

2022-05-19 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED.
======================================================================
https://austingroupbugs.net/view.php?id=1547
======================================================================
Reported By:                dannyniu
Assigned To:
======================================================================
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1547
Category:                   System Interfaces
Type:                       Clarification Requested
Severity:                   Objection
Priority:                   normal
Status:                     Applied
Name:                       DannyNiu/NJF
Organization:               Individual
User Reference:
Section:                    wait, waitid, waitpid
Page Number:                2226-2239
Line Number:                70892-71402
Interp Status:              ---
Final Accepted Text:        https://austingroupbugs.net/view.php?id=1547#c5739
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2022-01-08 12:14 UTC
Last Modified:              2022-05-19 08:37 UTC
======================================================================
Summary:                    wait* are cancellation points, but the fate of
                            zombie processes in wait after cancellation are unspecified.
======================================================================

Issue History
Date Modified    Username         Field                   Change
======================================================================
2022-01-08 12:14 dannyniu         New Issue
2022-01-08 12:14 dannyniu         Name                    => DannyNiu/NJF
2022-01-08 12:14 dannyniu         Organization            => Individual
2022-01-08 12:14 dannyniu         Section                 => wait, waitid, waitpid
2022-01-08 12:14 dannyniu         Page Number             => 2226-2239
2022-01-08 12:14 dannyniu         Line Number             => 70892-71402
2022-01-10 10:26 geoffclare       Note Added: 0005585
2022-01-11 01:36 dannyniu         Note Added: 0005594
2022-03-10 17:20 geoffclare       Note Added: 0005739
2022-03-10 17:21 geoffclare       Interp Status           => ---
2022-03-10 17:21 geoffclare       Final Accepted Text     => https://austingroupbugs.net/view.php?id=1547#c5739
2022-03-10 17:21 geoffclare       Status                  New => Resolved
2022-03-10 17:21 geoffclare       Resolution              Open => Accepted As Marked
2022-03-10 17:21 geoffclare       Tag Attached: tc3-2008
2022-05-19 08:37 geoffclare       Status                  Resolved => Applied
======================================================================




[1003.1(2016/18)/Issue7+TC2 0001541]: Overabundance of parentheses in atoi() example

2022-05-19 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED.
======================================================================
https://austingroupbugs.net/view.php?id=1541
======================================================================
Reported By:                andras_farkas
Assigned To:
======================================================================
Project:                    1003.1(2016/18)/Issue7+TC2
Issue ID:                   1541
Category:                   System Interfaces
Type:                       Enhancement Request
Severity:                   Editorial
Priority:                   normal
Status:                     Applied
Name:                       Andras Farkas
Organization:
User Reference:
Section:                    atoi
Page Number:                621
Line Number:                21493
Interp Status:              ---
Final Accepted Text:
Resolution:                 Accepted
Fixed in Version:
======================================================================
Date Submitted:             2021-12-21 09:43 UTC
Last Modified:              2022-05-19 08:36 UTC
======================================================================
Summary:                    Overabundance of parentheses in atoi() example
======================================================================

Issue History
Date Modified    Username         Field                   Change
======================================================================
2021-12-21 09:43 andras_farkas    New Issue
2021-12-21 09:43 andras_farkas    Name                    => Andras Farkas
2021-12-21 09:43 andras_farkas    Section                 => atoi
2022-02-17 17:09 Don Cragun       Page Number             => 621
2022-02-17 17:09 Don Cragun       Line Number             => 21493
2022-02-17 17:09 Don Cragun       Interp Status           => ---
2022-02-17 17:09 Don Cragun       Status                  New => Resolved
2022-02-17 17:09 Don Cragun       Resolution              Open => Accepted
2022-02-17 17:09 Don Cragun       Tag Attached: tc3-2008
2022-05-19 08:36 geoffclare       Status                  Resolved => Applied
======================================================================




Re: [Issue 8 drafts 0001564]: clariy on what (character/byte) strings pattern matching notation should work

2022-05-19 Thread Harald van Dijk via austin-group-l at The Open Group

On 19/05/2022 02:46, Christoph Anton Mitterer wrote:

On Sun, 2022-05-15 at 16:14 +0100, Harald van Dijk wrote:

Please see the tests and results here.


So dash/ash/mksh/posh/pdksh,... and every other shell that doesn't
handle locales at all (and thus works in the C locale)... is anyway
always right (except for bugs), since any (non-NUL) byte is treated as
a character.


Correct.


For the other shells (and fnmatch):


String         Pattern              dash, busybox ash,   glibc      bash       bosh       gwsh       ksh        zsh
                                    mksh, posh, pdksh    fnmatch
\303\244       [\303\244]           no match             match      match      match      match      match      match
\303\244       ?                    no match             match      match      match      match      match      match
\303           [\303]               match                match      match      match      match      match      match
\303           ?                    match                match      match      match      match      match      match


The above, AFAIU, means that any shell/fnmatch matches a valid multibyte
character... but also a byte that is not a character in the locale.


Correct, though as I wrote later on, the way they go about it is different.


String         Pattern              dash, busybox ash,   glibc      bash       bosh       gwsh       ksh        zsh
                                    mksh, posh, pdksh    fnmatch
\303.\303\244  [\303].[\303\244]    no match             no match   no match   match      match      match      match
\303.\303\244  ?.?                  no match             no match   no match   match      match      match      match
\303\303\244   [\303][\303\244]     no match             no match   no match   match      match      match      match
\303\303\244   ??                   no match             no match   no match   match      match      match      match
\303\244.\303  [\303\244].[\303]    no match             no match   no match   match      match      match      match
\303\244.\303  ?.?                  no match             no match   no match   match      match      match      match
\303\244\303   [\303\244][\303]     no match             no match   no match   match      match      match      match
\303\244\303   ??                   no match             no match   no match   match      match      match      match



The above, I'm not quite sure what these tell/prove...

I assume the ones with '?' show that, for all except bash/fnmatch, '?'
matches both valid characters and a single byte that is not a character.


Correct.


And the ones with bracket expression, that these also work when the BE
has either a valid character or a byte (that is not a character) and
vice-versa?


Correct.


If Chet is reading along, is the above intended in bash, or considered
a bug?


IMO it would have been interesting to see whether ? would also match
multiple bytes that are each for themselves and together no valid
character... cause for '*' one can kinda assume that it has this "match
anything" meaning... one could also say that is more or less reasonable
that '?' matches a single invalid byte... but why not several of them?


I tested this now. In that same list of shells, and in glibc fnmatch(), 
? only matches a single invalid byte. Tested in an UTF-8 locale with the 
string \200\200 and the patterns ? and ??. With ?, they do not match. 
With ??, they do.
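

The observation above can be pictured with a toy model. This is a minimal sketch, not any shell's or glibc's actual code: it assumes UTF-8 and splits a byte string into "units", where a unit is either one valid character or one byte that cannot start a valid character. The helper names split_units and question_marks_match are made up for illustration.

```python
def split_units(data):
    """Split bytes into units: one valid UTF-8 character each,
    or a single byte where no valid character starts."""
    units = []
    i = 0
    while i < len(data):
        for n in (1, 2, 3, 4):          # UTF-8 characters are 1 to 4 bytes long
            chunk = data[i:i + n]
            try:
                if len(chunk.decode("utf-8")) == 1:
                    break               # chunk is exactly one valid character
            except UnicodeDecodeError:
                continue
        else:
            chunk = data[i:i + 1]       # invalid byte: it becomes its own unit
        units.append(chunk)
        i += len(chunk)
    return units


def question_marks_match(string, count):
    """Model a pattern made of `count` question marks: each ? consumes one unit."""
    return len(split_units(string)) == count


print(question_marks_match(b"\x80\x80", 1))   # pattern ?  against \200\200
print(question_marks_match(b"\x80\x80", 2))   # pattern ?? against \200\200
```

Under this model \200\200 is two units, so a single ? cannot cover it while ?? can, which is the behaviour reported above.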



String         Pattern              dash, busybox ash,   glibc      bash       bosh       gwsh       ksh        zsh
                                    mksh, posh, pdksh    fnmatch
\303\244       \303*                match                match      match      match      no match   match      no match
\303\244       \303?                match                match      match      no match   no match   match      no match
\303\244       [\303]*              match                match      match      match      no match   match      no match
\303\244       [\303]?              match                match      match      no match   no match   match      no match
\303\244       *\244                match                match      match      no match   no match   no match   match
\303\244       ?\244                match                match      match      no match   no match   no match   no match
\303\244       *[\244]              match                match      match      no match   no match   no match   no match
\303\244       ?[\244]              match                match      match      no match   no match   no match   no match




So unlike before, in the above bash/fnmatch do seem to let '?' match a
single byte that is not a character... and the remaining ones have
quite mixed feelings.


Not quite: all of them always let ? match a single invalid byte, but
here we have a single byte that is invalid on its own, valid as part of
a character, and appears in the string as part of that character. When
processing \303\244, most shells don't process this as the single byte
\303 followed by the single byte \244; they preprocess this so that by
the time they actually check whether it matches, they just see the
character U+00E4, so that if a pattern looks for \303 on its own, it
will not be found.
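

The difference between matching on raw bytes and matching after preprocessing into characters can be sketched as follows. This is a toy model under stated assumptions (UTF-8, only ?, * and literals, no bracket expressions), and the names split_units, match_units, match_chars and match_bytes are hypothetical, not taken from any of the tested implementations.

```python
def split_units(data):
    """Split bytes into units: one valid UTF-8 character each,
    or a single byte where no valid character starts."""
    units = []
    i = 0
    while i < len(data):
        for n in (1, 2, 3, 4):
            chunk = data[i:i + n]
            try:
                if len(chunk.decode("utf-8")) == 1:
                    break               # exactly one valid character
            except UnicodeDecodeError:
                continue
        else:
            chunk = data[i:i + 1]       # invalid byte stands alone
        units.append(chunk)
        i += len(chunk)
    return units


def match_units(pattern, string):
    """Match two unit lists; b'?' matches one unit, b'*' any run of units."""
    if not pattern:
        return not string
    head, rest = pattern[0], pattern[1:]
    if head == b"*":
        return any(match_units(rest, string[k:]) for k in range(len(string) + 1))
    if not string:
        return False
    return (head == b"?" or head == string[0]) and match_units(rest, string[1:])


def match_chars(pattern, string):
    """Preprocess both sides into characters (or lone invalid bytes) first."""
    return match_units(split_units(pattern), split_units(string))


def match_bytes(pattern, string):
    """Treat every byte as a character, as in the C locale."""
    as_bytes = lambda data: [data[i:i + 1] for i in range(len(data))]
    return match_units(as_bytes(pattern), as_bytes(string))


s = b"\xc3\xa4"                          # \303\244
print(match_bytes(b"\xc3?", s), match_chars(b"\xc3?", s))
print(match_bytes(b"?", s), match_chars(b"?", s))
```

In the character-based variant the string is a single unit, so a pattern that asks for \303 on its own finds nothing to compare it with, while the byte-based variant sees \303 followed by \244. Real implementations differ in how and when they fall back to bytes, which this sketch does not try to capture.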



String         Pattern              dash, busybox ash,   glibc      bash       bosh       gwsh       ksh        zsh
                                    mksh, posh, pdksh    fnmatch
\243]          [\243]]              match                match      match      match      match      match      match
\243]          ?                    no match             match      match      match      match      match      match
\243           ?                    match                match      match      match      match      match      match
\243           [\243]               match                match      match      match      no match   no match   error
\243           [\243!]              match                match      match      match      match      match      match
\243]          [\243!]]             match                match      no match   no match   no match   match      no match
\243]          ?]                   match                match      no match   no match   no match   no match   no match
\243]          *]                   match                match      no match   no match   no match   no match   match


The tests involving \243 are run in a Big5 environment. In Big5,
\243\135 is the representation of β, a single valid character, even
though \135 on its own is still the single character ].
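

The Big5 property described above can be checked outside a shell. This is a small sketch that assumes Python's bundled "big5" codec is a fair stand-in for a Big5 locale; it does not involve any shell or fnmatch().

```python
# Assumption: Python's "big5" codec approximates the Big5 locale used in the tests.
pair = b"\xa3\x5d"                       # \243\135, i.e. \243 followed by ]
decoded = pair.decode("big5")
print(len(pair), len(decoded))           # byte count, then character count

print(b"\x5d".decode("big5"))            # the byte \135 on its own

try:
    b"\xa3".decode("big5")               # \243 on its own
except UnicodeDecodeError:
    print("\\243 alone is not a complete character")
```

The same byte \135 is therefore either the character ] or the trailing byte of a two-byte character, depending on what precedes it, which is why a pattern parser that works on bytes and one that works on characters can disagree about where a bracket expression ends.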


This also seems a bit strange to me... all shells match \243 against ?,
i.e. ? matches a single byte that is not a character... but later on it
doesn't work again with \243] and ?]


Remember that \243] is a single character β. \243] is not