Re: [Issue 8 drafts 0001556]: clarify meaning of \n used in a bracket expression in a sed context address or s-command

Geoff Clare via austin-group-l at The Open Group Tue, 26 Apr 2022 02:24:39 -0700

Christoph Anton Mitterer wrote, on 26 Apr 2022:
>
> On Mon, 2022-04-25 at 10:21 +0100, Geoff Clare via austin-group-l at
> The Open Group wrote:
> > This was discussed during our work on bug 1233 and resulted in
> > additions to the sed APPLICATION USAGE (line 106286 in draft 2.1)
> > and FUTURE DIRECTIONS.
> 
> Hmm... AFAICS, that was only done for sed, right?
> 
> Isn't it something that would apply to regular expressions in general
> (and thus also e.g. grep)?
> And via them, it would also affect bracket expressions in the pattern
> matching notation (which refer to the RE bracket expressions).
> 
> So shouldn't that be mentioned there, too?


That's what 1233 asked for, so you are effectively asking us to
revisit the decision we made on 1233.  I think our time would be
better spent on other things.

> Is it considered likely, that these future directions will actually
> ever be implemented?

It's quite uncommon for future directions, once stated in a published
revision, to be dropped in a later revision (as opposed to being
implemented, or left as-is for reconsideration in a later revision).
So yes, I consider it likely.

> When you take something like '\+' in BREs (and the future directions
> we've added for that recently), there's some big difference:
> 
> POSIX clearly said, that '\+' produces undefined results (9.3.2),... so
> anyone who wanted to be sure to stay portable, had the chance to do so
> by simply not using it.
> Should POSIX ever actually change '\+' to have *only* the special
> meaning of + and not the literal,... no one could really complain when
> he used it in the sense of the literal plus and his stuff breaks -
> because it was never defined so.
> 
> But with any '[\x]', it was (AFAIU) always meant to be the literal
> character '\' or the literal character for which x stands.
> If someone faithfully relied on that, any actual future change would
> break that assumption.
> 
> If someone would say that it's unlikely that people ever used '[\x]'
> and wanted the literal '\' and the literal character for which x
> stands... then what about '[\^]', which people might have used when
> they mean '\' or '^' but couldn't write '[^\]'? Or what about a range
> like '[\-_]'?

This kind of thing is the reason future directions might be left
as-is, if it is felt that not enough real-world practice has changed
since the revision that added it.  We will need to make a decision
for Issue 9 whether to implement it or leave it for Issue 10.

> And in practice it would seem even more complicated:
> As my examples showed before, e.g. GNU sed only seems to do this for
> '\n' while e.g. '\s' in a bracket expression *is* taken as the literal
> character '\' or the literal character 's'.
> (btw: and so does GNU grep)
> 
> So should the standard ever allow them to be escape sequences there
> would be even more uncertainty on what means what.
>
> 
> Allowing escape sequences inside bracket expressions would also open up
> quite a few of the questions we've tried to deal with in #1550, #1551
> and #1552.

True, but if application writers follow the advice added by bug 1233
(to use two backslashes in bracket expressions) that shouldn't be a
problem.
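
As a rough illustration of that advice (a sketch only, not text from the
standard): doubling the backslash inside the bracket expression should give
the same result whether an implementation reads backslash literally there, as
POSIX currently specifies, or as an escape character, as the future direction
would allow. The function names below are hypothetical, and the second case
([\\^] for "backslash or circumflex") is my own extrapolation of the advice to
the '[\^]' example quoted above.

```sh
# Hypothetical helper: turn every backslash into a forward slash.
# The set [\\] holds two backslashes:
#   - read literally, it is the set { '\', '\' }, i.e. just '\'
#   - read as an escape sequence, '\\' is assumed to be one escaped '\'
backslash_to_slash() {
    sed 's|[\\]|/|g'
}

# Hypothetical helper: mask every backslash or circumflex with 'X'.
# Writing [\\^] rather than [\^] avoids depending on how '\^' is read.
mask_backslash_or_caret() {
    sed 's/[\\^]/X/g'
}

printf '%s\n' 'C:\tmp\new' | backslash_to_slash       # prints C:/tmp/new
printf '%s\n' 'a\b^c' | mask_backslash_or_caret       # prints aXbXc
```

A single backslash followed by another character, such as [\n] or [\s], is
the form whose meaning differs between implementations, so it is the one to
avoid in portable scripts.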

-- 
Geoff Clare <[email protected]>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
