Christoph Anton Mitterer wrote, on 26 Apr 2022:
>
> On Mon, 2022-04-25 at 10:21 +0100, Geoff Clare via austin-group-l at
> The Open Group wrote:
> > This was discussed during our work on bug 1233 and resulted in
> > additions to the sed APPLICATION USAGE (line 106286 in draft 2.1)
> > and FUTURE DIRECTIONS.
>
> Hmm... AFAICS, that was only done for sed, right?
>
> Isn't it something that would apply to regular expressions in general
> (and thus also e.g. grep)?
> And via them, it would also affect bracket expressions in the pattern
> matching notation (which refer to the RE bracket expressions).
>
> So shouldn't that be mentioned there, too?

That's what 1233 asked for, so you are effectively asking us to
revisit the decision we made on 1233. I think our time would be
better spent on other things.

> Is it considered likely, that these future directions will actually
> ever be implemented?

It's quite uncommon for future directions, once stated in a published
revision, to be dropped in a later revision (as opposed to being
implemented, or left as-is for reconsideration in a later revision).
So yes, I consider it likely.

> When you take something like '\+' in BREs (and the future directions
> we've added for that recently), there's some big difference:
>
> POSIX clearly said, that '\+' produces undefined results (9.3.2),... so
> anyone who wanted to be sure to stay portable, had the chance to do so
> by simply not using it.
> Should POSIX ever actually change '\+' to have *only* the special
> meaning of + and not the literal,... no one could really complain when
> he used it in the sense of the literal plus and his stuff breaks -
> because it was never defined so.
>
> But with any '[\x]', it was (AFAIU) always be meant to be the literal
> character '\' or the literal character for which x stands.
> If someone faithfully relied on that, any actual future change would
> break that assumption.
>
> If someone would say that it's unlikely that people ever used '[\x]'
> and wanted the literal '\' and the literal character for which x
> stands... then what about '[\^]', which people might have used when
> they mean '\' or '^' but couldn't write '[^\]'? Or what about a range
> like '[\-_]'?

This kind of thing is the reason future directions might be left
as-is, if it is felt that not enough real-world practice has changed
since the revision that added it. We will need to make a decision
for Issue 9 whether to implement it or leave it for Issue 10.

> And in practise it would seem even more complicated:
> As my examples showed before, e.g. GNU sed only seems to do this for
> '\n' while e.g. '\s' in a bracket expression *is* taken as the literal
> character '\' or the literal character 's'.
> (btw: and so does GNU grep)
>
> So should the standard ever allow them to be escape sequences there
> would be even more uncertainty on what means what.
>
>
> Allowing escape sequences inside bracket expressions would also open up
> quite a few of the questions we've tried to deal with in #1550, #1551
> and #1552.

True, but if application writers follow the advice added by bug 1233
(to use two backslashes in bracket expressions) that shouldn't be a
problem.

-- 
Geoff Clare <[email protected]>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
