2017-05-17 14:00:21 +0200, Steffen Nurpmeso:
[...]
>  |BTW, U+00A0, should really not be a [:blank:] or [:space:].
>  |That's the whole point of that "non-breaking space" character.
> 
> It is clearly defined as whitespace in Unicode and thus ISO, at
> least once i last worked with the Unicode tables.  Not that my MUA
> gets that right at the moment, but in theory it should.  (We yet
> use a homegrown byte table for that, which was designed to deal
> with email RFCs, but i hope i soon find some time for my ctext
> Unicode thing again, finally, and in a distant future that MUA
> will get this right, too.)

It's whitespace, but it's non-breaking, meaning it should *not*
be treated as a delimiter. So IMO either blank/space should not
include U+00A0, or the POSIX spec should be updated to *not*
refer to "blank" where it specifies delimiting behaviour.
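
To make the distinction concrete, here is a rough sketch in
Python (not shell, and not how any POSIX utility is specified):
U+00A0 is classified as whitespace, yet a splitter can still
delimit on SP and HT only. The split_fields helper and the
sample line are made up for illustration.

```python
import re
import unicodedata

NBSP = "\u00a0"

def split_fields(line):
    """Split on runs of SP and HT only; NBSP stays inside its field.

    Hypothetical helper: it models "whitespace but not a delimiter",
    it does not reproduce the field splitting of any real shell.
    """
    return [field for field in re.split(r"[ \t]+", line) if field]

line = "100" + NBSP + "km\tnext field"

is_space = NBSP.isspace()              # Python treats U+00A0 as whitespace
category = unicodedata.category(NBSP)  # Unicode general category
fields = split_fields(line)            # delimits on SP/HT only
naive_fields = line.split()            # delimits on all whitespace

print(is_space, category)
print(fields)
print(naive_fields)
```

With str.split() the "100<nbsp>km" token is broken in two, which
is the behaviour I'm arguing against; split_fields keeps it whole.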

Now, http://www.unicode.org/L2/L2003/03139-posix-classes.htm
recommends that nbsp be included in the POSIX "blank"/"space"
classes, so I suppose quite a few people don't agree with me
on that (note that I don't object to U+00A0 being considered
a blank/space, only to it being considered a delimiter).

See also
http://www.unicode.org/L2/L2003/03139-posix-classes.htm#TR_14652
about ISO/IEC TR 14652:2004 including BS (backspace) and not
nbsp.

What's the Open Group's position?

>  |That's a known oddity of Solaris. (that makes it the only
>  |single-byte blank I'm aware of, though of course one may always
>  |construct a rogue locale that has more).
> 
> The only one besides U+0020 SP and U+0009 HT.
[...]

Yes, of course, that's what I meant: the only non-ASCII
single-byte character (0xa0 in many charsets, 0x9a in KOI8-U)
that can cause problems in practice with bash (or ksh88, it
appears).
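
For reference, a quick way to check which byte U+00A0 maps to in
a few single-byte charsets, using Python's bundled codecs. The
list of charsets is just a sample I picked, not an exhaustive
survey.

```python
NBSP = "\u00a0"

# A sample of single-byte charsets; codec names are Python's own.
CHARSETS = ["iso8859_1", "iso8859_15", "cp1252", "koi8_r", "koi8_u"]

# Map each charset name to the encoded form of U+00A0.
nbsp_bytes = {name: NBSP.encode(name) for name in CHARSETS}

for name, raw in nbsp_bytes.items():
    print(f"{name:12} 0x{raw[0]:02x}")
```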

-- 
Stephane
