On Fri, 2005-04-15 at 17:44, Juerd wrote:
> Is there a <?ws>-like thingy that is always \s+?

Not sure what that means exactly.

> Do \s and <?ws> match non-breaking whitespace, U+00A0?

As I understood it, Perl 6 was going to use the Unicode standard(s) to
determine the whitespacishness of each codepoint. Going to Google, I
find:

http://www.fileformat.info/info/unicode/category/Zs/list.htm

which lists all of the "separator, space" characters.

> How about:
> 
>     U+0008  backspace
Character.isWhitespace() No
>     U+00A0  no break space (Repeated for overview)
Character.isWhitespace() No
>     U+1361  ethiopic wordspace
Character.isWhitespace() No
>     U+2000  en quad
Character.isWhitespace() Yes
>     U+2001  em quad
Character.isWhitespace() Yes
>     U+2002  en space
Character.isWhitespace() Yes
>     U+2003  em space
Character.isWhitespace() Yes
>     U+2004  three per em space
Character.isWhitespace() Yes
>     U+2005  four per em space
Character.isWhitespace() Yes
>     U+2006  six per em space
Character.isWhitespace() Yes
>     U+2007  figure space
Character.isWhitespace() No
>     U+2008  punctuation space
Character.isWhitespace() Yes
>     U+2009  thin space 
Character.isWhitespace() Yes
>     U+200A  hair space
Character.isWhitespace() Yes
>     U+200B  zero width space
Character.isWhitespace() Yes
>     U+202F  narrow no break space
Character.isWhitespace() No
>     U+205F  medium mathematic space
Character.isWhitespace() Yes
>     U+2060  word joiner (What is that, anyway?)
Character.isWhitespace() No
Comments: WJ; a zero width non-breaking space (only); intended for
disambiguation of functions for byte order mark
>     U+3000  ideographic space
Character.isWhitespace() Yes
>     U+FEFF  zero width non-breaking space
Character.isWhitespace() No
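
For anyone who wants to reproduce the Character.isWhitespace() answers
above, here is a minimal Java sketch. The class name WhitespaceCheck and
the describe() helper are made up for illustration; Character.isWhitespace(int)
and Character.isSpaceChar(int) are the standard library calls. The results
depend on the Unicode version bundled with the JVM, so a few code points
(U+200B, for instance) may not match the listing above on a newer runtime.

```java
public class WhitespaceCheck {
    // A selection of the code points from the quoted list (illustrative only).
    static final int[] CODE_POINTS = {
        0x0008, 0x00A0, 0x1361, 0x2000, 0x2007,
        0x200B, 0x202F, 0x2060, 0x3000, 0xFEFF
    };

    // Builds one report line for a code point.
    // isWhitespace: Java's notion of whitespace, which excludes the
    //               non-breaking spaces U+00A0, U+2007 and U+202F.
    // isSpaceChar:  true for the Unicode separator categories (Zs, Zl, Zp).
    static String describe(int cp) {
        return String.format("U+%04X  isWhitespace=%b  isSpaceChar=%b",
                cp, Character.isWhitespace(cp), Character.isSpaceChar(cp));
    }

    public static void main(String[] args) {
        for (int cp : CODE_POINTS) {
            System.out.println(describe(cp));
        }
    }
}
```

The two columns differ for the non-breaking spaces: isSpaceChar() follows
the Zs category, while isWhitespace() deliberately leaves them out.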

> \s is said (in S05) to match any unicode whitespace, but letting it
> match NBSP and then using \s for splitting things is wrong, I think.

Thankfully, NBSP (U+00A0) is not whitespace by the Character.isWhitespace()
test above, even though it is in Unicode category Zs.

-- 
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback

