#284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Alastair McKinstry
Hi, I'm investigating #284724, which is about problems caused by NON-BREAK SPACE. This is a character (it can be typed as AltGr-Space in the French and some other keymaps) that looks like a space but carries the added semantics 'don't break a line here' in word processing and display software.
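
To make the character in question concrete, here is a minimal Python 3 sketch (not part of the original thread) that compares NON-BREAK SPACE, U+00A0, with the ASCII SPACE. The helper name `describe` is hypothetical and exists only for this illustration.

```python
import unicodedata

NBSP = "\u00a0"   # the character discussed in #284724
SPACE = "\u0020"  # the ordinary ASCII space


def describe(ch: str) -> dict:
    """Collect a few basic facts about a single character (illustrative helper)."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # 'Zs' means space separator
        "utf8": ch.encode("utf-8"),
        "is_ascii": ch.isascii(),
    }


if __name__ == "__main__":
    for ch in (SPACE, NBSP):
        print(describe(ch))
```

Both characters fall in the same Unicode general category, but only one of them is ASCII, which is why byte-oriented tools tend to treat them differently.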

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Giuseppe Sacco
On Sat, 12-02-2005 at 09:43 +, Alastair McKinstry wrote: [...] > Does anyone have copies (or pointers to free versions of) SUS and any > rulings on this matter? What are developers' opinions on this: should > this be treated as a shell (and other scripting language) bug, i.e. > shou

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Alastair McKinstry
On Sath, 2005-02-12 at 11:07 +0100, Giuseppe Sacco wrote: > On Sat, 12-02-2005 at 09:43 +, Alastair McKinstry > wrote: > [...] > > Does anyone have copies (or pointers to free versions of) SUS and any > > rulings on this matter? What are developers' opinions on this: should > > thi

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Frederic Peters
Alastair McKinstry wrote: > As this is not a shell-specific problem, I was really wondering if the > scripting languages had encountered it, particularly the 'we support > Unicode' ones... like Perl and Python... [EMAIL PROTECTED]:~$ python -c 'printÂ"hello world"' File "<string>", line 1 printÂ"he

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Ross Burton
On Sat, 2005-02-12 at 13:18 +0100, Frederic Peters wrote: > [EMAIL PROTECTED]:~$ python -c 'print "hello world"' > File "<string>", line 1 > print "hello world" > ^ > SyntaxError: invalid syntax Don't you need to tell Python what charset you are using if you don't use ASCII? Ross -- Ross
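
As a hedged aside, the same experiment can be repeated on a current Python 3 interpreter, which is an assumption here since the thread predates it. Python 3 source is Unicode by default, so no charset declaration is involved, yet the sketch below suggests the tokenizer still refuses NON-BREAK SPACE as a separator. The helper `compiles` is a hypothetical name used only for this illustration.

```python
def compiles(source: str) -> bool:
    """Return True if Python accepts `source` as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Identical statements except for the character between `print` and `(`.
with_space = 'print ("hello world")'
with_nbsp = 'print\u00a0("hello world")'

if __name__ == "__main__":
    print("ASCII space accepted:", compiles(with_space))
    print("NBSP accepted:       ", compiles(with_nbsp))
```

The exact wording of the SyntaxError differs between Python versions, so the sketch only checks whether compilation succeeds.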

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Steinar H. Gunderson
On Sat, Feb 12, 2005 at 01:18:06PM +0100, Frederic Peters wrote: > [EMAIL PROTECTED]:~$ perl -e 'printÂ"hello world\n";' > Unrecognized character \xC2 at -e line 1. 0xc2 is LATIN CAPITAL LETTER A WITH CIRCUMFLEX... Sounds like you tried to give UTF-8 to Perl without the "use utf8" pragma to tell
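
The 0xC2 byte in that error message can be traced with a short Python 3 sketch (an illustration, not something posted in the thread): U+00A0 encodes to the two bytes C2 A0 in UTF-8, and a reader that assumes Latin-1 sees the first byte as a letter. The helper name `as_latin1` is hypothetical.

```python
import unicodedata

# UTF-8 encoding of NON-BREAK SPACE: two bytes, 0xC2 followed by 0xA0.
NBSP_UTF8 = "\u00a0".encode("utf-8")


def as_latin1(raw: bytes) -> str:
    """Decode bytes the way a Latin-1-only reader would (one byte, one character)."""
    return raw.decode("latin-1")


# The command from the thread, as raw UTF-8 bytes, misread as Latin-1.
misread = as_latin1(b"print" + NBSP_UTF8 + b'"hello world"')

if __name__ == "__main__":
    print(NBSP_UTF8)                     # the two raw bytes
    print(unicodedata.name(misread[5]))  # what 0xC2 turns into
    print(unicodedata.name(misread[6]))  # what 0xA0 turns into
```

Under that misreading a single NON-BREAK SPACE becomes two characters, a capital A with circumflex followed by a Latin-1 NON-BREAK SPACE.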

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Adeodato Simó
* Frederic Peters [Sat, 12 Feb 2005 13:18:06 +0100]: > [EMAIL PROTECTED]:~$ ruby -e 'print "hello world\n"' > -e:1: Invalid char `\302' in expression > -e:1: Invalid char `\240' in expression Just to be fully correct, Ruby needs to be told it's dealing with UTF-8 in source: $ ruby -Ku -e '

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Frederic Peters
Ross Burton wrote: > On Sat, 2005-02-12 at 13:18 +0100, Frederic Peters wrote: > > [EMAIL PROTECTED]:~$ python -c 'print "hello world"' > > File "<string>", line 1 > > print "hello world" > > ^ > > SyntaxError: invalid syntax > > Don't you need to tell Python what charset you are using if y

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Ross Burton
On Sat, 2005-02-12 at 13:54 +0100, Frederic Peters wrote: > > Don't you need to tell Python what charset you are using if you don't > > use ASCII? > > This is PEP 263, its > purpose is to declare the encoding of Python source files but it > actually only

Re: #284724: Interpretation of NON-BREAK SPACE

2005-02-12 Thread Thaddeus H. Black
> ... should all characters in the class [:space:] be > treated as a token separator in shells/languages, or > just the ASCII SPACE? If it seems pertinent to you, the C language standard sets this precedent [1]: "The source file is decomposed into preprocessing tokens and sequences of white-space
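
The two options in the quoted question can be contrasted with a small Python 3 sketch. It is only an illustration of the trade-off, not how any shell actually tokenizes its input, and the function names `split_ascii_space` and `split_unicode_space` are hypothetical.

```python
def split_ascii_space(line: str) -> list:
    """Tokenize on ASCII SPACE only, dropping empty tokens."""
    return [tok for tok in line.split(" ") if tok]


def split_unicode_space(line: str) -> list:
    """Tokenize on any character Python's str considers whitespace."""
    return line.split()


# A command line where the first separator is NON-BREAK SPACE (U+00A0).
line = "echo\u00a0hello world"

if __name__ == "__main__":
    print(split_ascii_space(line))    # NBSP stays glued inside the first token
    print(split_unicode_space(line))  # NBSP acts as a separator
```

With the ASCII-only rule, `echo` followed by NON-BREAK SPACE and `hello` is looked up as a single command name, which would match the kind of failure the bug report describes. With the wider rule, a visually identical line behaves as the user presumably expected.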