Re: [Chicken-users] Unicode inside csi

2007-04-03 Thread minh thu

2007/4/3, Thomas Christian Chust <[EMAIL PROTECTED]>:

> minh thu wrote:
>
> > [...]
> > In particular, why is the value on line 1 195 while the value on line 2 is 249?
> > The actual decimal value for ù is 249.
> > [...]
>
> Hello,
>
> by default, CHICKEN's strings are byte strings, not character strings.
> Therefore the string-ref call returns the first byte of the character's
> two-byte UTF-8 encoding (195, i.e. #xC3), not the character itself.
>
> If you want CHICKEN's strings to behave as UTF-8 character strings, have
> a look at the utf8 egg.
>
> cu,
> Thomas




Thanks Thomas, thanks John!
thu


___
Chicken-users mailing list
Chicken-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] Unicode inside csi

2007-04-03 Thread Thomas Christian Chust
minh thu wrote:

> [...]
> In particular, why is the value on line 1 195 while the value on line 2 is 249?
> The actual decimal value for ù is 249.
> [...]

Hello,

by default, CHICKEN's strings are byte strings, not character strings.
Therefore the string-ref call returns the first byte of the character's
two-byte UTF-8 encoding (195, i.e. #xC3), not the character itself.

If you want CHICKEN's strings to behave as UTF-8 character strings, have
a look at the utf8 egg.

cu,
Thomas
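
To make the 195 concrete, here is a minimal sketch of how a code point between U+0080 and U+07FF is split into two UTF-8 bytes. It uses only portable integer arithmetic; `utf8-encode-2` is a hypothetical helper written for this illustration, not a procedure from CHICKEN or from any egg.

```scheme
;; Encode a code point in the range U+0080..U+07FF as two UTF-8 bytes.
;; utf8-encode-2 is a hypothetical helper, not part of CHICKEN or an egg.
(define (utf8-encode-2 cp)
  (list (+ #xC0 (quotient cp 64))      ; lead byte, bit pattern 110xxxxx
        (+ #x80 (remainder cp 64))))   ; continuation byte, bit pattern 10xxxxxx

;; U+00F9 (ù) is 249 = 3 * 64 + 57
(utf8-encode-2 #xF9)                   ; evaluates to (195 185)
```

The first element, 195, is the byte that a byte-oriented string-ref returns at index 0, which matches the value seen at prompt 1 of the transcript.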





Re: [Chicken-users] Unicode inside csi

2007-04-03 Thread John Cowan
minh thu scripsit:

> #;1> (char->integer (string-ref "\u00f9" 0))
> 195

What happens here is that when you use a \u escape, the reader stores
the correct UTF-8 representation in the string.  Unless you
have loaded the utf8 egg, though, string-ref will select individual bytes
of the string instead of interpreting it as containing UTF-8
characters.

So you need:

(use syntax-case)
(use utf8)
(import utf8)

and then this will correctly return 249.
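
As a rough sketch of what a UTF-8 aware string-ref has to do for this character, the two stored bytes can be recombined by hand. This is only an illustration in portable arithmetic; `utf8-decode-2` is a hypothetical helper, not the egg's actual implementation, and it assumes a well-formed two-byte sequence.

```scheme
;; Decode a two-byte UTF-8 sequence (lead byte 110xxxxx, continuation
;; byte 10xxxxxx) back into a code point.
;; utf8-decode-2 is a hypothetical helper; it does no validation.
(define (utf8-decode-2 b1 b2)
  (+ (* (- b1 #xC0) 64)    ; 5 payload bits from the lead byte
     (- b2 #x80)))         ; 6 payload bits from the continuation byte

;; The bytes stored for "\u00f9" are 195 and 185.
(utf8-decode-2 195 185)    ; evaluates to 249
```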

-- 
Mos Eisley spaceport.  You will never   John Cowan
see a more wretched hive of scum and    [EMAIL PROTECTED]
villainy -- unless you watch the        http://www.ccil.org/~cowan
Jerry Springer Show.   --georgettesworld.com




[Chicken-users] Unicode inside csi

2007-04-03 Thread minh thu

Hi,

I am trying to print Unicode correctly on the console (under Linux).
What should I do to achieve this?

I don't understand the following; can someone explain it?

Version 2.6 - linux-unix-gnu-x86 - [ libffi dload ptables applyhook ]
(c)2000-2007 Felix L. Winkelmann
; loading /home/mt/.csirc ...
; loading /usr/local/lib/chicken/1/readline.so ...
; loading library regex ...
#;1> (char->integer (string-ref "\u00f9" 0))
195
#;2> #x00f9
249
#;3> (print "ù")
ù
"ù"
#;4> (char->integer (string-ref "ù" 0))
195
#;5> (print (string-ref "ù" 0))
�
#\�
#;6> (print 'ù')
ù
|ù|

In particular, why is the value on line 1 195 while the value on line 2 is 249?
The actual decimal value for ù is 249.

Thanks,
thu