Re: [Chicken-users] Unicode inside csi
2007/4/3, Thomas Christian Chust <[EMAIL PROTECTED]>:
> minh thu wrote:
> > [...]
> > In particular, why is the value on line 1 195 while the value on line 2 is 249?
> > The actual decimal value for ù is 249.
> > [...]
>
> Hello,
>
> by default, CHICKEN's strings are bytestrings, not character strings.
> Therefore the string-ref call is returning the first byte of the
> character's multi-byte UTF-8 encoding (195, i.e. #xC3), not the
> character itself.
>
> If you want CHICKEN's strings to behave as UTF-8 character strings, have
> a look at the utf-8 egg.
>
> cu,
> Thomas

Thanks Thomas,
Thanks John!

thu

___
Chicken-users mailing list
Chicken-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] Unicode inside csi
minh thu wrote:
> [...]
> In particular, why is the value on line 1 195 while the value on line 2 is 249?
> The actual decimal value for ù is 249.
> [...]

Hello,

by default, CHICKEN's strings are bytestrings, not character strings.
Therefore the string-ref call is returning the first byte of the
character's multi-byte UTF-8 encoding (195, i.e. #xC3), not the
character itself.

If you want CHICKEN's strings to behave as UTF-8 character strings, have
a look at the utf-8 egg.

cu,
Thomas
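The arithmetic behind those two numbers can be sketched in portable Scheme. This is only an illustration working on plain integers, so it does not depend on how a given implementation stores strings. The procedure name decode-utf8-2 is made up for this example and is not part of CHICKEN or of any egg.

```scheme
;; Decode a two-byte UTF-8 sequence into its code point.
;; decode-utf8-2 is a hypothetical helper, not a CHICKEN or egg procedure.
;; A two-byte sequence has the bit layout 110xxxxx 10yyyyyy:
;;   lead byte         = 192 + the high 5 bits of the code point
;;   continuation byte = 128 + the low 6 bits of the code point
(define (decode-utf8-2 lead cont)
  (+ (* (- lead 192) 64)
     (- cont 128)))

;; ù (U+00F9) is stored as the two bytes 195 and 185 (#xC3 #xB9).
(display (decode-utf8-2 195 185))   ; prints 249
(newline)
```

So 195 is only the lead byte. The code point 249 appears once both bytes are combined, which is roughly what a UTF-8-aware string-ref has to do.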
Re: [Chicken-users] Unicode inside csi
minh thu scripsit:

> #;1> (char->integer (string-ref "\u00f9" 0))
> 195

What happens here is that when you use a \u escape, it stores the correct
UTF-8 representation into the string. Unless you have loaded the utf8 egg,
though, string-ref will select the bytes of the string instead of
interpreting it as containing UTF-8 characters. So you need:

(use syntax-case)
(use utf-8)
(import utf-8)

and then this will correctly return 249.

-- 
Mos Eisley spaceport.  You will never           John Cowan
see a more wretched hive of scum and            [EMAIL PROTECTED]
villainy -- unless you watch the                http://www.ccil.org/~cowan
Jerry Springer Show.   --georgettesworld.com
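To make "the correct UTF-8 representation" concrete, here is a small sketch in portable Scheme of the bytes that would be stored for a code point such as #x00f9. It assumes the code point lies in the two-byte range (128 to 2047). The name encode-utf8-2 is hypothetical and not something CHICKEN or the egg provides.

```scheme
;; Encode a code point in the range 128..2047 as its two UTF-8 bytes.
;; encode-utf8-2 is a hypothetical helper for illustration only.
(define (encode-utf8-2 cp)
  (list (+ 192 (quotient cp 64))     ; lead byte: 110xxxxx
        (+ 128 (remainder cp 64))))  ; continuation byte: 10yyyyyy

(display (encode-utf8-2 #x00f9))     ; prints (195 185)
(newline)
```

The first element, 195, is the byte that a byte-oriented string-ref sees at index 0.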
[Chicken-users] Unicode inside csi
Hi,

I am trying to print Unicode correctly on the console (under Linux).
What should I do to achieve this?

I don't understand the following; can someone explain it?

Version 2.6 - linux-unix-gnu-x86 - [ libffi dload ptables applyhook ]
(c)2000-2007 Felix L. Winkelmann
; loading /home/mt/.csirc ...
; loading /usr/local/lib/chicken/1/readline.so ...
; loading library regex ...
#;1> (char->integer (string-ref "\u00f9" 0))
195
#;2> #x00f9
249
#;3> (print "ù")
ù
"ù"
#;4> (char->integer (string-ref "ù" 0))
195
#;5> (print (string-ref "ù" 0))
�
#\�
#;6> (print 'ù')
ù
|ù|

In particular, why is the value on line 1 195 while the value on line 2 is 249?
The actual decimal value for ù is 249.

Thanks,
thu