Re: UTF-8 regression in guile 1.9.5
2009/12/6 Mike Gran :
>
>> > need to call (setlocale LC_ALL "")
>
> But for Guile to store characters as codepoints, declaring a locale
> is pretty much a requirement now.

Would it make sense to add (setlocale LC_ALL "") to some default,
e.g. boot-9.scm?

--linas
Re: UTF-8 regression in guile 1.9.5
> > Hmm. The "ã" is a dead giveaway that you are printing a UTF-8 string
> > that is being interpreted as an ISO-8859-1 string.
> >
> > You've already said that you're in a UTF-8 locale. It could be that you
> > need to call (setlocale LC_ALL "")
>
> That cured it.
>
> > as well as having a setlocale call in your program.
>
> Doesn't seem to be required, after the above.
>
> Thanks!
>
> Why this happened is strange; I'm now investigating. Sorry to
> have bothered you with something that is dohh .. basic.

1.9.x does work fundamentally differently w.r.t. strings. The reason
is how strings are now stored. In 1.8.x, a character was a byte.
In 1.9.x, a character is a codepoint.

But for Guile to store characters as codepoints, declaring a locale is
pretty much a requirement now.

-Mike
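
To make the byte-versus-codepoint distinction concrete, here is a minimal sketch in plain C. It is not Guile code and says nothing about how Guile stores strings internally; the helper name utf8_codepoints is invented for this illustration, and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Guile: count the code points in a
   NUL-terminated UTF-8 string. Every byte that is not a continuation
   byte (bit pattern 10xxxxxx) starts a new code point.
   Assumes the input is valid UTF-8. */
size_t utf8_codepoints(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For the two characters "てみ", encoded as the six bytes E3 81 A6 E3 81 BF, strlen reports 6 while utf8_codepoints reports 2. A byte-per-character string can pass those six bytes through untouched; a codepoint string has to know the encoding to decide where one character ends and the next begins.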
Re: UTF-8 regression in guile 1.9.5
2009/12/6 Mike Gran :
>> From: Linas Vepstas
>
>> Then, from the guile prompt, I can evaluate the following:
>>
>> (new-node "てみました。")
>>
>> and get the output "The name is てみました。"
>>
>>
>> However, in guile-1.9.5, the above gives me:
>>
>> "The name is ã¦ã¿ã¾ããã"
>
> Hmm. The "ã" is a dead giveaway that you are printing a UTF-8 string
> that is being interpreted as an ISO-8859-1 string.
>
> You've already said that you're in a UTF-8 locale. It could be that you
> need to call (setlocale LC_ALL "")

That cured it.

> as well as having a setlocale call in your program.

Doesn't seem to be required, after the above.

Thanks!

Why this happened is strange; I'm now investigating. Sorry to
have bothered you with something that is dohh .. basic.

--linas
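
For background on why an explicit call is needed at all: standard C starts every program in the "C" locale, regardless of LANG, until setlocale(LC_ALL, "") is called. The sketch below shows only that standard C behaviour; it is not Guile's implementation, and the wrapper name adopt_environment_locale is made up for this example.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical wrapper: switch from the startup "C" locale to whatever
   the environment (LANG, LC_ALL, ...) requests. Returns the name of the
   resulting locale, or the name of the unchanged current locale if the
   environment asks for a locale that is not installed. */
const char *adopt_environment_locale(void)
{
    const char *name = setlocale(LC_ALL, "");
    if (name == NULL)
        name = setlocale(LC_ALL, NULL);  /* query only; changes nothing */
    assert(name != NULL);
    return name;
}
```

With LANG=en_US.UTF-8 and that locale installed, the returned name would typically mention UTF-8; without the call, the process stays in the "C" locale.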
Re: UTF-8 regression in guile 1.9.5
> From: Linas Vepstas

> Then, from the guile prompt, I can evaluate the following:
>
> (new-node "てみました。")
>
> and get the output "The name is てみました。"
>
>
> However, in guile-1.9.5, the above gives me:
>
> "The name is ã¦ã¿ã¾ããã"

Hmm. The "ã" is a dead giveaway that you are printing a UTF-8 string
that is being interpreted as an ISO-8859-1 string.

You've already said that you're in a UTF-8 locale. It could be that you
need to call

(setlocale LC_ALL "")

from the command line before entering

(new-node "てみました。")

as well as having a setlocale call in your program.

Thanks,

Mike Gran
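
The misreading described above can be reproduced in plain C. This is an illustrative sketch only, not code from Guile; the function name latin1_misread_to_utf8 is invented here. It treats each input byte as one ISO-8859-1 character and re-encodes that character as UTF-8, which is what effectively happens when UTF-8 bytes are decoded in a Latin-1 locale and then printed to a UTF-8 terminal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: misread each byte of `in` as an ISO-8859-1
   character and re-encode it as UTF-8 into `out`.
   `out` must have room for 2 * strlen(in) + 1 bytes.
   Returns the number of bytes written, excluding the terminating NUL. */
size_t latin1_misread_to_utf8(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)in; *p != '\0'; p++) {
        if (*p < 0x80) {
            out[n++] = (char)*p;                    /* ASCII passes through */
        } else {
            out[n++] = (char)(0xC0 | (*p >> 6));    /* lead byte: 0xC2 or 0xC3 */
            out[n++] = (char)(0x80 | (*p & 0x3F));  /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}
```

Feeding it "て" (bytes E3 81 A6) yields the bytes C3 A3 C2 81 C2 A6, which display as "ã", an invisible control character, and "¦". Every character in this part of the Japanese range starts with byte E3, hence the repeated "ã" in the garbled output.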
UTF-8 regression in guile 1.9.5
Hi,

I seem to see either a regression in guile-1.9.5 with regard to
UTF-8 strings, or at least some sort of incompatible change.

In guile-1.8.6, I am able to do the following:

    SCM new_node (SCM sname)
    {
        /* Convert the Scheme string to a malloc'ed C string
           in the current locale's encoding. */
        char * cname = scm_to_locale_string(sname);
        printf ("The name is %s\n", cname);
        free (cname);
        return SCM_EOL;
    }

    /* Called during initialization: */
    scm_c_define_gsubr("new-node", 1, 0, 0, new_node);

Then, from the guile prompt, I can evaluate the following:

    (new-node "てみました。")

and get the output "The name is てみました。"

However, in guile-1.9.5, the above gives me:

    "The name is ã¦ã¿ã¾ããã"

Now, it is very possible that I've forgotten to say
(use-modules some-new-utf8-module), but I am unclear on what
that module is (and why it's not specified by default).

In both cases, my shell has: LANG=en_US.UTF-8

--linas