On 20 Mar 2014, at 17:39, Mark Wieder <mwie...@ahsoftware.net> wrote:

> put unidecode("hello, bucko")
> 
> converts the text to 敨汬Ɐ戠捵潫.

Thinking about this a bit more, I ought to write something up about how text 
and binary work in the 7.0 engine and how this relates to the existing ways of 
doing things.

The short version is that text and binary data are now very different things 
and some unexpected things can happen when the engine converts between them. As 
a rough guide:

unicodeText of ...:    binary data, encoded as UTF-16
text of ...:           text (Unicode, handled transparently)
I/O:                   expects and produces binary data
uniEncode/uniDecode:   accept and produce binary data

When the engine implicitly converts binary data -> Unicode, it treats each 
byte of the binary data as one native character.

When the engine implicitly converts Unicode -> binary data, it converts to 
native characters and replaces any unrepresentable character with '?'.
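To make the two implicit conversions concrete, here is a rough sketch in 
Python rather than LiveCode, so it can be run on its own. It rests on two 
assumptions that are mine and not the engine's: Latin-1 stands in for the 
"native" encoding (the real one depends on the platform), and the UTF-16 data 
is little-endian. The helper names are made up for the illustration. The last 
part reproduces the garbled result quoted at the top, using the 12-byte string 
"hello, bucko" (with the comma, since six UTF-16 characters need twelve bytes).

```python
# Assumption: Latin-1 stands in for the engine's platform-dependent native encoding.
NATIVE = "latin-1"


def implicit_text_to_binary(text: str) -> bytes:
    """Unicode -> binary: encode as native; unrepresentable characters become '?'."""
    return text.encode(NATIVE, errors="replace")


def implicit_binary_to_text(data: bytes) -> str:
    """Binary -> Unicode: read every byte as one native character."""
    return data.decode(NATIVE)


# A character outside the native set is replaced by '?'.
lossy = implicit_text_to_binary("caf\u00e9 \u6f22")
print(lossy)  # b'caf\xe9 ?'

# Text passed to something that expects UTF-16 binary data: the text is first
# turned into native bytes, then each pair of bytes is read as one UTF-16 code unit.
raw = implicit_text_to_binary("hello, bucko")  # 12 native bytes
garbled = raw.decode("utf-16-le")              # assumption: little-endian UTF-16
print(len(raw), len(garbled))                  # 12 6
print([hex(ord(c)) for c in garbled])
```

The code points printed on the last line are those of the six characters in 
the quoted output, which is why plain text fed to uniDecode comes out looking 
like CJK.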

The "byte" chunk expression operates on binary data.

The "word", "char", etc. chunk expressions operate on text.
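The difference between counting bytes and counting characters can be sketched 
as follows. This is Python, not LiveCode, and only an analogy: a Python str 
plays the role of text and a bytes object plays the role of binary data. 
UTF-8 is chosen here purely as an example encoding.

```python
text = "na\u00efve"          # "naïve" as text: 5 characters
data = text.encode("utf-8")  # the same string as UTF-8 binary data

print(len(text))  # 5 characters
print(len(data))  # 6 bytes, because "ï" takes two bytes in UTF-8

third_char = text[2]     # the third character of the text
middle_bytes = data[2:4] # the third and fourth bytes of the data
print(third_char == "\u00ef")  # True
print(middle_bytes)            # b'\xc3\xaf'
```

Indexing the binary data by byte can split a character in two, which is why 
it matters whether a chunk expression is working on text or on binary data.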

To convert from text to binary, use textEncode, e.g. 
textEncode("Hello, World!", "UTF-8")

To convert from binary to text, use textDecode, e.g. 
textDecode(url(...), "UTF-8")
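The explicit round trip looks roughly like this. Again this is a Python 
sketch, not engine code: text_encode and text_decode are hypothetical 
stand-ins I have written to mirror the shape of textEncode and textDecode, and 
they say nothing about how the engine implements them.

```python
def text_encode(text: str, encoding: str) -> bytes:
    """Hypothetical stand-in for textEncode: text in, binary data out."""
    return text.encode(encoding)


def text_decode(data: bytes, encoding: str) -> str:
    """Hypothetical stand-in for textDecode: binary data in, text out."""
    return data.decode(encoding)


original = "Hello, World! \u2713"  # 15 characters, the last one outside ASCII

as_utf8 = text_encode(original, "UTF-8")
as_utf16 = text_encode(original, "UTF-16-LE")

# The same text has a different size in each encoding.
print(len(as_utf8), len(as_utf16))  # 17 30

# Decoding with the encoding used to encode gives the original text back.
print(text_decode(as_utf8, "UTF-8") == original)       # True
print(text_decode(as_utf16, "UTF-16-LE") == original)  # True
```

Because the encoding is named at both ends, nothing is lost, unlike the 
implicit conversions, which fall back to native characters.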

Hope that helps explain what is going on. I'll write it up a bit more 
thoroughly so people have a guide to using Unicode (i.e. it is transparent 
except where it can't be, like dealing with files*).

*except in some cases ;)

Regards,
Fraser
