On Jan 26, 2015, at 4:42 AM, Fraser Gordon <fraser.gor...@livecode.com> wrote:

> 
> On 26 Jan 2015, at 02:15, Peter Haworth <p...@lcsql.com> wrote:
> 
>> Thanks Peter.  If that's the case, I'm not seeing much in the way of a
>> coding advantage over pre 7.0.  Sounds like using textEncode/textDecode
>> instead of uniencode/unidecode?
> 
> Assuming you have UTF-8 encoded data from a source outside LiveCode:
> 
> local tUTF8Data       -- This is binary data
> local tString         -- This is a textual string
> put textDecode(tUTF8Data, "UTF-8") into tString
> 
> The important difference is that uniEncode becomes textDecode - because you 
> are decoding some binary data to text. 
> 
> The big difference between 7.0 and previous versions is that Unicode text 
> works everywhere - you don’t need to use special Unicode properties or 
> commands any more.
> 
>> 
>> That does answer another question I had though which is what is needed if
>> the database is UTF-16 encoded.  Sounds like nothing needs to be done.  I
>> guess I'll have to set up some tests.
> 
> If your external data is UTF-16 you still need to textDecode it - if you 
> don’t, it will treat the data as 8-bit text and you’ll get corrupted text 
> back. This 8-bit default is necessary from a backwards compatibility point of 
> view - if we changed it to accept UTF-16 by default, anyone who gets text 
> from an external source and doesn’t textDecode it will suddenly find that 
> their stacks don’t work.
> 
> One way of looking at things is that all external interfaces (files, 
> processes, etc) return binary data and you need to do something to turn that 
> into text (textDecode) and you need to turn your text into binary data when 
> writing to them (textEncode). Using something like UTF-8 as the encoding 
> also avoids the problems that occur because the “native” encoding differs 
> between our platforms - it is MacRoman on OS X, CP1252 on Windows and 
> ISO-8859-1 on Linux.
> 
> Regards,
> Fraser

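A minimal sketch of the round trip Fraser describes, assuming a UTF-8 text 
file. The handler name and the file path are placeholders for illustration, 
not anything defined by the engine:

```livecode
on roundTripExample
   local tPath, tBinary, tText
   -- Placeholder path; point this at a real UTF-8 file
   put specialFolderPath("documents") & "/example.txt" into tPath

   -- binfile: returns the raw bytes with no conversion
   put URL ("binfile:" & tPath) into tBinary

   -- Bytes to text: decode using the encoding the file was written in
   put textDecode(tBinary, "UTF-8") into tText

   -- tText is now ordinary text; no special Unicode properties needed
   put return & "Added in LiveCode" after tText

   -- Text to bytes: encode again before writing back out
   put textEncode(tText, "UTF-8") into URL ("binfile:" & tPath)
end roundTripExample
```

For UTF-16 data the same pattern should apply, with "UTF-16" passed as the 
encoding name instead.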

It would be great if there were a stack property we could set to specify the 
encoding of output text. The default could be “native”, i.e., the native 
encoding for the platform, but we could then set it to something like “utf8”, 
“utf16”, or “ISO”. It would essentially do the textEncode/textDecode for us.
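
Until something like that exists, it can probably be approximated in script. 
A rough sketch, assuming a custom property named cTextEncoding on the stack 
and three helper handlers (all of these names are hypothetical, not built in):

```livecode
-- Hypothetical helper: look up the encoding stored in a custom property,
-- falling back to "native" when the property has not been set
function currentTextEncoding
   local tEncoding
   put the cTextEncoding of this stack into tEncoding
   if tEncoding is empty then put "native" into tEncoding
   return tEncoding
end currentTextEncoding

-- Read a file as raw bytes and decode it to text
function readTextFile pPath
   return textDecode(URL ("binfile:" & pPath), currentTextEncoding())
end readTextFile

-- Encode text to bytes and write it out
command writeTextFile pPath, pText
   put textEncode(pText, currentTextEncoding()) into URL ("binfile:" & pPath)
end writeTextFile
```

With that in place, a script would set the cTextEncoding of this stack to 
"UTF-8" once and then call readTextFile and writeTextFile everywhere else.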

Is this an idea that appeals to folks here?

Devin


Devin Asay
Office of Digital Humanities
Brigham Young University


