On Jan 26, 2015, at 4:42 AM, Fraser Gordon <fraser.gor...@livecode.com> wrote:
> On 26 Jan 2015, at 02:15, Peter Haworth <p...@lcsql.com> wrote:
>
>> Thanks Peter. If that's the case, I'm not seeing much in the way of a
>> coding advantage over pre 7.0. Sounds like using textEncode/textDecode
>> instead of uniencode/unidecode?
>
> Assuming you have UTF-8 encoded data from a source outside LiveCode:
>
>     local tUTF8Data -- This is binary data
>     local tString -- This is a textual string
>     put textDecode(tUTF8Data, "UTF-8") into tString
>
> The important difference is that uniEncode becomes textDecode - because you
> are decoding some binary data to text.
>
> The big difference between 7.0 and previous versions is that Unicode text
> works everywhere - you don’t need to use special Unicode properties or
> commands any more.
>
>> That does answer another question I had though which is what is needed if
>> the database is UTF-16 encoded. Sounds like nothing needs to be done. I
>> guess I'll have to set up some tests.
>
> If your external data is UTF-16 you still need to textDecode it - if you
> don’t, it will treat the data as 8-bit text and you’ll get corrupted text
> back. This 8-bit default is necessary from a backwards compatibility point of
> view - if we changed it to accept UTF-16 by default, anyone who gets text
> from an external source and doesn’t textDecode it will suddenly find that
> their stacks don’t work.
>
> One way of looking at things is that all external interfaces (files,
> processes, etc) return binary data and you need to do something to turn that
> into text (textDecode) and you need to turn your text into binary data when
> writing to them (textEncode). By using something like UTF-8 as an encoding,
> it also avoids the problems that occur because the “native” encoding differs
> between our platforms - it is MacRoman on OSX, CP1252 on Windows and
> ISO-8859-1 on Linux.
>
> Regards,
> Fraser

It would be great if there were a stack property we could set that would
specify what encoding outputted text would use.
The default could be “native”; i.e., the native encoding for the platform,
but then we could set it to things like “utf8” or “utf16” or “ISO”. It would
essentially do the textEncode/textDecode for us.

Is this an idea that appeals to folks here?

Devin

Devin Asay
Office of Digital Humanities
Brigham Young University
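
For reference, the "encode on the way out, decode on the way in" pattern Fraser describes could be sketched roughly as below. This is only an illustration, not code from the thread: the handler names writeUTF8File and readUTF8File and the parameters pPath and pText are hypothetical, and the sketch assumes LiveCode 7.0 or later, where textEncode and textDecode are available.

```livecode
-- Hypothetical helper handlers; the names are illustrative assumptions.

command writeUTF8File pPath, pText
   -- Turn the text into binary data (UTF-8 bytes) before it leaves LiveCode.
   -- The binfile: scheme writes the bytes as-is, with no further conversion.
   put textEncode(pText, "UTF-8") into url ("binfile:" & pPath)
end writeUTF8File

function readUTF8File pPath
   local tBinary
   -- Read the raw bytes; at this point the data is binary, not text.
   put url ("binfile:" & pPath) into tBinary
   -- Turn the binary data back into a textual string.
   return textDecode(tBinary, "UTF-8")
end readUTF8File
```

A call such as `writeUTF8File tPath, tString` followed by `put readUTF8File(tPath) into tString` should round-trip the text whatever the platform's native encoding is. For UTF-16 data, as in Peter's database question, the same shape applies with "UTF-16" passed as the encoding name.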