AW: AW: AW: Why do I still need MacToISO, when working with UTF-8?
Yep. And no, I haven't tested textEncode(myFile,"CP1252").

Tiemo

-----Original Message-----
From: use-livecode [mailto:use-livecode-boun...@lists.runrev.com] On behalf of Kay C Lan via use-livecode
Sent: Wednesday, 18 January 2017 05:36
To: How to use LiveCode
Cc: Kay C Lan
Subject: Re: AW: AW: Why do I still need MacToISO, when working with UTF-8?

> But isn't that the point of Tiemo's confusion - his scripts broke when
> moving to 7.0! Prior to 7.0 he didn't have to do anything, it all 'just
> worked'. [...]
> I wonder if he replaced macToISO(myFile) with textEncode(myFile,"CP1252")
> he'd get the same result. If so, it may suggest that everything is
> expecting Latin 1, not unicode.

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
Re: AW: AW: Why do I still need MacToISO, when working with UTF-8?
On Tue, Jan 17, 2017 at 1:24 AM, Mark Waddingham via use-livecode wrote:
> However, the 'endpoints' (i.e. where the developer can 'see' encoded text
> output - e.g. when writing to a file, or encoding for a URL) had to remain
> as before otherwise all existing applications using anything other than
> ASCII text would have broken when moving from 6.7 -> 7.0.

But isn't that the point of Tiemo's confusion - his scripts broke when moving to 7.0! Prior to 7.0 he didn't have to do anything, it all 'just worked'. When he moved to 7.0, where 'unicode' was supposed to 'just work' on all platforms, he used textEncode/textDecode to/from UTF8 and it's not working for him (on Mac); instead he's found that macToISO (MacRoman to Latin 1) is working for him, which seems to be a step backwards.

There must be something more hidden in his scripts or PHP. I wonder if he replaced macToISO(myFile) with textEncode(myFile,"CP1252") he'd get the same result. If so, it may suggest that everything is expecting Latin 1, not unicode.
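A small side illustration of the CP1252 question raised above. This is a Python sketch of byte values only, not LiveCode, and it makes no claim about what macToISO or textEncode do internally; the helper name `same_bytes` is made up for the example. It checks that Latin-1 and CP1252 assign the same bytes to the German umlauts and ß, and that they part ways for a character such as the euro sign.

```python
# Compare the bytes Latin-1 and CP1252 produce for German umlauts.
# Illustration only: these are Python codecs, not LiveCode functions.

def same_bytes(text: str, enc_a: str, enc_b: str) -> bool:
    """Return True if text encodes to identical bytes in both encodings.

    Returns False if either encoding cannot represent the text at all.
    """
    try:
        return text.encode(enc_a) == text.encode(enc_b)
    except UnicodeEncodeError:
        return False

umlauts = "äöüÄÖÜß"
umlauts_match = same_bytes(umlauts, "latin-1", "cp1252")
euro_matches = same_bytes("€", "latin-1", "cp1252")  # Latin-1 has no euro sign

if __name__ == "__main__":
    print("umlauts identical:", umlauts_match)
    print("euro identical:", euro_matches)
```

If a server treats incoming bytes as Latin-1, names containing only umlauts would therefore look the same whether the client produced Latin-1 or CP1252 bytes.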
AW: Why do I still need MacToISO, when working with UTF-8?
This is what I originally tested first (see my original post), but it didn't work. The umlauts (coming from a Mac client) entered the MySQL db corrupted. I don't know whether I could configure anything differently in my PHP or my db to make this solution work. But since everything always worked fine with a Windows client, I didn't want to change anything in my PHP or MySQL, and so the MacToISO() conversion on a Mac client is a good solution for my case.

Tiemo

-----Original Message-----
From: use-livecode [mailto:use-livecode-boun...@lists.runrev.com] On behalf of Mark Waddingham via use-livecode
Sent: Monday, 16 January 2017 18:30
To: How to use LiveCode
Cc: Mark Waddingham
Subject: Re: Why do I still need MacToISO, when working with UTF-8?

Hi Matthias,

On 2017-01-16 18:25, Matthias Rebbe via use-livecode wrote:
> It would have been nice if you had also put some sample code for how
> to UTF8 encode the string.
> That would have made your explanations complete. ;)

Sure - here is how I'd slightly adjust Tiemo's code:

    put fld "name" into myName
    -- ...
    open file myFile for binary write
    write textEncode(myName, "utf8") to file myFile
    close file myFile
    -- ...
    open file myFile for binary read
    read from file myFile until EOF
    close file myFile
    put textDecode(it, "utf8") into myName
    -- ...
    put URL ("http://myUser:myPW@myURL" & "mySQL.php?" & URLEncode(textEncode(theName, "utf8"))) into rslt
    -- mySQL.php writes to a MySQL db, where theName column is encoded as "utf8_general_ci"

The missing piece here is PHP configuration on the other side. I'm assuming that PHP is doing the following:

1) URLDecode the part after '?' into bytes
2) Interpret the bytes as Latin-1 encoded text
3) Pass the text string to the appropriate MySQL function
4) The MySQL function converts the text string to UTF-8

This may or may not be the case.

If 'theName' is encoded as UTF8 before being URLEncoded, all that needs to be checked is that the PHP (on the other end) is decoding it into a string as UTF-8 before passing it to MySQL.

Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps
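The four assumed server steps in Mark's message can be walked through at the byte level. The following is a Python sketch rather than LiveCode or PHP, and the function names `client_query` and `server_text` are hypothetical stand-ins for the textEncode + URLEncode step on the client and the decode step on the server. It shows what would happen to a UTF-8 encoded name if the server side did interpret the bytes as Latin-1, which, as Mark says, may or may not be the case.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

def client_query(text: str) -> str:
    """UTF-8 encode the text, then percent-encode the resulting bytes."""
    return quote_from_bytes(text.encode("utf-8"))

def server_text(query: str, assumed_encoding: str) -> str:
    """Percent-decode to bytes, then interpret them in the assumed encoding."""
    return unquote_to_bytes(query).decode(assumed_encoding)

query = client_query("Müller")             # ü becomes the two bytes C3 BC
as_utf8 = server_text(query, "utf-8")      # server agrees with the client
as_latin1 = server_text(query, "latin-1")  # server assumes Latin-1 instead
stored = as_latin1.encode("utf-8")         # assumed step 4: converted to UTF-8 again

if __name__ == "__main__":
    print(query)
    print(as_utf8)
    print(as_latin1)
    print(stored)
```

Under that assumption each UTF-8 byte is treated as its own Latin-1 character and then encoded again, which would be one way for umlauts to arrive corrupted in the database even though the client sent valid UTF-8.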
Re: AW: AW: Why do I still need MacToISO, when working with UTF-8?
Hi Tiemo,

As an additional note (if you don't absolutely need to write binary), it is also possible to use the syntax

    open file <file> for "utf8" text read

or

    open file <file> for "utf8" text write

in which case the engine takes care of encoding/decoding the string using UTF-8. Then, when calling

    write <string> to file <file>

<string> can be a LiveCode string with Unicode characters and will be written as UTF-8, and

    read from file <file> for 1 lines

will set 'it' to the decoded UTF-8 string.

Regards,
Sebastien

On 16/01/2017 17:14, Tiemo Hollmann TB via use-livecode wrote:
> Hi Mark,
> thank you for taking your time and clarifying. I wasn't aware that the
> internal format on a Mac client is MacRoman. I thought it would be a
> "neutral" UTF-8 format.
> Thanks
> Tiemo
> [...]
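The idea in Sebastien's note, that opening a file as UTF-8 text and writing a string is equivalent to opening it as binary and writing explicitly encoded bytes, can be checked outside LiveCode. This is a Python analogy only; the helper names and temporary file names are invented for the sketch, and it does not exercise the LiveCode engine.

```python
import os
import tempfile

def write_text_mode(path: str, text: str) -> None:
    """Let the runtime encode: open as UTF-8 text and write the string."""
    # newline="" disables platform newline translation so bytes stay comparable.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

def write_binary_mode(path: str, text: str) -> None:
    """Encode by hand: open as binary and write UTF-8 bytes."""
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))

def read_bytes(path: str) -> bytes:
    """Return the raw bytes stored in the file."""
    with open(path, "rb") as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "text_mode.txt")
    binary_path = os.path.join(tmp, "binary_mode.txt")
    write_text_mode(text_path, "Müller")
    write_binary_mode(binary_path, "Müller")
    text_bytes = read_bytes(text_path)
    binary_bytes = read_bytes(binary_path)

if __name__ == "__main__":
    print(text_bytes == binary_bytes, text_bytes)
```

Both routes should leave the same bytes on disk; the text-mode route simply moves the encoding step from the script into the runtime.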
Re: AW: AW: Why do I still need MacToISO, when working with UTF-8?
Hi Tiemo,

> thank you for taking your time and clarifying. I wasn't aware that the
> internal format on a Mac client is MacRoman. I thought it would be a
> "neutral" UTF-8 format.

Internally, the engine uses either MacRoman/ISO-Latin1 *or* UTF-16 depending on platform and what the string contains. However, the 'endpoints' (i.e. where the developer can 'see' encoded text output - e.g. when writing to a file, or encoding for a URL) had to remain as before, otherwise all existing applications using anything other than ASCII text would have broken when moving from 6.7 -> 7.0.

You can use the 'utf8' keyword to open UTF-8 encoded files; however, you have to deal with urlEncode manually (which isn't necessarily a bad thing, since your server script determines what the URL-encoded bytes after the '?' mean - NOT LiveCode).

Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps
AW: AW: Why do I still need MacToISO, when working with UTF-8?
Hi Mark,

thank you for taking your time and clarifying. I wasn't aware that the internal format on a Mac client is MacRoman. I thought it would be a "neutral" UTF-8 format.

Thanks
Tiemo

-----Original Message-----
From: use-livecode [mailto:use-livecode-boun...@lists.runrev.com] On behalf of Mark Waddingham via use-livecode
Sent: Monday, 16 January 2017 17:42
To: How to use LiveCode
Cc: Mark Waddingham
Subject: Re: AW: Why do I still need MacToISO, when working with UTF-8?

Hi Tiemo, Okay so, I'm assuming that all this code is running on the Mac client... [...]
Re: AW: Why do I still need MacToISO, when working with UTF-8?
Hi Tiemo,

Okay so, I'm assuming that all this code is running on the Mac client...

> put fld "name" into myName

At this point myName contains a (text) string - thus encoding issues don't exist (you should think of text strings in memory as being stored in an 'encoding neutral' format).

> open file myFile for binary write
> write myName to file myFile
> close file myFile

This piece of code will open a file on disk in the native encoding of the platform - so MacRoman. It will convert the text from the internal encoding to MacRoman on writing. Thus your text file will be a MacRoman encoded text file.

> open file myFile for binary read
> read from file myFile until EOF
> close file myFile
> put it into myName

This piece of code will read from a file on disk and assume that it is in the native encoding of the platform - so, in this case, MacRoman. It will convert the content of the file from that to the internal encoding.

Up to this point - because you saved and loaded the file on the same platform - the content of myName should be as you expect: unchanged.

> if the platform is "MacOS" then put macToISO(theName) into theName

When run on Mac this line will execute and do the following:

1) Convert theName to a binary string - this uses the native platform encoding (MacRoman)
2) Map each byte from the MacRoman code index to the ISO Latin-1 code index

This essentially converts theName from a text string to a binary string encoded in Latin-1.

> put URL ("http://myUser:myPW@myURL" & "mySQL.php?" & URLEncode(theName)) into rslt

This line constructs the URL - it is making the assumption that PHP (at the other end) will interpret the bytes after the '?' as representing Latin-1 encoded text.

> Without macToISO on a Mac client theName enters corrupted in the mySQL db

This is most likely because PHP is defaulting to Latin-1 (ISO 8859-1) as the encoding used in URLEncoded fields in a URL. If you don't do MacToIso, then you will be passing up MacRoman encoded text (URLEncoded) to PHP, which can happily be decoded as Latin-1, but with some chars (such as accented letters) in different places.

What you need to do here is explicitly UTF8 encode theName before passing it to URLEncode, then explicitly decode it as UTF8 on the PHP side (or set a property in PHP which changes the default assumption about URLs - I apologise for not being more accurate here, my knowledge of PHP is a little stale these days!).

Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps
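To make the "different places" point concrete, here is a Python sketch (not LiveCode or PHP) that encodes the same name as MacRoman and as Latin-1, percent-encodes both, and then reads each result back as Latin-1. The function names `url_encode_as` and `read_as_latin1` are hypothetical, and the server-side Latin-1 assumption is the one Mark describes as most likely, not a confirmed fact about Tiemo's setup.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

def url_encode_as(text: str, encoding: str) -> str:
    """Encode text to bytes in the given encoding, then percent-encode them."""
    return quote_from_bytes(text.encode(encoding))

def read_as_latin1(query: str) -> str:
    """Percent-decode to bytes and interpret every byte as Latin-1."""
    return unquote_to_bytes(query).decode("latin-1")

name = "Müller"
mac_query = url_encode_as(name, "mac_roman")   # what a Mac-native endpoint would send
latin_query = url_encode_as(name, "latin-1")   # the bytes a Latin-1 server expects

mac_result = read_as_latin1(mac_query)         # ü arrives as a different character
latin_result = read_as_latin1(latin_query)     # round-trips cleanly

if __name__ == "__main__":
    print(mac_query, latin_query)
    print(repr(mac_result), repr(latin_result))
```

The letter ü sits at byte 9F in MacRoman and at FC in Latin-1, so the same name yields two different query strings, and only the Latin-1 one survives a Latin-1 reading on the other end.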
AW: Why do I still need MacToISO, when working with UTF-8?
Hi Mark,

as simplified pseudo code it looks like this:

    put fld "name" into myName
    -- ...
    open file myFile for binary write
    write myName to file myFile
    close file myFile
    -- ...
    open file myFile for binary read
    read from file myFile until EOF
    close file myFile
    put it into myName
    -- ...
    if the platform is "MacOS" then put macToISO(theName) into theName
    put URL ("http://myUser:myPW@myURL" & "mySQL.php?" & URLEncode(theName)) into rslt
    -- mySQL.php writes to a MySQL db, where theName column is encoded as "utf8_general_ci"
    -- ...

Without macToISO on a Mac client, theName enters the MySQL db corrupted.

Tiemo

-----Original Message-----
From: use-livecode [mailto:use-livecode-boun...@lists.runrev.com] On behalf of Mark Waddingham via use-livecode
Sent: Monday, 16 January 2017 14:45
To: How to use LiveCode
Cc: Mark Waddingham
Subject: Re: Why do I still need MacToISO, when working with UTF-8?

Hi Tiemo,

On 2017-01-16 11:57, Tiemo Hollmann TB via use-livecode wrote:
> Now my German umlauts don't get corrupted in the MySQL db and
> everything is fine, but I would like to understand the technical
> background. Why do I still need MacToISO() in LC 8 on a Mac and, even
> worse, I didn't need it in LC 6 in the same program. What am I
> missing here?

Can you explain (with code examples if possible) what you are calling MacToIso() on, and how you are using its output? It isn't entirely clear exactly what steps you are taking in your previous email, and it sounds like the problem is at the point of LC's communication with PHP, rather than within LC.

Thanks in advance!

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps