Re: RE : [fpc-pascal] JSON and UTF8
On 10/7/2012 23:19, waldo kitty wrote:
> On 7/10/2012 07:00, Luiz Americo Pereira Camara wrote:
>> With the old behavior, on a system whose system code page is UTF8, if I try to show the parsed value of \u4E01 in e.g. an LCL app, I will get garbage. I would expect it to work correctly in any environment.
>
> this means that some environments will end up with garbage for those UTF-8 characters that cannot be translated back to the local codepage... i've been running headlong into this with another project and needing to convert from UTF-8 back to at least CP437... there are more than 255 characters in UTF-8 and there's no way i know of to translate them all back to 255 characters... even with trying to use multiples like ae for æ (alt-145 in CP437, i think, realizing that this editor can do whatever it wants to :/ )... the doublet and the character i typed are the ones i was thinking of for this example, though...

With the previous behavior (conversion from UTF16 to the system code page) you will get a meaningless character anyway, i.e., those Unicode characters are not translated correctly to the system code page, since that is impossible.

BTW: the original issue is already fixed. Thanks, Michael.

Luiz
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
RE : RE : [fpc-pascal] JSON and UTF8
> Because the old behaviour is not buggy. It simply did not support Unicode, and does the next best thing, in casu: it transforms to the system codepage. A car without ABS and SAT-Nav is not buggy. It just doesn't support features which are nowadays called standard. You can perfectly drive it.

An old car without a seat belt is buggy, although when it was built seat belts were not standard.

> Just as Unicode is nowadays standard, N years ago, it was not. Since backwards compatibility is very important, we must offer the option.

The JSON standard specifies Unicode. Why have UTF8 default off? Old cars are retro-fitted with seat belts. Why sell new cars with seat belts as an option?

Ludo
Re: RE : RE : [fpc-pascal] JSON and UTF8
On Tue, 10 Jul 2012, Ludo Brands wrote:
>> Because the old behaviour is not buggy. It simply did not support Unicode, and does the next best thing, in casu: it transforms to the system codepage. A car without ABS and SAT-Nav is not buggy. It just doesn't support features which are nowadays called standard. You can perfectly drive it.
>
> An old car without a seat belt is buggy, although when it was built seat belts were not standard.

I still wouldn't call the car buggy. For me a car must be able to drive. That is its essential function. If it doesn't start, or stops every 5 minutes, then it is buggy. All the rest are options. Whether they are legally required or not is irrelevant to the car-ness of the car. But I drive a motorcycle, so the point is moot ;-)

>> Just as Unicode is nowadays standard, N years ago, it was not. Since backwards compatibility is very important, we must offer the option.
>
> The JSON standard specifies Unicode. Why have UTF8 default off?

For backwards compatibility by default.

> Old cars are retro-fitted with seat belts. Why sell new cars with seat belts as an option?

So you'd reverse the constructor boolean argument to specify UTF8 as default, and let the user choose the old behaviour if he needs it?

Michael.
RE : RE : RE : [fpc-pascal] JSON and UTF8
> So you'd reverse the constructor boolean argument to specify UTF8 as default, and let the user choose the old behaviour if he needs it?

If that is unthinkable, then define new constructors TJSONParser.Create2(...,AUseUTF8 : Boolean = True) or Create2(...,AUseUTF8 : Boolean = True) and mark the Create(...) as deprecated. Or deprecate TJSONParser and make it a descendant of TJSONParser2 that has Create(...,AUseUTF8 : Boolean = True) constructors.

Ludo
Re: RE : RE : RE : [fpc-pascal] JSON and UTF8
On Tue, 10 Jul 2012, Ludo Brands wrote:
>> So you'd reverse the constructor boolean argument to specify UTF8 as default, and let the user choose the old behaviour if he needs it?
>
> If that is unthinkable, then define new constructors TJSONParser.Create2(...,AUseUTF8 : Boolean = True) or Create2(...,AUseUTF8 : Boolean = True) and mark the Create(...) as deprecated. Or deprecate TJSONParser and make it a descendant of TJSONParser2 that has Create(...,AUseUTF8 : Boolean = True) constructors.

Nothing is unthinkable. The other constructs are very ugly. I reversed the argument default value to True. UTF8 is now the default.

Michael.
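[Editor's sketch] The change described above, a Boolean constructor argument whose default flips to True, can be illustrated roughly as follows. TDemoParser is a hypothetical stand-in written for this example and is not the real fpjson TJSONParser; only the parameter name AUseUTF8 and its default of True come from the thread.

```pascal
program DefaultFlagDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical stand-in class; NOT the real fpjson TJSONParser.
  TDemoParser = class
  private
    FSource: string;
    FUseUTF8: Boolean;
  public
    // Default is True, so callers that omit the argument get UTF-8.
    constructor Create(const ASource: string; AUseUTF8: Boolean = True);
    property UseUTF8: Boolean read FUseUTF8;
  end;

constructor TDemoParser.Create(const ASource: string; AUseUTF8: Boolean);
begin
  inherited Create;
  FSource := ASource;
  FUseUTF8 := AUseUTF8;
end;

var
  P: TDemoParser;
begin
  // Existing call sites that pass one argument now pick up UTF-8.
  P := TDemoParser.Create('{"a": "\u4E01"}');
  WriteLn('default: UseUTF8 = ', P.UseUTF8);
  P.Free;

  // Code that depends on the old behaviour opts out explicitly.
  P := TDemoParser.Create('{"a": "\u4E01"}', False);
  WriteLn('opt-out: UseUTF8 = ', P.UseUTF8);
  P.Free;
end.
```

The design trade-off is the one debated in the thread: existing single-argument call sites silently change behaviour, but no second constructor or second class is needed.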
RE : RE : RE : RE : [fpc-pascal] JSON and UTF8
> Nothing is unthinkable. The other constructs are very ugly. I reversed the argument default value to True. UTF8 is now the default.

A very wise decision ;)

Ludo
Re: RE : RE : RE : RE : [fpc-pascal] JSON and UTF8
On Tue, 10 Jul 2012, Ludo Brands wrote:
>> Nothing is unthinkable. The other constructs are very ugly. I reversed the argument default value to True. UTF8 is now the default.
>
> A very wise decision ;)

I must be getting older :)

Seriously: when JSON support was written, Unicode/UTF8 support in FPC was sketchy at best. It makes sense to change it, but backward compatibility must be preserved as much as possible.

I have added a section to http://wiki.freepascal.org/index.php?title=User_Changes_Trunk to notify people of this change.

Michael.
Re: RE : RE : RE : RE : [fpc-pascal] JSON and UTF8
On 10 Jul 2012, at 11:40, Michael Van Canneyt wrote:
> I have added a section to http://wiki.freepascal.org/index.php?title=User_Changes_Trunk to notify people of this change.

Could you change it to use the same format/template as the other entries?

Jonas
Re: RE : RE : RE : RE : [fpc-pascal] JSON and UTF8
In our previous episode, Michael Van Canneyt said:
>>> http://wiki.freepascal.org/index.php?title=User_Changes_Trunk to notify people of this change.
>>
>> Could you change it to use the same format/template as the other entries?
>
> We can always trust Jonas to notice such things :-) Done.

Jonas's remark reminded me of something else. Delphi uses TEncoding everywhere as a parameter (especially in load/savetofile/stream-like methods) to select the codepage. Maybe do that here too?
Re: RE : [fpc-pascal] JSON and UTF8
On 10/7/2012 04:32, Ludo Brands wrote:
>> Following up on bug 22310 http://bugs.freepascal.org/view.php?id=22310 I enabled the use of UTF8 in the FPC JSON support. The constructors of the JSON parser/scanner now accept an extra argument UseUTF8 which tells them to convert JSON strings to UTF8, not the system codepage.
>
> I don't understand why you want to keep buggy behavior for backwards compatibility. Let me explain: StringToJSONString doesn't do any character conversion except for some special characters. The JSON spec, RFC 4627 par. 3, says: JSON text SHALL be encoded in Unicode. The default encoding is UTF-8. Since no conversion is done, logic says that fpjson expects Unicode. Since TJSONStringType = AnsiString, UTF-8 is the only Unicode encoding possible. So I don't understand why a TJSONStringType has to be UTF-8 when written, to be compliant with the spec and with the outside world, while the same string is converted to the system encoding when read back. If system encoding support is needed, then StringToJSONString should also do a system-to-UTF-8 conversion.

Agree with Ludo.

Luiz
Re: RE : [fpc-pascal] JSON and UTF8
On 10/7/2012 04:59, Michael Van Canneyt wrote:
> On Tue, 10 Jul 2012, Ludo Brands wrote:
>>> Following up on bug 22310 http://bugs.freepascal.org/view.php?id=22310 I enabled the use of UTF8 in the FPC JSON support. The constructors of the JSON parser/scanner now accept an extra argument UseUTF8 which tells them to convert JSON strings to UTF8, not the system codepage.
>>
>> I don't understand why you want to keep buggy behavior for backwards compatibility.
>
> Because the old behaviour is not buggy.

It is. With the old behavior, on a system whose system code page is UTF8, if I try to show the parsed value of \u4E01 in e.g. an LCL app, I will get garbage. I would expect it to work correctly in any environment.

Luiz
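[Editor's sketch] For reference, the code point U+4E01 used in the example above occupies three bytes in UTF-8. A minimal way to see this with the RTL's UTF8Encode, assuming an FPC version with UnicodeString support; the program name and variable names are made up for illustration:

```pascal
program Utf8BytesDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

var
  U: UnicodeString;
  S: UTF8String;
  I: Integer;
begin
  // U+4E01, the code point written as \u4E01 in the JSON text.
  U := WideChar($4E01);
  // Convert the single UTF-16 code unit to its UTF-8 byte sequence.
  S := UTF8Encode(U);
  // Dump each byte as two hex digits.
  for I := 1 to Length(S) do
    Write(IntToHex(Ord(S[I]), 2), ' ');
  WriteLn;  // prints: E4 B8 81
end.
```

A single-byte system code page has no slot for this character, which is why a conversion to such a code page cannot round-trip it.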
Re: RE : [fpc-pascal] JSON and UTF8
On 7/10/2012 07:00, Luiz Americo Pereira Camara wrote:
> With the old behavior, on a system whose system code page is UTF8, if I try to show the parsed value of \u4E01 in e.g. an LCL app, I will get garbage. I would expect it to work correctly in any environment.

this means that some environments will end up with garbage for those UTF-8 characters that cannot be translated back to the local codepage... i've been running headlong into this with another project and needing to convert from UTF-8 back to at least CP437... there are more than 255 characters in UTF-8 and there's no way i know of to translate them all back to 255 characters... even with trying to use multiples like ae for æ (alt-145 in CP437, i think, realizing that this editor can do whatever it wants to :/ )... the doublet and the character i typed are the ones i was thinking of for this example, though...
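[Editor's sketch] The "multiples like ae for æ" approach mentioned above amounts to a hand-written fallback table. The following is only an illustration under simplifying assumptions: it targets 7-bit ASCII rather than CP437, knows just two substitutions, and the function name ToAsciiWithFallback is hypothetical.

```pascal
program FallbackDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: reduce a UTF-16 string to 7-bit ASCII.
// Known characters get a multi-letter substitute; everything else
// outside ASCII becomes '?', which is where information is lost.
function ToAsciiWithFallback(const U: UnicodeString): AnsiString;
var
  I: Integer;
  C: Word;
begin
  Result := '';
  for I := 1 to Length(U) do
  begin
    C := Ord(U[I]);
    if C < 128 then
      Result := Result + AnsiChar(Byte(C))
    else
      case C of
        $00E6: Result := Result + 'ae';  // æ
        $00C6: Result := Result + 'AE';  // Æ
      else
        Result := Result + '?';          // no substitute known
      end;
  end;
end;

var
  U: UnicodeString;
begin
  // Build the test string from code points to avoid depending on
  // the encoding of this source file.
  U := 'J';
  U := U + WideChar($00E6);
  U := U + 'ger ';
  U := U + WideChar($4E01);
  WriteLn(ToAsciiWithFallback(U));
end.
```

Such a table can only ever cover a small part of Unicode, which is the limitation the message describes.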