Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Michael Van Canneyt via fpc-pascal
On Tue, 9 Mar 2021, Graeme Geldenhuys via fpc-pascal wrote: On 08/03/2021 2:49 pm, Michael Van Canneyt via fpc-pascal wrote: In that sense, unicode conversion support is something optional and so we require you to enable it explicitly, since enabling it has some drawbacks: Surely if you

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Sven Barth via fpc-pascal
Graeme Geldenhuys via fpc-pascal wrote on Tue, 9 Mar 2021, 00:56: > > On 07/03/2021 5:48 pm, Nikolay Nikolov via fpc-pascal wrote: > > It depends on what you mean by "just working". > > No, "just worked" is exactly what it says on the tin. It is FPC that is > overcomplicating matters. > > > As

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Nikolay Nikolov via fpc-pascal
On 3/9/21 2:18 AM, Graeme Geldenhuys via fpc-pascal wrote: On 08/03/2021 7:49 pm, Jonas Maebe via fpc-pascal wrote: It's not possible to safely use unicodestring without knowing how 16bit unicode works. The compiler can't solve that. I disagree. Java does just that! The issue is the

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Martin Frb via fpc-pascal
On 08/03/2021 23:26, Tomas Hajny via fpc-pascal wrote: On 2021-03-08 21:36, Martin Frb via fpc-pascal wrote: I can think of 2 groups already. 1) Conversion due to explicitly declared different encoding. AnAnsiString := SomeWideString; AnAsciiString := AnUtf8String; // declared as "type

Re: [fpc-pascal] Cannot write datetime field on sqlite3 database on ARM

2021-03-08 Thread Toru Takubo via fpc-pascal
On 2021/03/08 16:54, Michael Van Canneyt via fpc-pascal wrote: On Mon, 8 Mar 2021, Toru Takubo via fpc-pascal wrote: Hi, I am developing my app on Windows and building apps for other platforms using a cross compiler. Now I have a problem that occurs only on Linux ARM. The problem is that it

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Graeme Geldenhuys via fpc-pascal
On 08/03/2021 7:49 pm, Jonas Maebe via fpc-pascal wrote: > It's not possible to safely use unicodestring without > knowing how 16bit unicode works. The compiler can't solve that. I disagree. Java does just that! The issue is the assumption of using array indexing into a string. I guess

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Graeme Geldenhuys via fpc-pascal
On 08/03/2021 2:49 pm, Michael Van Canneyt via fpc-pascal wrote: > In that sense, unicode conversion support is something optional and so we > require you to enable it explicitly, since enabling it has some drawbacks: Surely if you explicitly use the UnicodeString type, the compiler should know

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Graeme Geldenhuys via fpc-pascal
On 07/03/2021 5:48 pm, Nikolay Nikolov via fpc-pascal wrote: > It depends on what you mean by "just working". No, "just worked" is exactly what it says on the tin. It is FPC that is overcomplicating matters. As an example, here is Java that also uses UTF-16 encoding, just like FPC's UnicodeString

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Tomas Hajny via fpc-pascal
On 2021-03-08 21:36, Martin Frb via fpc-pascal wrote: . . In the example the index access should have returned a single codeunit, which was known to be a complete codepoint. As far as I understand the unexpected part was, that the unicode string did not contain the content of the string

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Michael Van Canneyt via fpc-pascal
On Mon, 8 Mar 2021, Martin Frb via fpc-pascal wrote: Obviously knowing the presence/absence of a widestring manager allows refining the warnings. It does not. The compiler has no way to know if the widestring manager actually does a complete or even a good job. Maybe it just does logging and

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Martin Frb via fpc-pascal
On 08/03/2021 20:49, Jonas Maebe via fpc-pascal wrote: On 08/03/2021 19:16, Ryan Joseph via fpc-pascal wrote: I agree it would be nice to have some warning that indexing the unicodeString wouldn't work as expected. Then the compiler would have to give a warning for any indexing of

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Jonas Maebe via fpc-pascal
On 08/03/2021 19:16, Ryan Joseph via fpc-pascal wrote: > I agree it would be nice to have some warning that indexing the unicodeString > wouldn't work as expected. Then the compiler would have to give a warning for any indexing of unicodestring. That would render it useless, because everyone

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Ryan Joseph via fpc-pascal
So I was indeed able to solve the problem using {$codepage utf8} and using the CWString unit. Does this do anything besides change the backend of the UnicodeString/UnicodeChar type? I'm using other string types in that unit and I'm curious if I've put some kind of performance burden on the other

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Michael Van Canneyt via fpc-pascal
On Mon, 8 Mar 2021, Tomas Hajny via fpc-pascal wrote: On 2021-03-08 15:49, Michael Van Canneyt via fpc-pascal wrote: On Mon, 8 Mar 2021, Adriaan van Os via fpc-pascal wrote: Michael Van Canneyt via fpc-pascal wrote: You didn't configure your environment to deal correctly with Unicode.

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Michael Van Canneyt via fpc-pascal
On Mon, 8 Mar 2021, Adriaan van Os via fpc-pascal wrote: Michael Van Canneyt wrote: The output for me is the same, regardless of the -FcUTF-8 flag being present or not: question marks. But if I add uses cwstring; all will be well. Rationale: Without that, the RTL cannot convert

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Adriaan van Os via fpc-pascal
Michael Van Canneyt wrote: The output for me is the same, regardless of the -FcUTF-8 flag being present or not: question marks. But if I add uses cwstring; all will be well. Rationale: Without that, the RTL cannot convert whatever the compiler wrote in the binary to UTF8 to display it on

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Tomas Hajny via fpc-pascal
On 2021-03-08 11:59, Adriaan van Os via fpc-pascal wrote: Hi, adriaan% cat uniquizz-utf8.pas {$codepage utf8} program uniquizz; var chars: UnicodeString; begin chars := '⌘ key'; writeln(chars); writeln(chars[1]); writeln( 'size ', sizeOf( chars)); writeln( 'length ', length(

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Michael Van Canneyt via fpc-pascal
On Mon, 8 Mar 2021, Adriaan van Os via fpc-pascal wrote: adriaan% cat uniquizz-utf8.pas {$codepage utf8} program uniquizz; var chars: UnicodeString; begin chars := '⌘ key'; writeln(chars); writeln(chars[1]); writeln( 'size ', sizeOf( chars)); writeln( 'length ', length( chars));

Re: [fpc-pascal] Unicode chars losing information

2021-03-08 Thread Adriaan van Os via fpc-pascal
adriaan% cat uniquizz-utf8.pas
{$codepage utf8}
program uniquizz;
var
  chars: UnicodeString;
begin
  chars := '⌘ key';
  writeln(chars);
  writeln(chars[1]);
  writeln( 'size ', sizeOf( chars));
  writeln( 'length ', length( chars));
end.
adriaan% fpc uniquizz-utf8.pas -FcUTF-8
Free Pascal