Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-21 Thread Marco van de Voort
In our previous episode, Felipe Monteiro de Carvalho said: > > When Marco said "yet another 2.2 fixes branch release", he meant 2.2.6. > > Ah, ok ... =) > > So my comment would then be changed to: > > Unicode is what is most discussed and needed at the moment. What is > the point in making a maj

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-21 Thread Felipe Monteiro de Carvalho
On Fri, Nov 21, 2008 at 7:43 AM, Vincent Snijders <[EMAIL PROTECTED]> wrote: > When Marco said "yet another 2.2 fixes branch release", he meant 2.2.6. Ah, ok ... =) So my comment would then be changed to: Unicode is what is most discussed and needed at the moment. What is the point in making a m

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-21 Thread Vincent Snijders
Felipe Monteiro de Carvalho wrote: On Fri, Nov 21, 2008 at 7:01 AM, Marco van de Voort <[EMAIL PROTECTED]> wrote: Is it? Because that might mean yet another 2.2 fixes branch release to fix up the delay that this will cause to 2.4 Another 2.2 fixes branch release is a good idea, because it co

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-21 Thread Felipe Monteiro de Carvalho
On Fri, Nov 21, 2008 at 7:01 AM, Marco van de Voort <[EMAIL PROTECTED]> wrote: > Is it? Because that might mean yet another 2.2 fixes branch release to fix > up the delay that this will cause to 2.4 Another 2.2 fixes branch release is a good idea, because it contains a fix for static methods which

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-21 Thread Marco van de Voort
In our previous episode, Daniël Mantione said: > >> If you want to help, we need to implement the Delphi 2009 encoding aware > >> string type, both runtime support as well as the compiler support. > > A previous discussion showed that this also breaks a lot of old code and is > > not really nice.

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-21 Thread Daniël Mantione
On Fri, 21 Nov 2008, Marco van de Voort wrote: In our previous episode, Daniël Mantione said: Full Unicode support is for FPC 2.4. If you need it today, widestrings are your best option. Is it? Because that might mean yet another 2.2 fixes branch release to fix up the delay that this will

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-21 Thread Marco van de Voort
In our previous episode, Florian Klaempfl said: > They add it only because they insist on using utf-8 :) That's perfectly normal on *nix. ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/mailman/listinfo/fpc-devel

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-21 Thread Marco van de Voort
In our previous episode, Daniël Mantione said: > Full Unicode support is for FPC 2.4. If you need it today, widestrings are > your best option. Is it? Because that might mean yet another 2.2 fixes branch release to fix up the delay that this will cause to 2.4

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
Zaher Dirkey wrote: I meant TStringList must not make Converting, If it's known that a file is in some encoding and the instance of TStringList uses another one, I suppose LoadFromFile needs to do the re-encoding appropriately. -Michael

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Zaher Dirkey
I meant that TStringList must not do the converting; string conversion must happen outside of TStringList (or through special methods added to it), and without detecting the encoding inside the file in LoadFromFile or LoadFromStream. Detection may use the Seek function on the stream, and that breaks loading from a tcp/ip connection or comp

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Zaher Dirkey
That should be called a conversion, not a hack. It is the same as when you work with the ANSI version of Lazarus/Delphi and then try to load from a Unicode file.

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Vincent Snijders
Graeme Geldenhuys wrote: Hello again, We are seeing more and more "hacks" being applied to projects trying to scramble around the missing FPC feature - no built-in Unicode support. A simple example in Lazarus: loading a UTF-8 encoded file into a TMemo. Normally you would write code as

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Mattias Gärtner
Quoting Graeme Geldenhuys <[EMAIL PROTECTED]>: > On Thu, Nov 20, 2008 at 1:22 PM, peter green <[EMAIL PROTECTED]> wrote: > > > > The thing is we can't reasonably provide functions based on what a user > > would see as a character because doing so would require huge lookup tables > > (one user v

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Jonas Maebe
On 20 Nov 2008, at 13:13, Graeme Geldenhuys wrote: I think basing those functions on code points should suffice. I also think as soon as strings are assigned or loaded from file, they should be normalized. So two code points like the A and Umlaut code points would become one. How would one

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Graeme Geldenhuys
On Thu, Nov 20, 2008 at 1:22 PM, peter green <[EMAIL PROTECTED]> wrote: > > The thing is we can't reasonably provide functions based on what a user > would see as a character because doing so would require huge lookup tables > (one user visible character != one code point) so the best we can do is

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread listmember
Ok, two questions for the example above: - how do you maintain backward compatibility? - how do you load a plain old ansi file? You could alter the LoadFromFile(), LoadFromStream(), SaveToFile(), SaveToStream() routines like below: procedure TStringList.LoadFromFile(AFileName: TFilename; cons

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
type cp850string=ansistring(CP_850); utf8string=ansistring(CP_UTF8); Why not use the current locale for this ? Would that be just ANSIString ? a:=b; {Compiler knows conversion to perform at compile time. I suppose the conversion function is provided with the locale and this it as

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
The thing is we can't reasonably provide functions based on what a user would see as a character because doing so would require huge lookup tables (one user visible character != one code point) so the best we can do is code point based which isn't really much better for most tasks than code

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
For best backward compatibility, I would say Copy, Length, Pos etc should be "character based" by default. Agreed. Then introduce more optimised versions like ElementCopy, ElementLength, etc... Old programs will work out of the box, but might experience a minor speed penalty, until the

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread peter green
For best backward compatibility, I would say Copy, Length, Pos etc should be "character based" by default. The thing is we can't reasonably provide functions based on what a user would see as a character because doing so would require huge lookup tables (one user visible character != one
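
The gap between bytes, code points and user-visible characters can be sketched in a few lines of Free Pascal. This is an illustration, not code from the thread: it assumes the `UTF8Decode` function from the System unit and writes the UTF-8 bytes as numeric character constants so that the result does not depend on the encoding of the source file.

```pascal
program CodePointsVsCharacters;
{$mode objfpc}{$H+}

var
  Composed, Decomposed: AnsiString;
begin
  // U+00E4 (a with diaeresis) as one precomposed code point:
  // UTF-8 bytes C3 A4.
  Composed := #$C3#$A4;
  // The same user-visible character written as 'a' followed by
  // U+0308 COMBINING DIAERESIS: UTF-8 bytes 61 CC 88, two code points.
  Decomposed := 'a'#$CC#$88;

  // Length on an AnsiString counts bytes (string elements).
  WriteLn('bytes:           ', Length(Composed), ' and ', Length(Decomposed));

  // After decoding to UTF-16, Length counts 16-bit elements. Both
  // characters are in the Basic Multilingual Plane, so here that is
  // also the number of code points.
  WriteLn('UTF-16 elements: ', Length(UTF8Decode(Composed)), ' and ',
          Length(UTF8Decode(Decomposed)));
end.
```

A reader sees one character in both cases, yet the decomposed form is more than one code point. Counting user-visible characters would need the Unicode character data tables that the post refers to.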

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Daniël Mantione
On Thu, 20 Nov 2008, Michael Schnell wrote: Isn't this the same?? I understand that D2009 uses dynamic code information, while my suggestion is based on several different (static) types. As I understand it is static. type cp850string=ansistring(CP_850); utf8string=ansistring(
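
For readers who want to try the statically typed code page strings discussed here, a minimal sketch follows. It is not the syntax quoted in the post: it assumes a later compiler (FPC 3.0 or newer) that accepts the Delphi 2009 style `type AnsiString(codepage)` declaration, and the type name `TCp850String` is made up for the example. On Unix the conversion is assumed to need a widestring manager, so `cwstring` is pulled in there.

```pascal
program CodePageTypes;
{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cwstring;  // assumption: supplies the code page conversion on Unix
{$endif}

type
  // Hypothetical name; 850 is the numeric identifier of code page 850.
  TCp850String = type AnsiString(850);

var
  A: TCp850String;
  B: UTF8String;  // predeclared as a UTF-8 tagged AnsiString
begin
  B := 'Hello';
  // Both declared code pages are known at compile time, so the
  // compiler can insert the UTF-8 to CP850 conversion for this assignment.
  A := B;

  // StringCodePage reports the code page stored with each string.
  WriteLn('A: ', StringCodePage(A), '  B: ', StringCodePage(B));
end.
```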

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Graeme Geldenhuys
On Thu, Nov 20, 2008 at 12:55 PM, Michael Schnell <[EMAIL PROTECTED]> wrote: >>> * What about usage like: SomeString[x] := 'A'; >> >> String element based. > > This also holds for Copy, Length, Pos, etc. > > I think it would be a good idea to provide dedicated functions for the > "element based"

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
UCS16 UTF16 :) -Michael

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
Isn't this the same?? I understand that D2009 uses dynamic code information, while my suggestion is based on several different (static) types. I feel that static types are a lot easier to implement and if using them correctly, the user can tune the program to be as fast as possible or as

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
System encoding is the encoding your files are written in when doing a "echo Hello > file.txt". nice point :) I suppose with my German WinXP the system encoding is German ANSI. Does it hold only for files? I suppose WinXP provides an OS API with WideStrings (supposedly UCS16). But how do I ha

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
* Copy, Length, Pos etc...? Yup. * What about usage like: SomeString[x] := 'A'; String element based. This also holds for Copy, Length, Pos, etc. I think it would be a good idea to provide dedicated functions for the "element based" (fast) and the "character based" (old style compa

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Graeme Geldenhuys
On Thu, Nov 20, 2008 at 12:50 PM, Daniël Mantione <[EMAIL PROTECTED]> wrote: >> >> What is "system encoding" regarding different OS, locale, ... ? > > System encoding is the encoding your files are written in when doing a > "echo Hello > file.txt". Good explanation Daniël. :-) I always wonder tha

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Daniël Mantione
On Thu, 20 Nov 2008, Michael Schnell wrote: The file is assumed to be in system encoding (which can be UTF-8). Support for reading of other encodings has not been decided on yet and is not part of the initial plan. What is "system encoding" regarding different OS, locale, ... ? S
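
To see what a given machine treats as its system encoding, a short program along these lines may help. It is a sketch under assumptions, not part of the 2008 discussion: it relies on the `DefaultSystemCodePage` variable that the System unit provides in FPC 3.0 and later, and on Unix it assumes a widestring manager (`cwstring`) is needed for the value to reflect the locale.

```pascal
program ShowSystemCodePage;
{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cwstring;  // assumption: lets the RTL pick up the locale's code page
{$endif}

begin
  // Prints the numeric code page identifier the RTL uses for plain
  // AnsiString data. The value depends on the OS and locale settings.
  WriteLn('Default system code page: ', DefaultSystemCodePage);
end.
```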

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
The file is assumed to be in system encoding (which can be UTF-8). Support for reading of other encodings has not been decided on yet and is not part of the initial plan. What is "system encoding" regarding different OS, locale, ... ? -Michael

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Daniël Mantione
On Thu, 20 Nov 2008, Michael Schnell wrote: If you want to help, we need to implement the Delphi 2009 encoding aware string type, both runtime support as well as the compiler support. A previous discussion showed that this also breaks a lot of old code and is not really nice. As I und

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
If you want to help, we need to implement the Delphi 2009 encoding aware string type, both runtime support as well as the compiler support. A previous discussion showed that this also breaks a lot of old code and is not really nice. So a better concept seems to have a dedicated type for any

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Graeme Geldenhuys
On Thu, Nov 20, 2008 at 12:07 PM, Michael Schnell <[EMAIL PROTECTED]> wrote: > >> Russian locale requires a >1 byte char. > > Hmmm. We did lots of non-Unicode Delphi programs with a Russian ANSI > variant. Well, I have a Russian user of fpGUI. He noted quite a few issues with FPC's locale variab

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Daniël Mantione
On Thu, 20 Nov 2008, Daniël Mantione wrote: * Does UnicodeString work on all platforms? Linux, Windows for a start? Yes, but all platforms will get string=unicodestring. There is a "not" missing: Yes, but not all platforms will get string=unicodestring. Daniël

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
Full Unicode support is for FPC 2.4. If you need it today, widestrings are your best option. Unfortunately working with WideString in Lazarus is close to impossible as the LCL API is done with UTF8String and there is no correct automatic conversion between UTF8String and WideString, as the com

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Daniël Mantione
On Thu, 20 Nov 2008, Graeme Geldenhuys wrote: On Thu, Nov 20, 2008 at 11:37 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: FPC supports Unicode, in 2.3.x is the UnicodeString type available being a ref. counted utf-16 string on all platforms. OK, I'll try to switch fpGUI's TfpgString ty

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
Russian locale requires a >1 byte char. Hmmm. We did lots of non-Unicode Delphi programs with a Russian ANSI variant. -Michael

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Michael Schnell
FPC supports Unicode, in 2.3.x is the UnicodeString type available being a ref. counted utf-16 string on all platforms. Is the same used by TStringList? I don't think so, otherwise LoadFromFile would need to be aware of several possible file encodings. And I suppose the utf8-API of the LCL w

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Graeme Geldenhuys
On Thu, Nov 20, 2008 at 11:37 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: > > FPC supports Unicode, in 2.3.x is the UnicodeString type available being a > ref. counted utf-16 string on all platforms. OK, I'll try to switch fpGUI's TfpgString type to alias UnicodeString and see what happens. Obv
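
One consequence of a UTF-16 string type is that one string element is not always one code point. The sketch below illustrates this with a character outside the Basic Multilingual Plane. It is not code from the thread and assumes only the `UnicodeString` type and the `UTF8Encode` function from the System unit.

```pascal
program Utf16Elements;
{$mode objfpc}{$H+}

var
  U: UnicodeString;
begin
  // U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual
  // Plane, so UTF-16 stores it as the surrogate pair D834 DD1E.
  SetLength(U, 2);
  U[1] := WideChar($D834);
  U[2] := WideChar($DD1E);

  // Length counts 16-bit string elements, not code points.
  WriteLn('UTF-16 elements: ', Length(U));

  // The same single code point occupies four bytes in UTF-8.
  WriteLn('UTF-8 bytes:     ', Length(UTF8Encode(U)));
end.
```

Code that indexes such a string with `U[x]` is element based in the same sense as byte indexing into a UTF-8 string, only with fewer characters affected.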

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Aleksa Todorovic
On Thu, Nov 20, 2008 at 10:06, Graeme Geldenhuys <[EMAIL PROTECTED]> wrote: > > Unfortunately that doesn't work if the file contains unicode content, > so the following "hack" is required which is quite nasty: > > ls := TStringList.Create; > ls.LoadFromFile('someunicodefile.txt'); > for i :=

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Daniël Mantione
On Thu, 20 Nov 2008, Graeme Geldenhuys wrote: On Thu, Nov 20, 2008 at 11:28 AM, Daniël Mantione <[EMAIL PROTECTED]> wrote: These instructions are highly unproductive. Work on being able to compile the RTL in either ansi/unicode depending on the platform has started. Full Unicode support is

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread petr . kristan
On Thu, Nov 20, 2008 at 10:39:00AM +0100, Daniël Mantione wrote: > > > On Thu, 20 Nov 2008, Graeme Geldenhuys wrote: > >> On Thu, Nov 20, 2008 at 11:12 AM, Florian Klaempfl >> <[EMAIL PROTECTED]> wrote: >>> >>> Ok, two questions for the example above: >>> - how do you maintain backward compatibil

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Graeme Geldenhuys
On Thu, Nov 20, 2008 at 11:28 AM, Daniël Mantione <[EMAIL PROTECTED]> wrote: > > These instructions are highly unproductive. Work on being able to compile > the RTL in either ansi/unicode depending on the platform has started. > Full Unicode support is for FPC 2.4. Well, that's the first I heard o

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Daniël Mantione
On Thu, 20 Nov 2008, Graeme Geldenhuys wrote: On Thu, Nov 20, 2008 at 11:12 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: Ok, two questions for the example above: - how do you maintain backward compatibility? - how do you load a plain old ansi file? If the file is UTF-8 or ANSI, the a

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Florian Klaempfl
Graeme Geldenhuys wrote: On Thu, Nov 20, 2008 at 11:12 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: Ok, two questions for the example above: - how do you maintain backward compatibility? - how do you load a plain old ansi file? If the file is UTF-8 or ANSI, the above should work. UTF-8

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Graeme Geldenhuys
On Thu, Nov 20, 2008 at 11:12 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: > > Ok, two questions for the example above: > - how do you maintain backward compatibility? > - how do you load a plain old ansi file? If the file is UTF-8 or ANSI, the above should work. UTF-8 was designed to be back
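
Whether a file reads the same as UTF-8 and as a legacy single-byte encoding can be checked directly: byte values 0 to 127 mean the same thing in UTF-8 and in ASCII, so only text with higher bytes is affected. The helper below is a sketch written for this illustration; the name `IsPureAscii` is not an RTL function.

```pascal
program AsciiCheck;
{$mode objfpc}{$H+}

// Returns True when every byte of S is 7-bit ASCII. Such text has the
// same bytes in UTF-8 and in the ASCII-compatible ANSI code pages.
function IsPureAscii(const S: AnsiString): Boolean;
var
  I: Integer;
begin
  Result := True;
  for I := 1 to Length(S) do
    if Ord(S[I]) > 127 then
      Exit(False);
end;

begin
  WriteLn(IsPureAscii('Hello'));
  // Contains the UTF-8 bytes C3 BC (u with diaeresis), so the bytes
  // would be read differently under an ANSI code page.
  WriteLn(IsPureAscii('Gr'#$C3#$BC'n'));
end.
```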

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Daniël Mantione
On Thu, 20 Nov 2008, Graeme Geldenhuys wrote: All that crap just to load a simple text file that contains unicode content!!! :-( And the other problem is that the hack above assumes the file's content is UTF-8 encoded. If the content is UTF-16 encoded, you need yet another hack. :-( As far a

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Florian Klaempfl
Graeme Geldenhuys wrote: Hello again, We are seeing more and more "hacks" being applied to projects trying to scramble around the missing FPC feature - no built-in Unicode support. A simple example in Lazarus: loading a UTF-8 encoded file into a TMemo. Normally you would write code as

Re: [fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread dmitry boyarintsev
shorter (and faster) hacky crap: ls := TStringList.Create; ls.LoadFromFile('someunicodefile.txt'); Memo.Text := UTF8Encode(ls.Text); ls.Free

[fpc-devel] Unicode support - for the 20th time... ;-)

2008-11-20 Thread Graeme Geldenhuys
Hello again, We are seeing more and more "hacks" being applied to projects trying to scramble around the missing FPC feature - no built-in Unicode support. A simple example in Lazarus: loading a UTF-8 encoded file into a TMemo. Normally you would write code as follows (for ANSI text):