Re: [fpc-devel] UTF8 RTL

2014-11-26 Thread Michael Schnell
On 11/24/2014 10:45 AM, Michael Schnell wrote: . I'll post the wiki paper on this tomorrow. Please see a preliminary version of the text -> http://wiki.freepascal.org/not_Delphi_compatible_enhancement_for_Unicode_Support#Three_more_RAW_types . (Please use a new thread for any discussion on th

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Mattias Gaertner
On Mon, 24 Nov 2014 15:25:00 +0100 Jonas Maebe wrote: >[...] > Probably, yes. However: > * since we're close to branching/releasing 2.8, I am not very much in > favour of still modifying core RTL routines like this for inclusion in > 2.8 (this also goes for the defaultformatsettings related p

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Jonas Maebe
Mattias Gaertner wrote on Mon, 24 Nov 2014: On Sun, 23 Nov 2014 17:42:06 +0100 (CET) At the moment uuchar ParamStr only contains a typecast: if (Param=0) then Paramstr:=System.Paramstr(0) else if (Param>0) and (Param [...] Probably, yes. However: * since we're close to branching/releas

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Mattias Gaertner
On Sun, 23 Nov 2014 17:42:06 +0100 (CET) mar...@stack.nl (Marco van de Voort) wrote: >[...] > And the 2-byte unicode version exists, in unit uuchar. (the "objpas" of > $mode delphiunicode). For now, simply make a utf8 wrapper that returns an > utf8string. At the moment uuchar ParamStr only cont

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Michael Schnell
On 11/24/2014 10:41 AM, Michael Schnell wrote: Maybe a way to allow the user to define the project-wide default encoding branding for the unqualified type "String" can be invented (e.g. OS-Default if not explicitly set) Obviously this asks for a versatile String type (not existing in Delphi

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Mattias Gaertner
On Sun, 23 Nov 2014 18:57:33 +0100 Jonas Maebe wrote: > On 23/11/14 17:23, Mattias Gaertner wrote: >[...] > Maybe we need another Default*CodePage variable which indicates the > "real" system code page... +1 > > I also want to update the UTF-8 wiki pages. For example "If you use the > > then

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Michael Schnell
On 11/24/2014 10:26 AM, Michael Van Canneyt wrote: How do you reconcile this with the fact that pascal should be easy, and it should be usable for teaching ? .. Let's not get carried away, please... keep it simple. I do know that this is a decent argument That is why I put my text in bra

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Michael Van Canneyt
On Mon, 24 Nov 2014, Michael Schnell wrote: On 11/23/2014 05:28 PM, Marco van de Voort wrote: I meant in the way Mattias proposed, continuing making the default "string" type utf8 on Windows. As with Windows, the OS requires API access with UTF-16 encoded strings this would force lots of au

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Michael Schnell
On 11/23/2014 05:28 PM, Marco van de Voort wrote: I meant in the way Mattias proposed, continuing making the default "string" type utf8 on Windows. As with Windows, the OS requires API access with UTF-16 encoded strings this would force lots of auto-conversions, not only in the RTL but with an

Re: [fpc-devel] UTF8 RTL

2014-11-24 Thread Michael Schnell
On 11/23/2014 04:31 PM, Michael Van Canneyt wrote: What about "RTL with UTF8 as default"? I am nearly done with the would-be "designdocument" Wiki text, you seemed to want to see done on a suggestion for an extension of the string type variants, that allow for e.g. (1) TStrings siblings an

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Jonas Maebe
On 23/11/14 17:23, Mattias Gaertner wrote: > I started the thread about ParamStr, which only supports the system > codepage. I would like to improve it so that it supports > DefaultSystemCodepage. Or at least add an Unicode version of > ParamStr. We need both, as Marco mentioned. The main issue is

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Jonas Maebe
On 23/11/14 17:46, Marco van de Voort wrote: > So now we support changing defaultsystemcodepage formally by endusers? I was > not made aware of that. There is the SetMultiByteConversionCodePage() routine in the interface of the system unit that does exactly that. It's kind of strange to have such

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Michael Van Canneyt
On Sun, 23 Nov 2014, Marco van de Voort wrote: In our previous episode, Michael Van Canneyt said: To make things clear: I meant in the way Mattias proposed, continuing making the default "string" type utf8 on Windows. Utf8string is fine, but limited. That basically perpetuates the current h

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Marco van de Voort
In our previous episode, Michael Van Canneyt said: > > To make things clear: > > I meant in the way Mattias proposed, continuing making the default > > "string" type utf8 on Windows. Utf8string is fine, but limited. > > > > That basically perpetuates the current hack, just slightly more elegant. >

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Marco van de Voort
In our previous episode, Mattias Gaertner said: > > Let's try to understand first why do you insist on the "UTF-8" in the name ? > > > > Maybe "UTF-8 aware" is better, if you really want the UTF-8 in the name. > > Maybe there is a misunderstanding. At least I can't follow you here. > > I started

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Michael Van Canneyt
On Sun, 23 Nov 2014, Marco van de Voort wrote: In our previous episode, Michael Van Canneyt said: It is not, and neither is it "aware". It is only so when you set defaultsystemcodepage to utf8 and ignore the problems that causes. That is not intended functionality. To make things clear: I

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Marco van de Voort
In our previous episode, Michael Van Canneyt said: > > It is not, and neither is it "aware". It is only so when you set > > defaultsystemcodepage to utf8 and ignore the problems that causes. That is > > not intended functionality. To make things clear: I meant in the way Mattias proposed, continu

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Mattias Gaertner
On Sun, 23 Nov 2014 16:31:10 +0100 (CET) Michael Van Canneyt wrote: >[...] > That seems wrong, since UTF-8 is not the default on Windows ? > > Let's try to understand first why do you insist on the "UTF-8" in the name ? > > Maybe "UTF-8 aware" is better, if you really want the UTF-8 in the name

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Michael Van Canneyt
On Sun, 23 Nov 2014, Marco van de Voort wrote: In our previous episode, Mattias Gaertner said: doesn't support all characters that can be encoded using UTF-8, ...) or on all platforms (some platforms don't even support multiple code pages). Hmm, maybe you have a point there. It is similar t

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Marco van de Voort
In our previous episode, Mattias Gaertner said: > > doesn't support all characters that can be encoded using UTF-8, ...) or > > on all platforms (some platforms don't even support multiple code pages). > > Hmm, maybe you have a point there. > It is similar to normal RTL and RTL with CurrencyString

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Michael Van Canneyt
On Sun, 23 Nov 2014, Mattias Gaertner wrote: On Sun, 23 Nov 2014 14:26:08 +0100 Jonas Maebe wrote: On 18/11/14 19:51, Mattias Gaertner wrote: On Tue, 18 Nov 2014 18:17:25 +0100 Thanks, but there is no UTF-8 RTL. That's what I thought too a week ago. FPC 2.7 made an old dream come true. :

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Mattias Gaertner
On Sun, 23 Nov 2014 14:26:08 +0100 Jonas Maebe wrote: > On 18/11/14 19:51, Mattias Gaertner wrote: > > On Tue, 18 Nov 2014 18:17:25 +0100 > >> > Thanks, but there is no UTF-8 RTL. > > That's what I thought too a week ago. > > FPC 2.7 made an old dream come true. :) > > Nevertheless, please stop

Re: [fpc-devel] UTF8 RTL

2014-11-23 Thread Jonas Maebe
On 18/11/14 19:51, Mattias Gaertner wrote: > On Tue, 18 Nov 2014 18:17:25 +0100 >> > Thanks, but there is no UTF-8 RTL. > That's what I thought too a week ago. > FPC 2.7 made an old dream come true. :) Nevertheless, please stop calling it the UTF-8 RTL. It will give people the wrong impression, be

Re: [fpc-devel] UTF8 RTL

2014-11-21 Thread Marco van de Voort
In our previous episode, Michael Schnell said: > On 11/21/2014 09:53 AM, Marco van de Voort wrote: > > The versatile string type is vaporware. There is no designdocument, > > Do you want me to create a document ? I could easily do it. The point was that even though odds of implementation are low,

Re: [fpc-devel] UTF8 RTL

2014-11-21 Thread Michael Schnell
On 11/21/2014 10:44 AM, Michael Van Canneyt wrote: But as it is, without anything for anyone to work with, your ideas are simply dead-born childs. Right you are and Marco is right that it is nothing but Vaporware right now. I did post a kind of "documentation draft" here to provide a base to

Re: [fpc-devel] UTF8 RTL

2014-11-21 Thread Michael Van Canneyt
On Fri, 21 Nov 2014, Michael Schnell wrote: On 11/21/2014 09:53 AM, Marco van de Voort wrote: The versatile string type is vaporware. There is no designdocument, Do you want me to create a document ? I could easily do it. But as it would need compiler magic to work with, I don't have any c

Re: [fpc-devel] UTF8 RTL

2014-11-21 Thread Michael Schnell
On 11/21/2014 09:53 AM, Marco van de Voort wrote: The versatile string type is vaporware. There is no designdocument, Do you want me to create a document ? I could easily do it. But as it would need compiler magic to work with, I don't have any chance to do the necessary patches. So I think i

Re: [fpc-devel] UTF8 RTL

2014-11-21 Thread Marco van de Voort
In our previous episode, Mattias Gaertner said: > > Changing this would need a new "versatile" String Type that is not > > available in Delphi and hence not compatible. > > What has this to do with UTF8 RTL? Nothing. The versatile string type is vaporware. There is no designdocument, nothing.

Re: [fpc-devel] UTF8 RTL

2014-11-21 Thread Marco van de Voort
In our previous episode, Michael Schnell said: > > Not old Delphi compatible. One can go new delphi compatible, and make > > everything 2-byte as much as possible. > > Of course I did mean compatibility to "New" Delphi Strings. > > But here (seemingly) TStrings (and with this TStringList) works

Re: [fpc-devel] UTF8 RTL

2014-11-20 Thread Mattias Gaertner
On Thu, 20 Nov 2014 15:02:55 +0100 Michael Schnell wrote: >[...] > But here (seemingly) TStrings (and with this TStringList) works on a > single pre-defined (2-Byte) encoding. And hence any other encoding (as > well 2 Byte as 1 Byte) needs time consuming conversions in and out. > > Changing th

Re: [fpc-devel] UTF8 RTL

2014-11-20 Thread Michael Schnell
On 11/19/2014 06:46 PM, Marco van de Voort wrote: Not old Delphi compatible. One can go new delphi compatible, and make everything 2-byte as much as possible. Of course I did mean compatibility to "New" Delphi Strings. But here (seemingly) TStrings (and with this TStringList) works on a singl

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Marco van de Voort
In our previous episode, Michael Schnell said: > > But I meant that even if you use utf8string in many places as soon as > > you stuff it in a container like tstringlist, that is gone. (forced > > ansi conversion, since tstringlist's interface is defined using plain > > string(0)) > > AFAI Unde

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Sven Barth
On 19.11.2014 14:46, "Marco van de Voort" wrote: > > In our previous episode, Sven Barth said: > > On 19.11.2014 11:39, "Mattias Gaertner" < nc-gaert...@netcologne.de> wrote: > > > The RTL on Windows now uses the "W" functions and the AnsiString and > > > ShortString are encoded in CP_ACP. Chang

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Michael Schnell
On 11/19/2014 09:12 AM, Marco van de Voort wrote: But I meant that even if you use utf8string in many places as soon as you stuff it in a container like tstringlist, that is gone. (forced ansi conversion, since tstringlist's interface is defined using plain string(0)) AFAI Understand (having

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Marco van de Voort
In our previous episode, Sven Barth said: > On 19.11.2014 11:39, "Mattias Gaertner" wrote: > > The RTL on Windows now uses the "W" functions and the AnsiString and > > ShortString are encoded in CP_ACP. Changing the DefaultSystemCodePage > > to CP_UTF8 does the trick. > > AFAIK we don't use the

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Mattias Gaertner
On Wed, 19 Nov 2014 13:54:00 +0100 Sven Barth wrote: > On 19.11.2014 11:39, "Mattias Gaertner" wrote: > > The RTL on Windows now uses the "W" functions and the AnsiString and > > ShortString are encoded in CP_ACP. Changing the DefaultSystemCodePage > > to CP_UTF8 does the trick. > > AFAIK we d

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Jonas Maebe
On 19 Nov 2014, at 13:54, Sven Barth wrote: On 19.11.2014 11:39, "Mattias Gaertner" wrote: The RTL on Windows now uses the "W" functions and the AnsiString and ShortString are encoded in CP_ACP. Changing the DefaultSystemCodePage to CP_UTF8 does the trick. AFAIK we don't use the "W" functi

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Sven Barth
On 19.11.2014 11:39, "Mattias Gaertner" wrote: > The RTL on Windows now uses the "W" functions and the AnsiString and > ShortString are encoded in CP_ACP. Changing the DefaultSystemCodePage > to CP_UTF8 does the trick. AFAIK we don't use the "W" functions yet on non-CE Windows. Regards, Sven

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Mattias Gaertner
On Wed, 19 Nov 2014 09:22:21 +0100 Jonas Maebe wrote: > On 19/11/14 09:12, Marco van de Voort wrote: > > In our previous episode, Jonas Maebe said: > >>> As Jonas said, not using utf8 on Windows. > >> > >> No, that's not what I said. There is no problem with using UTF-8 on > >> Windows. > > > >

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Jonas Maebe
On 19/11/14 09:12, Marco van de Voort wrote: > In our previous episode, Jonas Maebe said: >>> As Jonas said, not using utf8 on Windows. >> >> No, that's not what I said. There is no problem with using UTF-8 on Windows. > > As long as you explicitly use utf8string. An ansistring with a dynamic c

Re: [fpc-devel] UTF8 RTL

2014-11-19 Thread Marco van de Voort
In our previous episode, Jonas Maebe said: > > As Jonas said, not using utf8 on Windows. > > No, that's not what I said. There is no problem with using UTF-8 on Windows. As long as you explicitly use utf8string. > > A TStringlist with a ansistrings > > in them passed to an RTL routine will b

Re: [fpc-devel] UTF8 RTL

2014-11-18 Thread Jonas Maebe
On 18/11/14 22:00, Marco van de Voort wrote: > As Jonas said, not using utf8 on Windows. No, that's not what I said. There is no problem with using UTF-8 on Windows. > A TStringlist with a ansistrings > in them passed to an RTL routine will be seen as ansi. That is incorrect (although right no

Re: [fpc-devel] UTF8 RTL

2014-11-18 Thread Marco van de Voort
In our previous episode, Mattias Gaertner said: > > It will always only support system code page on Unix platforms, because > > everything that comes from the shell must be assumed to be in that code > > page. On Windows, it can be changed though. > > True. > From Lazarus point of view most Unico

Re: [fpc-devel] UTF8 RTL

2014-11-18 Thread Mattias Gaertner
On Tue, 18 Nov 2014 18:17:25 +0100 Jonas Maebe wrote: > On 18/11/14 16:59, Mattias Gaertner wrote: > > Hi and much kudos for those who made the UTF8 RTL. > > Thanks, but there is no UTF-8 RTL. That's what I thought too a week ago. FPC 2.7 made an old dream come true. :) > > GetCurrentDir, Fi

Re: [fpc-devel] UTF8 RTL

2014-11-18 Thread Jonas Maebe
On 18/11/14 16:59, Mattias Gaertner wrote: > Hi and much kudos for those who made the UTF8 RTL. Thanks, but there is no UTF-8 RTL. > GetCurrentDir, FindFirst, FileExists, TStringList, etc. all work well. :) They are not guaranteed to use UTF-8, and you must not assume that they do (and TStringLis

Re: [fpc-devel] UTF8 RTL

2014-11-18 Thread Mattias Gaertner
On Tue, 18 Nov 2014 17:05:49 +0100 (CET) mar...@stack.nl (Marco van de Voort) wrote: > In our previous episode, Mattias Gaertner said: > > Hi and much kudos for those who made the UTF8 RTL. > > > > GetCurrentDir, FindFirst, FileExists, TStringList, etc. all work well. :) > > > > ParamStr is not y

Re: [fpc-devel] UTF8 RTL

2014-11-18 Thread Marco van de Voort
In our previous episode, Mattias Gaertner said: > Hi and much kudos for those who made the UTF8 RTL. > > GetCurrentDir, FindFirst, FileExists, TStringList, etc. all work well. :) > > ParamStr is not yet converted and only supports system codepage. I > would like to help improving it. > > Has some