On Tue, 1 Jul 2008 18:55:44 +0200
Martin Schreiber [EMAIL PROTECTED] wrote:
On Tuesday 01 July 2008 18.32:30 Mattias Gärtner wrote:
In these routines length(widestring), widestring[index],
pwidechar^, pwidechar[index], pwidechar + offset, pwidechar -
pwidechar and
On Wednesday 02 July 2008 11.08:31 Mattias Gärtner wrote:
Quoting Martin Schreiber [EMAIL PROTECTED]:
For example lib/common/kernel/msedrawtext.pas:223, procedure layouttext.
Nice code.
As far as I can see, it handles tabs, linebreaks, c_softhyphen and
charwidth. It uses single array
Quoting Martin Schreiber [EMAIL PROTECTED]:
On Wednesday 02 July 2008 11.08:31 Mattias Gärtner wrote:
Quoting Martin Schreiber [EMAIL PROTECTED]:
For example lib/common/kernel/msedrawtext.pas:223, procedure layouttext.
Nice code.
As far as I can see, it handles tabs, linebreaks,
On Wednesday 02 July 2008 12.44:46 Mattias Gärtner wrote:
I don't see how this has a big impact on the performance.
The code is complicated enough, don't you think? ;-)
Ah, sorry, now I understand.
You meant, the performance penalty of the *programmer* is not negligible
comparing
Martin Schreiber wrote:
I'd say to take a look at how python managed to integrate unicode support:
http://www.google.com/search?domains=www.python.org&sitesearch=www.python.org&sourceid=google-search&q=unicode&submit=search
They have a UTF-16/UCS-2 internal representation, same as
On Mon, Jun 30, 2008 at 11:35 AM, Marco van de Voort [EMAIL PROTECTED]
wrote:
borders?
Gtk can load XML files, somewhat equivalent to our LFMs. They use
UTF-8 everywhere.
GTK is unix centric on other systems. They don't have a firm leg in both the
Unix as the Windows world as we do. I
They have a UTF-16/UCS-2 internal representation, same as MSEgui which
works
very well and is fast and handy BTW.
And len, slicing, etc. work as expected.
Note that if you need characters beyond $ you have to compile it
with wide unicode support, and in that case every character
Marco van de Voort wrote:
They have a UTF-16/UCS-2 internal representation, same as MSEgui which works
very well and is fast and handy BTW.
And len, slicing, etc. work as expected.
Note that if you need characters beyond $ you have to compile it
with wide unicode support, and in
On Tue, 01 Jul 2008 09:35:35 +0200
Luca Olivetti [EMAIL PROTECTED] wrote:
Marco van de Voort wrote:
They have a UTF-16/UCS-2 internal representation, same as MSEgui
which works very well and is fast and handy BTW.
And len, slicing, etc. work as expected.
Note that if you need
Marco van de Voort wrote:
with wide unicode support, and in that case every character will use 4
bytes.
That's IMHO a faulty system. It requires you to choose between an incomplete
solution and making strings a horrible memory hog.
OTOH using variable length characters will
On Tue, 1 Jul 2008 09:23:52 +0200 (CEST)
[EMAIL PROTECTED] (Marco van de Voort) wrote:
[...]
multiple encodings:
Are we talking about one encoding per platform or two encodings for
all platforms?
Under Unix the encoding preference is clear: UTF-8.
Under Windows there are a lot of current code
On Tuesday 01 July 2008 09.56:29 Mattias Gaertner wrote:
On Tue, 01 Jul 2008 09:35:35 +0200
Luca Olivetti [EMAIL PROTECTED] wrote:
OTOH using variable length characters will make string operations
expensive (since you can't just multiply the index by 2 or 4 but you
have to examine the
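The cost Luca describes can be made concrete: with UCS-2 the i-th character starts at byte 2*(i-1)+1, while with utf-8 you must walk from the start, decoding each lead byte. A minimal sketch, assuming valid utf-8 input (the function name is invented and malformed sequences are not handled):

```pascal
// Hypothetical sketch: why indexing into utf-8 is O(n).
// Finding the start of the charindex-th character means stepping
// over every preceding sequence, whose length is read from its
// lead byte.
function Utf8CharStart(const s: ansistring; charindex: integer): integer;
var
  bytepos, ch: integer;
  lead: byte;
begin
  bytepos := 1;
  for ch := 2 to charindex do
  begin
    lead := byte(s[bytepos]);
    if lead < $80 then
      inc(bytepos, 1)        // 0xxxxxxx: 1-byte sequence (ASCII)
    else if lead < $E0 then
      inc(bytepos, 2)        // 110xxxxx: 2-byte sequence
    else if lead < $F0 then
      inc(bytepos, 3)        // 1110xxxx: 3-byte sequence
    else
      inc(bytepos, 4);       // 11110xxx: 4-byte sequence
  end;
  result := bytepos;
end;
```

With a fixed-width encoding the whole loop collapses to a single multiplication, which is the point being argued here.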
On Tue, 1 Jul 2008 09:23:52 +0200 (CEST)
(note that this is all IMHO, not necessarily core viewpoint)
Are we talking about one encoding per platform or two encodings for
all platforms?
My proposition was: Two encodings, two stringtypes for all.
Florian's stand was thinking about one
On Tue, 1 Jul 2008 10:23:32 +0200
Martin Schreiber [EMAIL PROTECTED] wrote:
On Tuesday 01 July 2008 09.56:29 Mattias Gaertner wrote:
On Tue, 01 Jul 2008 09:35:35 +0200
Luca Olivetti [EMAIL PROTECTED] wrote:
OTOH using variable length characters will make string operations
expensive
On Tue, 1 Jul 2008 10:33:28 +0200 (CEST)
[EMAIL PROTECTED] (Marco van de Voort) wrote:
On Tue, 1 Jul 2008 09:23:52 +0200 (CEST)
(note that this is all IMHO, not necessarily core viewpoint)
Same for me: mine are not lazarus core.
Are we talking about one encoding per platform or two
On Tuesday 01 July 2008 10.35:00 Mattias Gaertner wrote:
A good example is text layout calculation where it is necessary to
iterate over characters (glyphs) over and over again.
Text layout nowadays needs to consider font widths and unicode specials.
Iterating from character to character
On Tue, 1 Jul 2008 10:33:28 +0200 (CEST)
all platforms?
My proposition was: Two encodings, two stringtypes for all.
Both at the same time?
Yes, utf8string and utf16string. Whatever Tiburon introduces aliased to
utf16string, so that will be compat on non-windows too. And the utf16
Quoting Martin Schreiber [EMAIL PROTECTED]:
On Tuesday 01 July 2008 10.35:00 Mattias Gaertner wrote:
A good example is text layout calculation where it is necessary to
iterate over characters (glyphs) over and over again.
Text layout nowadays needs to consider font widths and unicode
On Tuesday 01 July 2008 12.19:26 Mattias Gärtner wrote:
Quoting Martin Schreiber [EMAIL PROTECTED]:
I did it with utf-8 and UCS-2, believe me, it was not negligible.
Where is the code in msegui? (the code that was formerly UTF-8, not the old
UTF-8 code)
Quoting Martin Schreiber [EMAIL PROTECTED]:
On Tuesday 01 July 2008 12.19:26 Mattias Gärtner wrote:
Quoting Martin Schreiber [EMAIL PROTECTED]:
I did it with utf-8 and UCS-2, believe me, it was not negligible.
Where is the code in msegui? (the code that was formerly UTF-8, not
I read most of the discussion and I think there is no way around a
string type containing an encoding field. First, it also allows us to
support non-utf encodings or the utf-32 encoding. Having the encoding field
does not mean that all targets support all encodings. In case an encoding
is not supported,
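One possible shape for such a string type, as a hedged sketch (the type and field names here are invented for illustration, not Florian's actual proposal):

```pascal
// Sketch of a string header carrying its encoding at runtime.
// RTL routines would inspect 'encoding' and convert only when an
// operation (or target platform) cannot handle the current one.
type
  TStringEncoding = (seAnsi, seUTF8, seUTF16, seUTF32);

  TStringHeader = record
    refcount: sizeint;
    len: sizeint;             // length in code units, not characters
    encoding: TStringEncoding;
    elementsize: byte;        // 1, 2 or 4 bytes per code unit
  end;
```

The payoff argued in this thread is that one declared type serves all platforms; the cost, argued just as strongly below, is that every routine must branch on (or normalize) the encoding at runtime.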
On Tuesday 01 July 2008 13.13:19 Mattias Gärtner wrote:
Quoting Martin Schreiber [EMAIL PROTECTED]:
On Tuesday 01 July 2008 12.19:26 Mattias Gärtner wrote:
Quoting Martin Schreiber [EMAIL PROTECTED]:
I did it with utf-8 and UCS-2, believe me, it was not negligible.
Where is
On Tue, Jul 1, 2008 at 4:23 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
Certainly. Can you imagine loading a non trivial file in a tstringlist and
saving it again and the heaps of conversions?
And how do you know that the file to be loaded will be in the system
encoding? We should simply
On Tue, 1 Jul 2008, Florian Klaempfl wrote:
I read most of the discussion and I think there is no way around a
string type containing an encoding field.
[cut]
I know this approach contains some hacks and requires some work but I
think this is the only way to solve things for once and
On Tue, Jul 1, 2008 at 4:23 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
Certainly. Can you imagine loading a non trivial file in a tstringlist and
saving it again and the heaps of conversions?
And how do you know that the file to be loaded will be in the system
encoding?
Not at all.
On Tue, 1 Jul 2008, Florian Klaempfl wrote:
I read most of the discussion and I think there is no way around a
string type containing an encoding field.
[cut]
I know this approach contains some hacks and requires some work but I
think this is the only way to solve things for once
A string type whose encoding you don't know is very inconvenient,
because you need to convert it to something else any time you wish to
call any routine which requires knowing the encoding.
How will Pos be implemented? And UpperCase? Any cross-platform string
manipulation routine will
Marco van de Voort wrote:
On Tue, 1 Jul 2008, Florian Klaempfl wrote:
I read most of the discussion and I think there is no way around a
string type containing an encoding field.
[cut]
I know this approach contains some hacks and requires some work but I
think this is the only way to
On Tue, Jul 1, 2008 at 9:02 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
A solution for unicode should be for everything, not just for UIs and
filenames. I should be able to carry data within it also, because otherwise
we are having this discussion next week again if Joost needs unicode for
On Tue, 1 Jul 2008, Florian Klaempfl wrote:
Marco van de Voort wrote:
On Tue, 1 Jul 2008, Florian Klaempfl wrote:
I read most of the discussion and I think there is no way around a
string type containing an encoding field.
[cut]
I know this approach contains some hacks and
Felipe Monteiro de Carvalho wrote:
ansistrings don't mean everything. They mean either ISO or utf-8.
This assumption is wrong. ansistring means the system encoding which
uses 8 bit chars.
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
Florian Klaempfl schreef:
Felipe Monteiro de Carvalho wrote:
ansistrings don't mean everything. They mean either ISO or utf-8.
This assumption is wrong. ansistring means the system encoding which
uses 8 bit chars.
Even if the system encoding is UTF8?
Vincent
2008/7/1 Felipe Monteiro de Carvalho [EMAIL PROTECTED]:
In my system I propose that simply a TWideStringList be implemented,
so both ways of storing data are available everywhere.
I have a TWideStringList implementation if you are interested. I got
the code somewhere and kept it for a rainy
On Tue, Jul 1, 2008 at 9:02 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
A solution for unicode should be for everything, not just for UIs and
filenames. I should be able to carry data within it also, because otherwise
we are having this discussion next week again if Joost needs unicode
On Tuesday 01 July 2008 14.03:19 Felipe Monteiro de Carvalho wrote:
About UCS-2 this is absurd. We certainly cannot have half the Chinese
characters ignored in the Free Pascal RTL.
???
Where did you get the information that half of the Chinese characters won't
fit in the base plane? And utf-16
Vincent Snijders wrote:
Florian Klaempfl schreef:
Felipe Monteiro de Carvalho wrote:
ansistrings don't mean everything. They mean either ISO or utf-8.
This assumption is wrong. ansistring means the system encoding which
uses 8 bit chars.
Even if the system encoding is UTF8?
Then it
Why not just introduce a set of utf-16 routines with utf16string type
like the new Delphi?
This proposal is at least better than the one from Marco as we at
least can get the encoding somehow, but is still inconvenient for
cross-platform software.
--
Felipe Monteiro de Carvalho
Felipe Monteiro de Carvalho wrote:
Why not just introduce a set of utf-16 routines with utf16string type
like the new Delphi?
Because it's not cross platform.
On Tue, Jul 1, 2008 at 9:28 AM, Martin Schreiber [EMAIL PROTECTED] wrote:
Where did you get the information that half of the Chinese characters won't
fit in the base plane?
http://unicode.org/roadmaps/sip/index.html
CJK means Chinese Japanese Korean
--
Felipe Monteiro de Carvalho
Why not just introduce a set of utf-16 routines with utf16string type
like the new Delphi?
This proposal is at least better than the one from Marco
Mine is having both a UTF8string and a UTF16string, on all platforms that support
unicode. So I don't get this remark.
It is just that on unix,
On Tue, Jul 1, 2008 at 9:21 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
Well, euh, the main reason is that euh, most programs and data on the system
use
the system encoding?
So you are saying that FPC should privilege platform-specific software
development over cross-platform software
On Tue, Jul 1, 2008 at 9:24 AM, Florian Klaempfl [EMAIL PROTECTED] wrote:
Why not just introduce a set of utf-16 routines with utf16string type
like the new Delphi?
Because it's not cross platform.
Why isn't it cross-platform?
--
Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:42 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
I don't like the runtime nature. At all. I want to be able to say hey look,
I've a bunch of units here, and they only accept utf16, (e.g. because they
were
ported Tiburon code). Convert if necessary
Tiburon code will
On Tue, Jul 1, 2008 at 9:21 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
Well, euh, the main reason is that euh, most programs and data on the
system use
the system encoding?
So you are saying that FPC should privilege platform-specific software
development over cross-platform
On Tue, Jul 1, 2008 at 9:30 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
Mine is having both a UTF8string and a UTF16string, on all platforms that
support
unicode. So I don't get this remark.
Unless I understood your proposal wrong it involves a TMarcoString
which will be declared like
On Tue, Jul 1, 2008 at 9:42 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
I don't like the runtime nature. At all. I want to be able to say hey look,
I've a bunch of units here, and they only accept utf16, (e.g. because they
were
ported Tiburon code). Convert if necessary
Tiburon
Felipe Monteiro de Carvalho wrote:
On Tue, Jul 1, 2008 at 9:24 AM, Florian Klaempfl [EMAIL PROTECTED] wrote:
Why not just introduce a set of utf-16 routines with utf16string type
like the new Delphi?
Because it's not cross platform.
Why isn't it cross-platform?
Because using utf-16 on
On Tue, Jul 1, 2008 at 9:30 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
Mine is having both a UTF8string and a UTF16string, on all platforms that
support
unicode. So I don't get this remark.
Unless I understood your proposal wrong it
On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl [EMAIL PROTECTED] wrote:
Because using utf-16 on linux is very unnatural, same for utf-8 on
windows. Platforms like go32 even don't have any unicode. Coding
platform independent but fast applications is really ugly having fixed
types.
Well,
It is just that on unix, the fileroutines will be defined as utf8string
So you are going to convert in non utf8 unix?
Maybe I should have said in the native encoding then. So if it's a
utf-16 unix it will be utf-16. In principle at least. We will have to see
how this fares with the
On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl [EMAIL PROTECTED] wrote:
platform independent but fast applications is really ugly having fixed
types.
Well, then you mean that it requires conversion in some platforms
rather than it not being cross-platform.
What I am trying to say is
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote:
On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl [EMAIL PROTECTED] wrote:
Because using utf-16 on linux is very unnatural, same for utf-8 on
windows. Platforms like go32 even don't have any unicode. Coding
platform independent but
It is just that on unix, the fileroutines will be defined as utf8string
So you are going to convert in non utf8 unix?
Maybe I should have said in the native encoding then. So if it's a
utf-16 unix it will be utf-16. In principle at least. We will have to see
how this fares with
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote:
I don't see what is difficult about Florians proposition.
On the contrary, it is the simplest possible solution,
and quite elegant in my eyes.
To be honest, I am flabbergasted that the two of you agreed on such a runtime
construct. It goes
On Tue, Jul 1, 2008 at 10:28 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
C/C++ support the native encoding on all platforms.
I did some googling and they don't support unicode filenames. So we
are back to zero systems using this method again =)
On Tue, Jul 1, 2008 at 10:50 AM, Felipe Monteiro de Carvalho
I did some googling and they don't support unicode filenames. So we
are back to zero systems using this method again =)
Actually I think that Carbon uses a system very similar to the one
proposed by Florian. The string is an opaque
On Tue, 1 Jul 2008, Marco van de Voort wrote:
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote:
I don't see what is difficult about Florians proposition.
On the contrary, it is the simplest possible solution,
and quite elegant in my eyes.
To be honest, I am flabbergasted that the
Marco van de Voort wrote:
I don't understand how this can work, how can I have a compiletime solution
for a runtime problem?
procedure mystringproc (s:FlorianUnicodeString);
begin
if encodingof(s)=utf-16 then
begin
// utf-16 code here with shiftsize 2 [] needed
end
else
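The preview above cuts the example off after the `else`; a hedged completion of the same idea might look like the following (`encodingof` and `FlorianUnicodeString` come from the quoted fragment, while the `enc_*` constants and `ConvertToEncoding` are placeholder names invented here):

```pascal
// Sketch only: dispatch on the runtime encoding of the argument,
// falling back to a conversion for encodings the routine does not
// implement directly.
procedure mystringproc(s: FlorianUnicodeString);
begin
  case encodingof(s) of
    enc_utf16:
      begin
        // utf-16 code here, indexing with 2-byte code units
      end;
    enc_utf8:
      begin
        // utf-8 code here, stepping over multi-byte sequences
      end;
    else
      // normalize to one supported encoding, then handle it above
      mystringproc(ConvertToEncoding(s, enc_utf16));
  end;
end;
```

Marco's objection stands regardless of how the branches are filled in: every encoding-sensitive routine either duplicates its body per encoding or pays for a conversion.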
On Tue, 1 Jul 2008, Jeff Wormsley wrote:
Marco van de Voort wrote:
I don't understand how this can work, how can I have a compiletime solution
for a runtime problem?
procedure mystringproc (s:FlorianUnicodeString);
begin
if encodingof(s)=utf-16 then
begin
//
On Tue, 1 Jul 2008, Marco van de Voort wrote:
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote:
I don't see what is difficult about Florians proposition.
On the contrary, it is the simplest possible solution,
and quite elegant in my eyes.
To be honest, I am flabbergasted that
Marco van de Voort wrote:
If compiler magic is at work, wouldn't all this reduce to s[1] giving
the first char no matter the char size?
Where does the magic get its information is my point.
On Tue, 1 Jul 2008, Jeff Wormsley wrote:
is defined as char, it gets converted to a standard 0-255 value, but c could
be defined as FlorianChar and be the native char size. Or am I smoking
crack?
No, you understand it correctly.
Obviously, with Florian's type, simple low-level access
On Tue, 1 Jul 2008, Marco van de Voort wrote:
On Tue, 1 Jul 2008, Jeff Wormsley wrote:
is defined as char, it gets converted to a standard 0-255 value, but c
could
be defined as FlorianChar and be the native char size. Or am I smoking
crack?
No, you understand it correctly.
Marco van de Voort wrote:
Marco van de Voort wrote:
If compiler magic is at work, wouldn't all this reduce to s[1] giving
the first char no matter the char size?
Where does the magic get its information is my point.
I described this already in detail in my first mail: just in one of
Michael Van Canneyt wrote:
You can still do C:=S[i]. What you cannot do is
P:=PChar(S);
While (P^<>#0) do
SomeByteSizedOperation;
Why can't you? PChar(S) should represent S as raw bytes. If you know
what you are doing - it will not harm. In other case, if you corrupt the
string then
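The byte-wise scan being debated here can be written out in full. A minimal sketch, assuming a plain ansistring (`CountBytes` is a made-up name, and the `inc(result)` stands in for `SomeByteSizedOperation`):

```pascal
// Sketch: the classic PChar walk over a string's raw bytes.
// This only makes sense when one "unit" of interest is one byte,
// which is exactly the assumption a runtime-encoded string can break.
function CountBytes(const S: ansistring): integer;
var
  P: PChar;
begin
  result := 0;
  P := PChar(S);
  while P^ <> #0 do
  begin
    inc(result);   // any per-byte operation would go here
    inc(P);
  end;
end;
```

Note the loop also silently assumes the string contains no embedded #0 bytes, an assumption PChar code has always made.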
On Tue, 1 Jul 2008, Paul Ishenin wrote:
Michael Van Canneyt wrote:
You can still do C:=S[i]. What you cannot do is
P:=PChar(S);
While (P^<>#0) do
SomeByteSizedOperation;
Why can't you? PChar(S) should represent S as raw bytes. If you know what you
are doing - it will not
Michael Van Canneyt wrote:
On Tue, 1 Jul 2008, Paul Ishenin wrote:
Michael Van Canneyt wrote:
You can still do C:=S[i]. What you cannot do is
P:=PChar(S);
While (P^<>#0) do
SomeByteSizedOperation;
Why can't you? PChar(S) should represent S as raw bytes. If you know what
you
I think you can still do the byte-size operations this way:
ForceEncoding(S, iso-)
P:=PChar(S);
While (P^<>#0) do
SomeByteSizedOperation;
Similarly for any other code supposing an encoding.
--
Felipe Monteiro de Carvalho
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote:
I think you can still do the byte-size operations this way:
ForceEncoding(S, iso-)
P:=PChar(S);
While (P^<>#0) do
SomeByteSizedOperation;
Similarly for any other code supposing an encoding.
Absolutely.
Michael.
On Tuesday 01 July 2008 17.06:34 Florian Klaempfl wrote:
Michael Van Canneyt wrote:
On Tue, 1 Jul 2008, Paul Ishenin wrote:
Michael Van Canneyt wrote:
You can still do C:=S[i]. What you cannot do is
P:=PChar(S);
While (P^<>#0) do
SomeByteSizedOperation;
Why can't you?
Quoting Florian Klaempfl [EMAIL PROTECTED]:
Michael Van Canneyt wrote:
On Tue, 1 Jul 2008, Paul Ishenin wrote:
Michael Van Canneyt wrote:
You can still do C:=S[i]. What you cannot do is
P:=PChar(S);
While (P^<>#0) do
SomeByteSizedOperation;
Why can't you? PChar(S)
Quoting Martin Schreiber [EMAIL PROTECTED]:
On Tuesday 01 July 2008 13.13:19 Mattias Gärtner wrote:
Quoting Martin Schreiber [EMAIL PROTECTED]:
On Tuesday 01 July 2008 12.19:26 Mattias Gärtner wrote:
Quoting Martin Schreiber [EMAIL PROTECTED]:
I did it with utf-8 and UCS-2,
Mattias Gärtner wrote:
Quoting Florian Klaempfl [EMAIL PROTECTED]:
Michael Van Canneyt wrote:
On Tue, 1 Jul 2008, Paul Ishenin wrote:
Michael Van Canneyt wrote:
You can still do C:=S[i]. What you cannot do is
P:=PChar(S);
While (P^<>#0) do
SomeByteSizedOperation;
Why can't you
On Tuesday 01 July 2008 18.32:30 Mattias Gärtner wrote:
In these routines length(widestring), widestring[index], pwidechar^,
pwidechar[index], pwidechar + offset, pwidechar - pwidechar and
inc(pwidechar)/dec(pwidechar) are used often. This can't be done with
utf-8 strings.
Ehm, do you
Mattias Gärtner wrote:
example you could tell it that all strings should be utf-8 encoded. Of
course, you get into trouble if some user plays unfair but you could
still protect your code with some EnforceUTF8Encoding. It's exactly the
See earlier mail. Tiburon code shouldn't need mods. That
Florian Klaempfl wrote:
[..some of my thoughts..]
this suits a construct I saw somewhere:
type
SomeString = type String(CP_KOI8);
Marc
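A hedged sketch of what the construct Marc quotes could mean (the syntax is modeled on Delphi 2009's declared-code-page AnsiString; the constants are the matching Windows code page numbers, and FPC shipped nothing like this in 2008):

```pascal
// Illustration only: string types whose encoding is part of the type.
const
  CP_KOI8 = 20866;    // KOI8-R code page number
  CP_UTF8 = 65001;    // UTF-8 code page number

type
  Koi8String = type AnsiString(CP_KOI8);
  Utf8String = type AnsiString(CP_UTF8);
```

Because each type's encoding is known statically, assigning a Koi8String to a Utf8String would let the compiler insert the conversion itself, which is the compile-time counterpart to the runtime encoding field discussed above.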
Martin Schreiber wrote:
On Tuesday 01 July 2008 18.32:30 Mattias Gärtner wrote:
In these routines length(widestring), widestring[index], pwidechar^,
pwidechar[index], pwidechar + offset, pwidechar - pwidechar and
inc(pwidechar)/dec(pwidechar) are used often. This can't be done with
utf-8
On Tuesday 01 July 2008 22.23:12 Marc Weustink wrote:
Martin Schreiber wrote:
On Tuesday 01 July 2008 18.32:30 Mattias Gärtner wrote:
In these routines length(widestring), widestring[index], pwidechar^,
pwidechar[index], pwidechar + offset, pwidechar - pwidechar and
Hello,
There is already another thread about that, but the thread got too
long, and I would like to make a concrete proposal about unicode file
routines.
It looks simple to me, there are just 2 ways to go, either utf-8 or
utf-16. Correct me if I am wrong, but I believe that FPC developers
prefer
Quoting Felipe Monteiro de Carvalho [EMAIL PROTECTED]:
Hello,
There is already another thread about that, but the thread got too
long, and I would like to make a concrete proposal about unicode file
routines.
It looks simple to me, there are just 2 ways to go, either utf-8 or
utf-16.
There is already another thread about that, but the thread got too
long, and I would like to make a concrete proposal about unicode file
routines.
It looks simple to me, there are just 2 ways to go, either utf-8 or
utf-16.
There are more possibilities:
- native encoding (utf-8 on *nix,
Marco van de Voort wrote:
It looks simple to me, there are just 2 ways to go, either utf-8 or
utf-16.
There are more possibilities:
- native encoding (utf-8 on *nix, utf-16 on windows)
- have two types.
How can one write portable code with these options?
- a unified type (type contains
2008/6/30 Felipe Monteiro de Carvalho [EMAIL PROTECTED]:
It looks simple to me, there are just 2 ways to go, either utf-8 or
utf-16. Correct me if I am wrong, but I believe that FPC developers
prefer utf-16, so we can have a widestring version of every routine in
the RTL which involves
On Mon, Jun 30, 2008 at 9:31 AM, Mattias Gärtner
[EMAIL PROTECTED] wrote:
But what about all existing code?
For example the FCL?
How will TStringList.LoadFromFile be converted?
TStringList.LoadFromFile(AFileName: widestring); overload
The ansi version could call the wide version and just do
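The forwarding pattern Felipe sketches could look like the following (standalone helpers with invented names, since the real TStrings.LoadFromFile is not declared as an overload pair):

```pascal
uses
  Classes;

// Sketch: a wide and an ansi overload, where the ansi version just
// converts its argument and forwards to the wide one.
procedure LoadStringsFromFile(List: TStrings;
  const AFileName: widestring); overload;
begin
  // open the file through a wide-aware API and fill List
end;

procedure LoadStringsFromFile(List: TStrings;
  const AFileName: ansistring); overload;
begin
  // convert, then forward; all real work lives in one place
  LoadStringsFromFile(List, widestring(AFileName));
end;
```

This keeps existing ansistring callers compiling while new code can pass unicode filenames, which is the compatibility argument made in this message.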
On Mon, Jun 30, 2008 at 9:55 AM, Graeme Geldenhuys
[EMAIL PROTECTED] wrote:
I thought UTF-8 was preferred. Hence the reason Lazarus followed the
UTF-8 route in LCL and Unicode support.
UTF-8 is much better for the LCL because it just fits much better in
our existing codebase.
For the RTL we
Marco van de Voort wrote:
It looks simple to me, there are just 2 ways to go, either utf-8 or
utf-16.
There are more possibilities:
- native encoding (utf-8 on *nix, utf-16 on windows)
- have two types.
How can one write portable code with these options?
How can you consider
Marco van de Voort wrote:
Marco van de Voort wrote:
It looks simple to me, there are just 2 ways to go, either utf-8 or
utf-16.
There are more possibilities:
- native encoding (utf-8 on *nix, utf-16 on windows)
- have two types.
How can one write portable code with these options?
How
Marco van de Voort wrote:
At the borders of my I convert all strings to the 'internal type' and
encoding and
use it like that. Kind of like we are doing nowadays to convert the
line-endings in
text files.
I don't like this. This makes e.g. processing a database export on Unix
On Mon, Jun 30, 2008 at 10:32 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
It should be possible to work in the native encoding. One doesn't want to
wrap _every_ function in _every_ header with conversions procs.
It is not possible to work with an ever-changing encoding.
MyLabel.Caption :=
On Mon, Jun 30, 2008 at 10:32 AM, Marco van de Voort [EMAIL PROTECTED]
wrote:
It should be possible to work in the native encoding. One doesn't want to
wrap _every_ function in _every_ header with conversions procs.
It is not possible to work with an ever-changing encoding.
On Mon, 30 Jun 2008 10:03:18 -0300
Felipe Monteiro de Carvalho [EMAIL PROTECTED] wrote:
On Mon, Jun 30, 2008 at 9:55 AM, Graeme Geldenhuys
[EMAIL PROTECTED] wrote:
I thought UTF-8 was preferred. Hence the reason Lazarus followed the
UTF-8 route in LCL and Unicode support.
UTF-8 is much
On Mon, Jun 30, 2008 at 11:35 AM, Marco van de Voort [EMAIL PROTECTED] wrote:
I understand the simplicity of one encoding is appealing, but you have to
look at all aspects, and that is not just representation in the GUI.
It will mean that _every_ string transaction to the outside will have to
John Coppens wrote:
This may have been discussed before - but should the encoding not be
dependent on the locale? What would happen if I write a FPC program,
if the internal routines are, eg., UTF-16, and my locale is set to
en_US.UTF8?
Anyway, I have the impression that most of
On Monday 30 June 2008 22.19:49 Luca Olivetti wrote:
John Coppens wrote:
This may have been discussed before - but should the encoding not be
dependent on the locale? What would happen if I write a FPC program,
if the internal routines are, eg., UTF-16, and my locale is set to