Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
I read most of the discussion and I think there is no way around a string type containing an encoding field. First, it also allows supporting non-UTF encodings or the UTF-32 encoding. Having the encoding field does not mean that all targets support all encodings. In case an encoding is not supported, the

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Florian Klaempfl wrote: > I read most of the discussion and I think there is no way around a > string type containing an encoding field. [cut] > I know this approach contains some hacks and requires some work but I > think this is the only way to solve things for once and

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> On Tue, 1 Jul 2008, Florian Klaempfl wrote: > > > I read most of the discussion and I think there is no way around a > > string type containing an encoding field. > > [cut] > > > I know this approach contains some hacks and requires some work but I > > think this is the only way to solve thin

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Marco van de Voort wrote: >> On Tue, 1 Jul 2008, Florian Klaempfl wrote: >> >>> I read most of the discussion and I think there is no way around a >>> string type containing an encoding field. >> [cut] >> >>> I know this approach contains some hacks and requires some work but I >>> think this is t

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Florian Klaempfl wrote: > Marco van de Voort wrote: > >> On Tue, 1 Jul 2008, Florian Klaempfl wrote: > >> > >>> I read most of the discussion and I think there is no way around a > >>> string type containing an encoding field. > >> [cut] > >> > >>> I know this approach conta

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
Why not just introduce a set of utf-16 routines with utf16string type like the new Delphi? This proposal is at least better than the one from Marco as we at least can get the encoding somehow, but is still inconvenient for cross-platform software. -- Felipe Monteiro de Carvalho _

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Felipe Monteiro de Carvalho wrote: > Why not just introduce a set of utf-16 routines with utf16string type > like the new Delphi? Because it's not cross platform. ___ fpc-pascal maillist - fpc-pascal@lists.freepascal.org http://lists.freepascal.org/mai

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> Why not just introduce a set of utf-16 routines with utf16string type > like the new Delphi? > > This proposal is at least better than the one from Marco Mine has both a UTF8string and a UTF16string, on all platforms that support unicode. So I don't get this remark. It is just that on uni

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
Ok a quick pointwise comment then. > I read most of the discussion and I think there is no way around a > string type containing an encoding field. First, it also allows > supporting non-UTF encodings or the UTF-32 encoding. Having the encoding field > does not mean that all targets support all encodi

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:24 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: >> Why not just introduce a set of utf-16 routines with utf16string type >> like the new Delphi? > > Because it's not cross platform. Why isn't it cross-platform? -- Felipe Monteiro de Carvalho __

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:42 AM, Marco van de Voort <[EMAIL PROTECTED]> wrote: > I don't like the runtime nature. At all. I want to be able to say "hey look, > I've a bunch of units here, and they only accept utf16, (e.g. because they were > ported Tiburon code). Convert if necessary" Tiburon co

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:30 AM, Marco van de Voort <[EMAIL PROTECTED]> wrote: > Mine has both a UTF8string and a UTF16string, on all platforms that support > unicode. So I don't get this remark. Unless I understood your proposal wrong it involves a TMarcoString which will be declared like

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> On Tue, Jul 1, 2008 at 9:42 AM, Marco van de Voort <[EMAIL PROTECTED]> wrote: > > I don't like the runtime nature. At all. I want to be able to say "hey look, > > I've a bunch of units here, and they only accept utf16, (e.g. because they were > > ported Tiburon code). Convert if necessary" >

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Felipe Monteiro de Carvalho wrote: > On Tue, Jul 1, 2008 at 9:24 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: >>> Why not just introduce a set of utf-16 routines with utf16string type >>> like the new Delphi? >> Because it's not cross platform. > > Why isn't it cross-platform? > Because using

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> On Tue, Jul 1, 2008 at 9:30 AM, Marco van de Voort <[EMAIL PROTECTED]> wrote: > > Mine has both a UTF8string and a UTF16string, on all platforms that support > > unicode. So I don't get this remark. > > Unless I understood your proposal

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: > Because using utf-16 on linux is very unnatural, same for utf-8 on > windows. Platforms like go32 even don't have any unicode. Coding > platform independent but fast applications is really ugly having fixed > types. Well

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
>> > It is just that on unix, the fileroutines will be defined as utf8string >> So you are going to convert in non utf8 unix? > > Maybe I should have said "in the native encoding" then. So if it's a > utf-16 unix it will be utf-16. In principle at least. We will have to see > how this fares wi

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: > > platform independent but fast applications is really ugly having fixed > > types. > > Well, then you mean that it requires conversion in some platforms > rather than it not being cross-platform. > > What I am trying

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: > On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: > > Because using utf-16 on linux is very unnatural, same for utf-8 on > > windows. Platforms like go32 even don't have any unicode. Coding > > platform independen

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> >> > It is just that on unix, the fileroutines will be defined as utf8string > >> So you are going to convert in non utf8 unix? > > > > Maybe I should have said "in the native encoding" then. So if it's a > > utf-16 unix it will be utf-16. In principle at least. We will have to see > > how t

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: > I don't see what is difficult about Florian's proposition. > On the contrary, it is the simplest possible solution, > and quite elegant in my eyes. To be honest, I am flabbergasted that the two of you agreed on such a runtime construct. It goe

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 10:28 AM, Marco van de Voort <[EMAIL PROTECTED]> wrote: > C/C++ support the native encoding on all platforms. I did some googling and they don't support unicode filenames. So we are back to zero systems using this method again =) http://www.google.com/search?q=C%2B%2B+unico

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 10:50 AM, Felipe Monteiro de Carvalho > I did some googling and they don't support unicode filenames. So we > are back to zero systems using this method again =) Actually I think that Carbon uses a system very similar to the one proposed by Florian. The string is an opaque t

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Marco van de Voort wrote: > > On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: > > I don't see what is difficult about Florian's proposition. > > On the contrary, it is the simplest possible solution, > > and quite elegant in my eyes. > > To be honest, I am flabbergasted t

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Jeff Wormsley
Marco van de Voort wrote:
> I don't understand how this can work, how can I have a compiletime solution
> for a runtime problem?
>
> procedure mystringproc (s:FlorianUnicodeString);
>
> begin
>   if encodingof(s)=utf-16 then
>     begin
>       // utf-16 code here with shiftsize 2 [] needed
>     end
>   else b

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Jeff Wormsley wrote: > > Marco van de Voort wrote: > > I don't understand how this can work, how can I have a compiletime solution > > for a runtime problem? > > > > procedure mystringproc (s:FlorianUnicodeString); > > > > begin > > if encodingof(s)=utf-16 then > > beg

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> On Tue, 1 Jul 2008, Marco van de Voort wrote: > > > > On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: > > > I don't see what is difficult about Florian's proposition. > > > On the contrary, it is the simplest possible solution, > > > and quite elegant in my eyes. > > > > To be honest, I f

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> Marco van de Voort wrote: > > > If compiler magic is at work, wouldn't all this reduce to s[1] giving > the first char no matter the char size? Where does the "magic" get its information is my point.

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> On Tue, 1 Jul 2008, Jeff Wormsley wrote: > > is defined as char, it gets converted to a standard 0-255 value, but c could > > be defined as FlorianChar and be the native char size. Or am I smoking > > crack? > > No, you understand it correctly. > > Obviously, with Florian's type, simple low-lev

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Marco van de Voort wrote: > > On Tue, 1 Jul 2008, Jeff Wormsley wrote: > > > is defined as char, it gets converted to a standard 0-255 value, but c could > > > be defined as FlorianChar and be the native char size. Or am I smoking > > > crack? > > > > No, you unders

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Marco van de Voort wrote: >> Marco van de Voort wrote: >>> >> If compiler magic is at work, wouldn't all this reduce to s[1] giving >> the first char no matter the char size? > > Where does the "magic" get its information is my point. I described this already in detail in my first mail: jus

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Paul Ishenin
Michael Van Canneyt wrote:
> You can still do C:=S[i]. What you cannot do is
>
> P:=PChar(S);
> While (P^<>#0) do
>   SomeByteSizedOperation;
>
Why you cannot? PChar(S) should represent S as raw bytes. If you know what you are doing - it will not harm. In other case, if you corrupt the string then

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Paul Ishenin wrote: > Michael Van Canneyt wrote: > > You can still do C:=S[i]. What you cannot do is > > > > P:=PChar(S); > > While (P^<>#0) do > >SomeByteSizedOperation; > > > Why you cannot? PChar(S) should represent S as raw bytes. If you know what you > are doi

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Michael Van Canneyt wrote: > > On Tue, 1 Jul 2008, Paul Ishenin wrote: > >> Michael Van Canneyt wrote: >>> You can still do C:=S[i]. What you cannot do is >>> >>> P:=PChar(S); >>> While (P^<>#0) do >>>SomeByteSizedOperation; >>> >> Why you cannot? PChar(S) should represent S as raw byt

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
I think you can still do the byte-size operations this way:
  ForceEncoding(S, iso-)
  P:=PChar(S);
  While (P^<>#0) do
    SomeByteSizedOperation;
Similarly for any other code supposing an encoding.
-- Felipe Monteiro de Carvalho

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote:
> I think you can still do the byte-size operations this way:
>
> ForceEncoding(S, iso-)
> P:=PChar(S);
> While (P^<>#0) do
>   SomeByteSizedOperation;
>
> Similarly for any other code supposing an encoding.
Absolutely.
Michael.

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 17.06:34 Florian Klaempfl wrote: > Michael Van Canneyt wrote: > > On Tue, 1 Jul 2008, Paul Ishenin wrote: > >> Michael Van Canneyt wrote: > >>> You can still do C:=S[i]. What you cannot do is > >>> > >>> P:=PChar(S); > >>> While (P^<>#0) do > >>>SomeByteSizedOperatio

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gärtner
Zitat von Florian Klaempfl <[EMAIL PROTECTED]>: > Michael Van Canneyt wrote: > > > > On Tue, 1 Jul 2008, Paul Ishenin wrote: > > > >> Michael Van Canneyt wrote: > >>> You can still do C:=S[i]. What you cannot do is > >>> > >>> P:=PChar(S); > >>> While (P^<>#0) do > >>>SomeByteSizedOperatio

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Mattias Gärtner wrote: > Zitat von Florian Klaempfl <[EMAIL PROTECTED]>: > >> Michael Van Canneyt wrote: >>> On Tue, 1 Jul 2008, Paul Ishenin wrote: >>> Michael Van Canneyt wrote: > You can still do C:=S[i]. What you cannot do is > > P:=PChar(S); > While (P^<>#0) do >

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
> Mattias Gärtner wrote: > example you could tell it that all strings should be utf-8 encoded. Of > course, you get into trouble if some user plays unfair but you could > still protect your code with some EnforceUTF8Encoding. It's exactly the See earlier mail. Tiburon code shouldn't need mods. Tha

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marc Weustink
Florian Klaempfl wrote:
[..some of my thoughts..]
this suits a construct I saw somewhere:
  type
    SomeString = type String(CP_KOI8);
Marc

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-02 Thread Marco van de Voort
> Florian Klaempfl wrote: > > [..some of my thoughts..] > > this suits a construct I saw somewhere: > > type > SomeString = type String(CP_KOI8); This isn't the case with Florian's type? Because the first copy from a source with another encoding would force it to the encoding of the source