http://wiki.freepascal.org/FPC_Unicode_support#Roadmap_of_RTL_Unicode_support
This page does not talk about UTF8Strings being counted in code elements
vs in code points.
I don't consider it understood that they in any case are counted in code
elements. IMHO this should be seriously disc
On Fri, Nov 21, 2008 at 7:30 AM, Michael Schnell <[EMAIL PROTECTED]> wrote:
> This page does not talk about UTF8Strings being counted in code elements vs
> in code points.
>
> I don't consider it understood that they in any case are counted in code
> elements. IMHO this should be seriously discusse
I prefer it to be counted in bytes. If it is counted in bytes, then I
can build a routine that counts in real chars. And we already have a
lot of code to handle UTF-8 inside ansistring which depends on that.
Counting the elements in real chars is very inefficient.
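A minimal sketch of the trade-off described above, written in Python purely for illustration (it is not FPC or RTL code, and both function names are hypothetical). The byte length is simply the stored length, while a count in code points has to walk the whole string:

```python
def utf8_byte_length(data: bytes) -> int:
    """Length in code units (bytes): just the stored length, no scan."""
    return len(data)


def utf8_code_point_count(data: bytes) -> int:
    """Length in code points: has to look at every byte.

    Continuation bytes have the bit pattern 10xxxxxx, so every byte that
    does not match that pattern starts a new code point.
    Assumption: the input is valid UTF-8.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# "Gruesse" spelled with u-umlaut and sharp s, both precomposed.
sample = "Gr\u00fc\u00dfe".encode("utf-8")
print(utf8_byte_length(sample), utf8_code_point_count(sample))
```

With a byte-counted length, the second routine can be layered on top when it is needed; the reverse is not possible without paying for the scan on every call.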
This is commonly agreed, But
On Fri, Nov 21, 2008 at 11:30 AM, Michael Schnell <[EMAIL PROTECTED]> wrote:
>
>> http://wiki.freepascal.org/FPC_Unicode_support#Roadmap_of_RTL_Unicode_support
>
> This page does not talk about UTF8Strings being counted in code elements vs
> in code points.
I only added the roadmap section, the rest of the content existed
before. You are welcome to amend the content.
I'd rightfully be severely bashed by those who actually will be required
to do the work ;) .
-Michael
___
fpc-devel maillist - fpc-d
Michael Schnell wrote:
I prefer it to be counted in bytes. If it is counted in bytes, then I
can build a routine that counts in real chars. And we already have a
lot of code to handle UTF-8 inside ansistring which depends on that.
Counting the elements in real chars is very inefficient.
This
If Length() would return its value in chars, what length in *bytes*
would the following call set:
SetLength(utfstring_1, Length(utfstring_2));
I don't really understand your question.
I think we would need to have two different functions
UTF8ElementlLength(UTF8String) and UTF8PointL
On 21 Nov 2008, at 14:50, Michael Schnell wrote:
If Length() would return its value in chars, what length in *bytes*
would the following call set:
SetLength(utfstring_1, Length(utfstring_2));
I don't really understand your question.
I think we would need to have two different functions
Michael Schnell wrote:
I don't really understand your question.
I think we would need to have two different functions
UTF8ElementlLength(UTF8String) and UTF8PointLength(UTF8String), first
giving the string length in code elements (byte) and second giving the
length in code points (unicode
So UTF8ElementlLength('Ü') would be 2 and UTF8PointLength('Ü') would
be 1.
Or 2, depending on whether it's precomposed or decomposed.
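The precomposed/decomposed distinction can be checked with a short sketch. It uses Python and its standard unicodedata module for illustration only and says nothing about how an FPC RTL routine would behave:

```python
import unicodedata

# U+00DC LATIN CAPITAL LETTER U WITH DIAERESIS as a single code point.
precomposed = "\u00dc"
# NFD splits it into U+0055 (U) followed by U+0308 (combining diaeresis).
decomposed = unicodedata.normalize("NFD", precomposed)

for label, text in (("precomposed", precomposed), ("decomposed", decomposed)):
    encoded = text.encode("utf-8")
    # len(text) counts code points, len(encoded) counts UTF-8 bytes.
    print(label, "code points:", len(text), "bytes:", len(encoded))
```

Both forms render as the same character, so neither a byte count nor a code point count is a count of what the user sees.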
I seem to remember that we discussed this some time ago and the result
was that the composed (Mac style?) characters are in fact a single code
point (Unicode
you also cannot freely change the return value of Pos() from elements
to codepoints.
Of course the counting needs to be consistent for all string functions.
So changing it "on the fly" is dangerous (if you keep a count value in
an integer variable). But this is up to the user.
-Michael
On 21 Nov 2008, at 16:16, Michael Schnell wrote:
So UTF8ElementlLength('Ü') would be 2 and UTF8PointLength('Ü')
would be 1.
Or 2, depending on whether it's precomposed or decomposed.
I seem to remember that we discussed this some time ago and the
result was that the composed (Mac style?)
If your point is that there is no way to allow for legacy code to be
used with a "String" type that holds UTF8 code and that it is not
possible (or desirable) to allow for code used in simple occasions that
is understandable to someone who does not want to go into the complete
depth of the UTF8
Op Fri, 21 Nov 2008, schreef Michael Schnell:
If your point is that there is no way to allow for legacy code to be used
with a "String" type that holds UTF8 code and that it is not possible (or
desirable) to allow for code used in simple occasions that is understandable
to someone who does n
Folks, before you waste your time again with endless discussions, have
a look at Yury's work on a Unicode RTL, test it and help with patches
and suggestions. It's available in svn at
http://svn.freepascal.org/svn/fpc/branches/unicodertl
Legacy code that assumes ASCII can be used in UTF-8. Code that needs
to deal with higher code points needs to be rewritten
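
A small sketch of where that line falls, in Python for illustration only (the sample text and the helper name is_valid_utf8 are made up for this example). Byte-wise handling of ASCII delimiters keeps working on UTF-8 data, while byte-wise truncation can cut a multi-byte sequence in half:

```python
def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if the chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "Strasse,Muenchen" spelled with sharp s and u-umlaut.
data = "Stra\u00dfe,M\u00fcnchen".encode("utf-8")

# Safe: bytes below 0x80 never occur inside a multi-byte UTF-8 sequence,
# so a byte-wise search for an ASCII delimiter cannot hit a false match.
comma_at = data.find(b",")
left, right = data[:comma_at], data[comma_at + 1:]

# Unsafe: a fixed-width cut may land in the middle of the two-byte sharp s.
print(is_valid_utf8(left), is_valid_utf8(right), is_valid_utf8(data[:5]))
```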
This is any program that formerly used (ANSI) String and now is
automatically converted to use UTF-8 and that is to be released in
Germany, France
With that of co
From: "Florian Klaempfl" <[EMAIL PROTECTED]>
Folks, before you waste your time again with endless discussions, have
a look at Yury's work on a Unicode RTL, test it and help with patches
and suggestions. It's available in svn at
http://svn.freepascal.org/svn/fpc/branches/unicodertl
It is wor
On Fri, Nov 21, 2008 at 2:42 PM, Michael Schnell <[EMAIL PROTECTED]> wrote:
> And thus forces all users to "understand the full UTF-8 spec" and to rewrite
> their programs, even though the old code perfectly compiles and up to a
> certain extent seems to work.
>
> This is what I think is "not at al
Felipe Monteiro de Carvalho wrote:
On Fri, Nov 21, 2008 at 2:42 PM, Michael Schnell <[EMAIL PROTECTED]> wrote:
And thus forces all users to "understand the full UTF-8 spec" and to rewrite
their programs, even though the old code perfectly compiles and up to a
certain extent seems to work.
Th
Graeme Geldenhuys escreveu:
Hi,
I have added a Roadmap section in the following wiki page. If you find
anything missing or not 100% implemented, please add it to the wiki
page.
http://wiki.freepascal.org/FPC_Unicode_support#Roadmap_of_RTL_Unicode_support
I started a wiki page to list the u
Your comments are absolutely vague and meaningless.
Sorry, but this was discussed already several times, so I supposed that
the problems I see are known to the discussion members:
But here is a simple example Lazarus project with all options left at
their standard settings:
procedure TForm1.Button
if compiled using *non*-UTF8 mode.
I did not find a way to set "non-UTF8 mode" with Lazarus, so that I
can just use ANSIString (and WideString) like I did in the previous version.
Did I miss this option?
If it exists, why not make it the default so that it works for someone
ignoring Unic
It works for win32 only for now. Only the system unit is finished. Work
in progress...
Sounds great so far !
Is there a document on how exactly it is going to work (will a common
String type get a dynamic encoding specification, or will there be
different String types for each encoding variant)?
Martin Friebe wrote:
I must agree with the "FPC cannot do it all automatically" line (as
much as I regret it, and admit the beauty there would be if FPC could).
What I mean is:
1) Any application/program that currently compiles and works (using
non-UTF8, never mind if ASCII or ANSI) will keep wo
On Mon, Nov 24, 2008 at 3:55 PM, Jeff Wormsley <[EMAIL PROTECTED]> wrote:
> such as SetLength, Length, stringvar[index], copy(string, index, count), pos
> etc. cannot work 100% reliably. You don't know what the programmer wants
> when he says stringvar[3]. Does he mean the third character in the
With plain strings, or Ansi strings, we have code that works today.
If you change any of those to UTF*, then code that uses things such as
SetLength, Length, stringvar[index], copy(string, index, count), pos
etc. cannot work 100% reliably. You don't know what the programmer
wants when he sa
From: "Michael Schnell" <[EMAIL PROTECTED]>
It works for win32 only for now. Only the system unit is finished.
Work in progress...
Sounds great so far !
Is there a document on how exactly it is going to work (will a
common String type get a dynamic coding specification or will there
be diffe