On 22.08.2012 21:45, Graeme Geldenhuys wrote:
On 22 August 2012 10:19, Sven Barth wrote:
Depending on how they implement it this might indeed be an interesting
feature that we could implement (cherry picking Delphi features ^^).
It's already possible, just use IInterface (and TInterfacedOb
On 08/22/2012 07:30 PM, Ivanko B wrote:
...
You need to stop your mailer from constantly replying to the wrong
message. This makes the forum rather unreadable.
-Michael
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepasc
On 22 August 2012 10:19, Sven Barth wrote:
> Depending on how they implement it this might indeed be an interesting
> feature that we could implement (cherry picking Delphi features ^^).
It's already possible, just use IInterface (and TInterfacedObject)
everywhere. :)
--
Regards,
- Graeme -
On Wed, 22 Aug 2012 22:30:52 +0500
Ivanko B wrote:
> Even if you would implement something like the Unix "find" or "ls"
> programs, they would be more likely to be limited by I/O and all sorts
> of file/directory attribute lookups than code page conversions of file
> names.
>
> 1) I/
Even if you would implement something like the Unix "find" or "ls"
programs, they would be more likely to be limited by I/O and all sorts
of file/directory attribute lookups than code page conversions of file
names.
1) I/O is heavily cached on modern a-lot-of-RAM machines & 2)
conversi
Marco van de Voort wrote:
In our previous episode, Hans-Peter Diettrich said:
this is a huge move for a native code compiler. If FPC will follow, this
sounds like a lot of work.
I don't see much work here. The code for handling interface references
exists, it only has to be applied to the new
In our previous episode, Hans-Peter Diettrich said:
> > this is a huge move for a native code compiler. If FPC will follow, this
> > sounds like a lot of work.
>
> I don't see much work here. The code for handling interface references
> exists, it only has to be applied to the new TObject type,
Graeme Geldenhuys wrote on Wed, 22 Aug 2012:
Accessing a 100k of files (filenames to be exact) in a UTF-8
environment (Linux), which must all be stored in a UTF-16 string type.
That's lots and lots of encoding conversions right there - in a tight
loop.
It's nevertheless a bad example, because
On 08/21/2012 02:53 PM, Graeme Geldenhuys wrote:
I have a program that does exactly that... Loads files to do CRC
checking to see what changed.
Hmm. I feel that reading files takes a lot more CPU time than
converting the strings at the border of the LCL.
This of course does not include convert
Graeme Geldenhuys wrote:
On 22 August 2012 00:54, Hans-Peter Diettrich wrote:
IMO string conversion and CRC are mutually exclusive.
Accessing a 100k of files (filenames to be exact) in a UTF-8
environment (Linux), which must all be stored in a UTF-16 string type.
Filenames typically deser
Marco van de Voort wrote:
In our previous episode, Hans-Peter Diettrich said:
utf8/16 -> ansi are a bit more involved. (since mapping many chars to few,
naive implementation requiring large lookup sets)
A single 256 element array can be used for both directions. In Ansi to
Unicode the char va
Michael Schnell wrote:
On 08/21/2012 02:53 PM, Graeme Geldenhuys wrote:
http://blogs.embarcadero.com/jtembarcadero/2012/08/20/xe3-and-beyond/
Other than politics, the big news regarding technology seems to be that
Objects (or whatever) seem to get reference counted and thus I
understand ".
On 22.08.2012 11:44, Marco van de Voort wrote:
In our previous episode, Sven Barth said:
Objects (or whatever) seem to get reference counted and thus I
understand ".Free" gets obsolete (like with Prism). Without assessment -
this is a huge move for a native code compiler. If FPC will follow, t
On Wed, 22 Aug 2012 11:35:17 +0200
Michael Schnell wrote:
> On 08/22/2012 10:56 AM, Mattias Gaertner wrote:
> > The UTF-8 optimized functions need UTF-16 versions. But why do you
> > think it needs a "really thorough rework"?
> Guesswork :-)
> > The LCL itself already has some widgetsets using
In our previous episode, Sven Barth said:
> > Objects (or whatever) seem to get reference counted and thus I
> > understand ".Free" gets obsolete (like with Prism). Without assessment -
> > this is a huge move for a native code compiler. If FPC will follow, this
> > sounds like a lot of work.
>
>
On 08/22/2012 11:19 AM, Sven Barth wrote:
Depending on how they implement it this might indeed be an interesting
feature that we could implement (cherry picking Delphi features ^^).
It will be interesting to watch if they might implement other Prism
goodies as well (e.g. parallel loops and f
On 08/22/2012 10:56 AM, Mattias Gaertner wrote:
The UTF-8 optimized functions need UTF-16 versions. But why do you
think it needs a "really thorough rework"?
Guesswork :-)
The LCL itself already has some widgetsets using UTF-16.
Yep. So there the conversion needs to be dropped, while with the
On 22.08.2012 11:08, Michael Schnell wrote:
On 08/21/2012 02:53 PM, Graeme Geldenhuys wrote:
http://blogs.embarcadero.com/jtembarcadero/2012/08/20/xe3-and-beyond/
Other than politics, the big news regarding technology seems to be that
Objects (or whatever) seem to get reference counted and th
On 08/21/2012 02:53 PM, Graeme Geldenhuys wrote:
http://blogs.embarcadero.com/jtembarcadero/2012/08/20/xe3-and-beyond/
Other than politics, the big news regarding technology seems to be that
Objects (or whatever) seem to get reference counted and thus I
understand ".Free" gets obsolete (like wi
On Wed, 22 Aug 2012 10:37:45 +0200
Michael Schnell wrote:
> On 08/21/2012 02:53 PM, Mattias Gaertner wrote:
> > If the FCL moves to another string or starts enforcing an encoding the
> > LCL has to be adapted.
>
> I believe if "String" becomes a sequence of 16 bit entities instead of 8
> bit e
On 08/21/2012 02:53 PM, Mattias Gaertner wrote:
If the FCL moves to another string or starts enforcing an encoding the
LCL has to be adapted.
I believe if "String" becomes a sequence of 16 bit entities instead of 8
bit entities, the LCL needs a really thorough rework.
In the Lazarus form som
On 08/21/2012 02:53 PM, Mattias Gaertner wrote:
The LCL uses the same string as the FCL classes.
Yep:
type
TCaption = TTranslateString;
...
TTranslateString = type String;
The FCL uses 8-bit strings ...
Isn't this exactly what I tried to point out ? AFAIK in newer Delphi
TCaption is St
On 22 August 2012 00:54, Hans-Peter Diettrich wrote:
> IMO string conversion and CRC are mutually exclusive.
Accessing a 100k of files (filenames to be exact) in a UTF-8
environment (Linux), which must all be stored in a UTF-16 string type.
That's lots and lots of encoding conversions right there
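The per-name conversion being debated here can be sketched in a few lines of Free Pascal. This is only an illustration under stated assumptions: the file names in the array are made up, they are assumed to arrive as raw UTF-8 bytes in an AnsiString, and the helper name Utf16UnitsOf is hypothetical. UTF8Decode is the RTL routine from the System unit. The sketch shows where the conversion happens, not how expensive it is.

```pascal
program ConvertNamesDemo;
{$mode objfpc}{$H+}

const
  // Hypothetical file names, stored as raw UTF-8 bytes.
  // #$C3#$A9 is the two-byte UTF-8 encoding of U+00E9 (e with acute).
  Names: array[0..2] of AnsiString =
    ('readme.txt', 'caf'#$C3#$A9'.txt', 'src');

// Decode one UTF-8 name to UTF-16 and return its length
// in 16-bit code units. (Hypothetical helper for this sketch.)
function Utf16UnitsOf(const Utf8Name: AnsiString): SizeInt;
var
  W: UnicodeString;
begin
  W := UTF8Decode(Utf8Name);  // one conversion per file name
  Result := Length(W);
end;

var
  I: Integer;
begin
  // In the scenario from the thread this loop would run ~100k times.
  for I := Low(Names) to High(Names) do
    WriteLn(Length(Names[I]), ' UTF-8 bytes -> ',
            Utf16UnitsOf(Names[I]), ' UTF-16 code units');
end.
```

Timing such a loop against the file I/O it accompanies would be the way to settle the argument either way.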
In our previous episode, Hans-Peter Diettrich said:
> > utf8/16 -> ansi are a bit more involved. (since mapping many chars to few,
> > naive implementation requiring large lookup sets)
>
> A single 256 element array can be used for both directions. In Ansi to
> Unicode the char value is used to i
Graeme Geldenhuys wrote:
On 21 August 2012 13:03, Michael Schnell wrote:
With "not so often" I meant program runtime: it is usually not called in a
close long running loop.
I have a program that does exactly that... Loads files to do CRC
checking to see what changed. It's a recursive find-
Mattias Gaertner wrote:
length returns the number of characters.
the number of elements, which can be of any size (in arrays in general).
UTF8Length the number of codepoints.
There must also be a function to return the number of bytes.
Does someone know the name?
Length(s)*sizeof(s[1])
D
Marco van de Voort wrote:
utf8/16 -> ansi are a bit more involved. (since mapping many chars to few,
naive implementation requiring large lookup sets)
A single 256 element array can be used for both directions. In Ansi to
Unicode the char value is used to index the array of Unicode values,
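The single-table idea can be sketched as follows. The ANSI-to-Unicode direction is a plain array index, as described above. The reverse direction is cut off in this excerpt, so the linear search used here is an assumption of this sketch, not necessarily what was proposed. The table is filled with ISO-8859-1 (byte value equals code point) plus one Windows-1252 override; all identifiers are hypothetical.

```pascal
program AnsiTableDemo;
{$mode objfpc}{$H+}

type
  TCodePageTable = array[Byte] of WideChar;

var
  Table: TCodePageTable;

// Fill the table with ISO-8859-1 (byte value = code point), then
// override one entry the way Windows-1252 does. A real code page
// would override more entries; this is only an illustration.
procedure InitTable;
var
  I: Integer;
begin
  for I := 0 to 255 do
    Table[I] := WideChar(I);
  Table[$80] := WideChar($20AC);  // Windows-1252: byte $80 is the euro sign
end;

// ANSI -> Unicode: the byte value indexes the table directly.
function AnsiToUnicodeChar(B: Byte): WideChar;
begin
  Result := Table[B];
end;

// Unicode -> ANSI: assumed approach, a linear search of the same table.
// Returns False (and '?') when the character is not in the code page.
function UnicodeToAnsiChar(C: WideChar; out B: Byte): Boolean;
var
  I: Integer;
begin
  for I := 0 to 255 do
    if Table[I] = C then
    begin
      B := Byte(I);
      Exit(True);
    end;
  B := Ord('?');
  Result := False;
end;

var
  B: Byte;
begin
  InitTable;
  WriteLn('byte $80 -> U+', HexStr(Ord(AnsiToUnicodeChar($80)), 4));
  if UnicodeToAnsiChar(WideChar($20AC), B) then
    WriteLn('U+20AC -> byte ', B);
  if not UnicodeToAnsiChar(WideChar($0416), B) then
    WriteLn('U+0416 is not representable, substituted byte ', B);
end.
```

The search makes the reverse direction up to 256 comparisons per character, which is the kind of cost the "large lookup sets" remark is weighing against.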
On 21.08.2012 17:27, Paul Ishenin wrote:
21.08.12, 23:21, Sven Barth wrote:
There must also be a function to return the number of bytes.
Does someone know the name?
Length(s) * SizeOf(s[1])
It has the name ByteLength()
O.o
Learned something new again...
Regards,
Sven
On Tue, 21 Aug 2012 17:21:27 +0200
Sven Barth wrote:
>[...]
> > length returns the number of characters.
> > UTF8Length the number of codepoints.
> > There must also be a function to return the number of bytes.
> > Does someone know the name?
>
> Length(s) * SizeOf(s[1])
Cheater. ;)
Mattias
21.08.12, 23:21, Sven Barth wrote:
There must also be a function to return the number of bytes.
Does someone know the name?
Length(s) * SizeOf(s[1])
It has the name ByteLength()
Best regards,
Paul Ishenin
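The expression quoted in this exchange can be tried out with a minimal Free Pascal sketch. It uses only Length and SizeOf, as in the thread; the wrapper function names are hypothetical, and ByteLength() itself is not called because its exact declaration is not shown here.

```pascal
program ByteCountDemo;
{$mode objfpc}{$H+}

// Length() counts elements (code units); multiplying by the element
// size gives the payload size in bytes. SizeOf(S[1]) is resolved at
// compile time, so it is safe even for an empty string.
function ByteCountAnsi(const S: AnsiString): SizeInt;
begin
  Result := Length(S) * SizeOf(S[1]);   // SizeOf(AnsiChar) = 1
end;

function ByteCountUnicode(const S: UnicodeString): SizeInt;
begin
  Result := Length(S) * SizeOf(S[1]);   // SizeOf(WideChar) = 2
end;

var
  A: AnsiString;
  U: UnicodeString;
begin
  A := 'abc';
  U := 'abc';
  WriteLn('AnsiString:    ', Length(A), ' elements, ', ByteCountAnsi(A), ' bytes');
  WriteLn('UnicodeString: ', Length(U), ' elements, ', ByteCountUnicode(U), ' bytes');
end.
```

Neither number is a count of code points or of printable characters; that is the separate UTF8Length question raised earlier in the thread.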
On 21.08.2012 16:44, Mattias Gaertner wrote:
On Tue, 21 Aug 2012 15:11:56 +0100
Graeme Geldenhuys wrote:
On 21 August 2012 14:54, Marco van de Voort wrote:
Doesn't sound wise. length(stringtype)=n should mean that the string takes
sizeof(char)*n bytes. (give or take the #0#0)
I'm not
On Tue, 21 Aug 2012 15:38:31 +0200
"Ludo Brands" wrote:
>
> > > There is the large category of network apps. Most protocols
> > are utf8
> > > or have a clear preference for utf8 (json for example).
> > Databases are
> > > an extension of that and have the additional complication that they
On Tue, 21 Aug 2012 15:11:56 +0100
Graeme Geldenhuys wrote:
> On 21 August 2012 14:54, Marco van de Voort wrote:
> >
> > Doesn't sound wise. length(stringtype)=n should mean that the string takes
> > sizeof(char)*n bytes. (give or take the #0#0)
>
>
> I'm not sure what you are trying to accom
On Tue, 21 Aug 2012 10:23:10 -0300
Marcos Douglas wrote:
>[...]
> >> I guess there is no good solution for TStrings. Whatever string type is
> >> chosen, some programs will suffer.
> >
> > Why will some suffer? Simply default UnicodeString to the correct
> > encoding on each platform, and no perf
On 21 August 2012 14:54, Marco van de Voort wrote:
>
> Doesn't sound wise. length(stringtype)=n should mean that the string takes
> sizeof(char)*n bytes. (give or take the #0#0)
I'm not sure what you are trying to accomplish? Give me sample code
that will cause a problem.
In fpGUI I have UTF8L
In our previous episode, Graeme Geldenhuys said:
> The Char type would be defined as String[4] (max size in bytes of a
> unicode codepoint)
Doesn't sound wise. length(stringtype)=n should mean that the string takes
sizeof(char)*n bytes. (give or take the #0#0)
On 21 August 2012 14:13, Mattias Gaertner wrote:
> One string type and native encoding. Do you mean the current AnsiString?
I meant a string type that changes its encoding based on the platform
it is compiled for. UTF-16 under Windows, UTF-8 under others. The RTL
then uses that single string type
> > There is the large category of network apps. Most protocols
> are utf8
> > or have a clear preference for utf8 (json for example).
> Databases are
> > an extension of that and have the additional complication that they
> > can mix codepages at any level. These apps can be quite
> sensit
On Tue, Aug 21, 2012 at 6:09 AM, Graeme Geldenhuys
wrote:
> Hi,
>
> On 21 August 2012 09:32, Mattias Gaertner wrote:
>>
>> IMO unicodestring should be the same on all platforms, because
>> otherwise the character size switches per platform,
>
>
> Please define "character" in your sentence above.
On Tue, 21 Aug 2012 13:53:14 +0100
Graeme Geldenhuys wrote:
> On 21 August 2012 13:03, Michael Schnell wrote:
> > With "not so often" I meant program runtime: it is usually not called in a
> > close long running loop.
>
> I have a program that does exactly that... Loads files to do CRC
> check
On Tue, 21 Aug 2012 14:05:33 +0200
"Ludo Brands" wrote:
>
> >
> > Yes. But maybe these applications can be adapted easily.
> > This discussion should be about the issues where the
> > conversions matter and there is no simple workaround. It
> > would be good if everyone who knows such a probl
On 21 August 2012 13:03, Michael Schnell wrote:
> With "not so often" I meant program runtime: it is usually not called in a
> close long running loop.
I have a program that does exactly that... Loads files to do CRC
checking to see what changed. It's a recursive find-all that goes
through 100k
On Tue, 21 Aug 2012 14:22:17 +0200
Michael Schnell wrote:
> On 08/21/2012 11:22 AM, Mattias Gaertner wrote:
> >
> > Lazarus does not force "unicodestring" to anything for the simple
> > reason, that it does not use it. It only provides some functions for
> > converting UTF-8 to/from unicodestring
On 08/21/2012 11:22 AM, Mattias Gaertner wrote:
Lazarus does not force "unicodestring" to anything for the simple
reason, that it does not use it. It only provides some functions for
converting UTF-8 to/from unicodestring.
At the moment Lazarus does not even use UTF8String, because the RTL
does
On 08/21/2012 02:11 PM, Michael Schnell wrote:
So maybe it should not compile myString[i] at all
... and provide a decent enumerator syntax instead.
-Michael
On 08/21/2012 12:02 PM, Aleksa Todorovic wrote:
Yes, they will most probably be scattered all around, but then - it's
developer-related organizational challenge, not compiler one.
The compiler should not broadly produce code that does not work
as it did with a former version (that did not use Unicod
>
> Yes. But maybe these applications can be adapted easily.
> This discussion should be about the issues where the
> conversions matter and there is no simple workaround. It
> would be good if everyone who knows such a problem comes up
> with it now, so the FPC team can give an advice and/or
On 08/21/2012 01:09 PM, Graeme Geldenhuys wrote:
Maybe so, but it does debunk the statement "does not happen too often".
With "not so often" I meant program runtime: it is usually not called
in a close long running loop.
-Michael
Mattias Gaertner wrote on Tue, 21 Aug 2012:
But let's be realistic. Some conversions are not measurable
and are ok.
Case in point: the FPC Win32 RTL until now. It always uses the
ansistring versions of OS interface functions, while NT-based Windows
OSes internally all work with UTF-16. Th
On 08/21/2012 11:09 AM, Graeme Geldenhuys wrote:
Can't we just introduce UTF8String and UTF16String types? By the name
they clearly state what encoding they hold.
It does make sense to (optionally) provide dynamically encoded strings,
so that it is possible to do library functions that work wit
On Tue, 21 Aug 2012 12:09:52 +0100
Graeme Geldenhuys wrote:
>[...]
> >> This is a simple example, but look at all the conversions already. Now
> >> if UnicodeString uses the correct encoding on each platform, the
> >> conversions would be zero!
> >
> > No. On Windows you have to open UTF-8 files
Aleksa Todorovic wrote:
The problem here is that libraries floating around (including RTL and
FCL) use different string types (UnicodeString, UTF8String,
AnsiString), so the question is - is it possible to (re)write those
libraries in a generic way (RawByteString?), so they can work with any
s
Aleksa Todorovic wrote:
On Tue, Aug 21, 2012 at 10:16 AM, Ivanko B wrote:
Handling 1..4(6) bytes is less efficient than handling surrogate
*pairs*.
===
But surrogate pairs break array-like fast char access anyway, isn't it ?
It's also "broken" in UTF8 in the same way - so none
Hi,
On 21 August 2012 11:45, Mattias Gaertner wrote:
> I agree that TStringList can easily create a performance problem, but
> afaik loading a text into a GUI is not a good example to
> show conversion overhead.
Maybe so, but it does debunk the statement "does not happen too often".
>> This is
On Tue, 21 Aug 2012 15:12:03 +0500
Ivanko B wrote:
> Because these documents are in UTF-8 parsing is about 2-3
> times faster on these documents, searching is about 20 to 50% faster
> =
> Because your name is Latin ANSISTRING "Mattias Gaertner" :)
Actually my name is Gärtner.
The te
On 21 August 2012 11:32, Marco van de Voort wrote:
> All routines like capitalization (routinely used for case insensitive
> comparison) get a lot more complicated.
Obviously Unicode is a lot more complicated, because it is designed for
_all_ spoken and non-spoken languages. ASCII is minute in comp
On Tue, 21 Aug 2012 10:24:38 +0100
Graeme Geldenhuys wrote:
> On 21 August 2012 10:01, Mattias Gaertner wrote:
> >> > The conversion is done only when entering and exiting the OS / GUI
> >> > framework
> >> > calls. I understand this does not happen too often.
> >>
> >> I beg to differ.
> >
> >
In our previous episode, Graeme Geldenhuys said:
> On 21 August 2012 10:19, Ivanko B wrote:
> > Sure no problems for GUI. But how about processing large texts ?
>
> Same experience as before. I must add "processing large text" is a
> vague statement.
I think unicode or not is a bigger performanc
Because these documents are in UTF-8 parsing is about 2-3
times faster on these documents, searching is about 20 to 50% faster
=
Because your name is Latin ANSISTRING "Mattias Gaertner" :) But
Imagine gigabytes of 4 bytes/char UTF-8 text.
On Tue, 21 Aug 2012 14:19:44 +0500
Ivanko B wrote:
> I have implemented multiple text edit/display widgets that do plenty
> of string manipulation... all based on the UTF-8 encoding. I have
> suffered NO speed penalties.
>
> Sure no problems for GUI. But how about processing la
On Tue, Aug 21, 2012 at 11:41 AM, Mattias Gaertner
wrote:
>
> Theoretically you could rewrite the FCL to support UTF8String,
> UnicodeString and AnsiString. But not at the same time. In an
> application there will always be only one of them. So you have to ship for
> each flavor a whole FCL plus all
marcov wrote on Tue, 21 Aug 2012:
In our previous episode, Mattias Gaertner said:
For example under Linux file names are treated as UTF-8 but are only
bytes. They can and they do contain invalid UTF-8 characters.
If your program should support this, you must use a FindFirst
with UTF-8. To be
How well will your "access char via index" code perform on
that?
=
It'll mean "now is the time to switch to UCS-4" :)
On Tue, 21 Aug 2012 11:07:26 +0200
Michael Schnell wrote:
> On 08/21/2012 10:17 AM, Graeme Geldenhuys wrote:
> > if you want to do string comparisons, one option is to normalise the
> > text before you do a compare.
> Other than the conversion necessary with system-calls when a different
> en
In our previous episode, Mattias Gaertner said:
> > On 08/21/2012 10:32 AM, Mattias Gaertner wrote:
> > > IMO unicodestring should be the same on all platforms, because
> > > otherwise the character size switches per platform, which is hard to
> > > test and asking for trouble.
> > This does see
On Tue, 21 Aug 2012 11:17:24 +0200
Aleksa Todorovic wrote:
> On Tue, Aug 21, 2012 at 9:53 AM, Martin Schreiber wrote:
> > On 21.08.2012 09:31, Graeme Geldenhuys wrote:
> >
> >
> > Ehm, I did both. In the beginning MSEgui switched from Widestring to utf-8
> > encoded Ansistring because of the b
On 21 August 2012 10:19, Ivanko B wrote:
> Sure no problems for GUI. But how about processing large texts ?
Same experience as before. I must add "processing large text" is a
vague statement.
--
Regards,
- Graeme -
___
fpGUI - a cross-platform Fre
In our previous episode, Mattias Gaertner said:
>
> IMO unicodestring should be the same on all platforms, because
> otherwise the character size switches per platform, which is hard to
> test and asking for trouble.
I think the big issue is more about what "string" will be when the FPC is
compil
On 21 August 2012 10:16, Ivanko B wrote:
> Though I'm sure that Latin people don't suffer from slowness of
> utf-8 where utf-8 = ansistring.
And I gather you base your assumptions on MSEgui. MSEgui uses UCS-2,
*not* UTF-16. I also believe MSEgui doesn't bother with surrogate
pairs (please corr
On 21 August 2012 10:01, Mattias Gaertner wrote:
>> > The conversion is done only when entering and exiting the OS / GUI
>> > framework
>> > calls. I understand this does not happen too often.
>>
>> I beg to differ.
>
> Maybe you can name some example.
OK, let's assume I'm under Linux and fpGUI
On Tue, 21 Aug 2012 11:09:28 +0200
Michael Schnell wrote:
> On 08/21/2012 10:32 AM, Mattias Gaertner wrote:
> > IMO unicodestring should be the same on all platforms, because
> > otherwise the character size switches per platform, which is hard to
> > test and asking for trouble.
> This does s
I have implemented multiple text edit/display widgets that do plenty
of string manipulation... all based on the UTF-8 encoding. I have
suffered NO speed penalties.
Sure no problems for GUI. But how about processing large texts ?
On Tue, Aug 21, 2012 at 9:53 AM, Martin Schreiber wrote:
> On 21.08.2012 09:31, Graeme Geldenhuys wrote:
>
>
> Ehm, I did both. In the beginning MSEgui switched from Widestring to utf-8
> encoded Ansistring because of the buggy FPC widestring implementation
> (MSEgui started with Delphi/Kylix).
Performance heavily depends on what you do and you can find good
examples
==
Hmm.. are there implementations of UTF-8 substringing, string
comparison etc - but not using intermediate HEAVY normalizations
from/to fixed char length type for BOTH input arguments ?
Though I'm sure th
On 21 August 2012 08:53, Martin Schreiber wrote:
>>
>> Yet another myth
>
>
> Ehm, I did both. In the beginning MSEgui switched from Widestring to utf-8
Just because you had a bad experience doesn't doom the utf-8 encoding
forever. Maybe you just had a buggy implementation. No coder is
perfe
On 21 August 2012 09:41, Ivanko B wrote:
> UTF-8 is very-very slow compared to UCS-2 as to string manipulations
> so its best usage is encoding source files (as done in MSEide).
Please supply a test program that proves this. I don't believe you are correct.
I have implemented multiple text edit
Hi,
On 21 August 2012 09:32, Mattias Gaertner wrote:
>
> IMO unicodestring should be the same on all platforms, because
> otherwise the character size switches per platform,
Please define "character" in your sentence above. Are you referring to
a Unicode codepoint, or a "printable character"? I
On 08/21/2012 10:32 AM, Mattias Gaertner wrote:
IMO unicodestring should be the same on all platforms, because
otherwise the character size switches per platform, which is hard to
test and asking for trouble.
This does seem appropriate. But right now Delphi compatibility forces 16
Bits and Laz
On 08/21/2012 10:17 AM, Graeme Geldenhuys wrote:
if you want to do string comparisons, one option is to normalise the
text before you do a compare.
Other than the conversion necessary with system-calls when a different
encoding is used internally, comparing strings happens very often within
t
On 08/21/2012 10:15 AM, Graeme Geldenhuys wrote:
You're in for a surprise... With a statement that reads "It provides
direct access to serial ports, TAPI, and the Microsoft Speech API." it
should start sounding alarm bells for Linux developers.
Of course you are very right and silly me did not
On Tue, 21 Aug 2012 09:23:30 +0100
Graeme Geldenhuys wrote:
>[...]
> > The conversion is done only when entering and exiting the OS / GUI framework
> > calls. I understand this does not happen too often.
>
> I beg to differ.
Maybe you can name some example. Concrete problems can be solved,
abst
On Tue, Aug 21, 2012 at 10:16 AM, Ivanko B wrote:
>
> Handling 1..4(6) bytes is less efficient than handling surrogate
> *pairs*.
> ===
> But surrogate pairs break array-like fast char access anyway, isn't it ?
It's also "broken" in UTF8 in the same way - so none of them gets +1
on
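The point that indexed access stops meaning "the n-th character" in both encodings can be illustrated with a small Free Pascal sketch. It counts code points by skipping UTF-8 continuation bytes and UTF-16 low surrogates. It assumes well-formed input and does no validation; the function names are hypothetical.

```pascal
program CodePointCountDemo;
{$mode objfpc}{$H+}

// UTF-8: every code point has exactly one byte that is NOT a
// continuation byte (bit pattern 10xxxxxx). Assumes valid UTF-8.
function Utf8CodePointCount(const S: AnsiString): SizeInt;
var
  I: SizeInt;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) and $C0) <> $80 then
      Inc(Result);
end;

// UTF-16: every code point has exactly one unit that is NOT a low
// (trailing) surrogate, $DC00..$DFFF. Assumes valid UTF-16.
function Utf16CodePointCount(const S: UnicodeString): SizeInt;
var
  I: SizeInt;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) < $DC00) or (Ord(S[I]) > $DFFF) then
      Inc(Result);
end;

var
  A: AnsiString;
  U: UnicodeString;
begin
  // 'a' followed by U+1F600, a code point outside the BMP.
  A := 'a'#$F0#$9F#$98#$80;        // UTF-8: 1 + 4 bytes
  SetLength(U, 3);                 // UTF-16: 1 + 2 code units
  U[1] := WideChar($0061);
  U[2] := WideChar($D83D);         // high surrogate
  U[3] := WideChar($DE00);         // low surrogate
  WriteLn('UTF-8:  ', Length(A), ' bytes, ',
          Utf8CodePointCount(A), ' code points');
  WriteLn('UTF-16: ', Length(U), ' units, ',
          Utf16CodePointCount(U), ' code points');
end.
```

In both cases finding the n-th code point needs a scan from the start of the string, so neither encoding offers constant-time character indexing once text leaves the single-unit range.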
On Tue, 21 Aug 2012 13:41:38 +0500
Ivanko B wrote:
> But if you are such a UTF-16 (actually UCS-2 as
> that is what MSEgui supports) fan
> =
> If Martin can implement UTF-16 (with surrogate pair) support in MSEgui
> string units (and these units fully cover the code absent from the FPC
On 21.08.2012 09:32, Mattias Gaertner wrote:
On Mon, 20 Aug 2012 20:56:46 +0200
Florian Klämpfl wrote:
[...]
The current situation is:
- either somebody starts to implement support for unicodestring being
utf-8 (or whatever) on linux in a compatible way with the current
approach, then 2.8.0
On 21.08.2012 09:31, Graeme Geldenhuys wrote:
On 21 August 2012 09:13, Martin Schreiber wrote:
I disagree. Handling 1..4(6) bytes is less efficient than handling surrogate
*pairs*.
Yet another myth
Ehm, I did both. In the beginning MSEgui switched from Widestring to
utf-8 encoded Ans
But if you are such a UTF-16 (actually UCS-2 as
that is what MSEgui supports) fan
=
If Martin can implement UTF-16 (with surrogate pair) support in MSEgui
string units (and these units fully cover the code absent from the FPC RTL)
then the things are excellent.
PS:
UTF-8 is very-very sl
On Mon, 20 Aug 2012 20:56:46 +0200
Florian Klämpfl wrote:
>[...]
> The current situation is:
> - either somebody starts to implement support for unicodestring being
> utf-8 (or whatever) on linux in a compatible way with the current
> approach, then 2.8.0 will use this
> - nobody works on it, the
On 21 August 2012 09:13, Martin Schreiber wrote:
> I disagree. Handling 1..4(6) bytes is less efficient than handling surrogate
> *pairs*.
Yet another myth. But if you are such a UTF-16 (actually UCS-2 as
that is what MSEgui supports) fan, why isn't MSEgui source code stored
in UTF-16 encoding
Hi,
On 21 August 2012 08:37, Michael Schnell wrote:
>
> But does that really suggest taking the effort to support other Unicode
> variants ?
Yes, if you want to make the statement "FPC fully supports Unicode"
> The conversion is done only when entering and exiting the OS / GUI framework
> c
Hi,
On 21 August 2012 08:28, Michael Schnell wrote:
>
> How can it be OK regarding comparing strings, when all Unicode variants
> allow for multiple codings for the same single printable "character" (and
> moreover what "character" do the users regard as "equal").
The Unicode Standard covers al
Handling 1..4(6) bytes is less efficient than handling surrogate
*pairs*.
===
But surrogate pairs break array-like fast char access anyway, isn't it ?
And there's a lot of room for optimizing utf-8 operation for instance
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/.
Also a publicatio
On 21 August 2012 08:27, Michael Schnell wrote:
>
> I doubt that it will be possible to just compile it (e.g. for Linux) but
> with optimum compatibility of the compiler, porting the source code should
> be rather easy.
You're in for a surprise... With a statement that reads "It provides
direct
On Tuesday 21 August 2012 09:56:57 Ivanko B wrote:
> For non-fixed char length there's nothing better than UTF8 (default
> ASCII compatible, ready for any future alphabets,..). For fixed-char
> length (fast string operations etc) also there's nothing better than
> UCS-2 (the Earth coverage) & UCS-
For non-fixed char length there's nothing better than UTF8 (default
ASCII compatible, ready for any future alphabets,..). For fixed-char
length (fast string operations etc) also there's nothing better than
UCS-2 (the Earth coverage) & UCS-4 (the galaxy coverage).
The non-fixed char length UTF-
For non-fixed char length there's nothing better than UTF8 (default
ASCII compatible, ready for any future alphabets,..). For fixed-char
length (fast string operations etc) also there's nothing better than
UCS-2 (the Earth coverage) & UCS-4 (the galaxy coverage).
The non-fixed char length UTF-16 (
On Mon, 20 Aug 2012 18:46:29 +0100
Hans-Peter Diettrich wrote:
> Mattias Gaertner wrote:
>
> > I guess most people would say that "good multi language Unicode support
> > in FPC" requires a Unicode supporting RTL.
>
> Please clarify: *Unicode* or UTF-16 support?
>
> Unicode is covered by bot
Hi,
On 20 August 2012 23:26, Hans-Peter Diettrich wrote:
>
> UCS2 is nowadays known as the BMP (Basic Multilingual Plane) of full
> Unicode.
The UCS2 is considered obsolete! Nothing else needs to be said. :)
> Have a look at the full Unicode codepages, what is and what is not
> part of the BMP
Sorry:
I do think it would not harm to use UTF-16 as a default.
-Michael
On 08/20/2012 06:05 PM, Graeme Geldenhuys wrote:
* UnicodeString is always UTF-16 (so everything but Windows takes a
conversion penalty)!
This is true of course,
But does that really suggest taking the effort to support other Unicode
variants ?
The conversion is done only when entering and e
On 08/20/2012 08:53 PM, Ivanko B wrote:
Really, the team seems to be fighting for FPC + Lazarus to be capable of
building thousands of Delphi-based components (archivers, ciphers,
audio processors, etc.), the things people mostly like Delphi for and
which seldom use specific Delphi features causing problems