Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
UpperCase, LowerCase, CapitalCase, WordBreak, ParagraphBreak, ... almost all have some language exceptions. I don't doubt that you are right here, but I don't think that there is any support for this in the RTL. So it seems to be a lot less relevant than general Unicode handling. So I thin

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread listmember
On 2008-10-24 02:46, Felipe Monteiro de Carvalho wrote: I agree with Daniël on this one. Simplify. ë --> Ë always If you need something which takes into consideration the language then build another routine with more parameters. It's not that simple. How would you uppercase this piece of str

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Felipe Monteiro de Carvalho
I agree with Daniël on this one. Simplify. ë --> Ë always If you need something which takes into consideration the language then build another routine with more parameters. -- Felipe Monteiro de Carvalho ___ fpc-devel maillist - fpc-devel@lists.freep
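The split proposed here (one simple, language-independent routine plus a separate routine that takes extra parameters) can be sketched roughly as follows. This is Python rather than Pascal and is not FPC RTL code; the names `simple_upper`, `upper_for_language` and the `LANGUAGE_OVERRIDES` table are hypothetical, and the table lists only the well-known Turkish dotted/dotless i rule as an example.

```python
# Illustrative sketch only; all names below are hypothetical.

def simple_upper(text: str) -> str:
    """Language-independent uppercasing: ë always becomes Ë."""
    return text.upper()


# Per-language exceptions, consulted before the default mapping.
# Only the Turkish rule (i -> İ, ı -> I) is listed; a real table
# would be larger and is outside the scope of this sketch.
LANGUAGE_OVERRIDES = {
    "tr": {"i": "\u0130", "\u0131": "I"},
}


def upper_for_language(text: str, language: str) -> str:
    """Uppercase `text`, applying the exceptions registered for `language`."""
    overrides = LANGUAGE_OVERRIDES.get(language, {})
    return "".join(overrides.get(ch, ch.upper()) for ch in text)
```

Callers that do not care about the language use `simple_upper`; only callers that know the language pay for the extra parameter.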

Re[2]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread JoshyFun
Hello listmember, Thursday, October 23, 2008, 11:58:51 PM, you wrote: l> Yes, it is imperative that we know the language the word is in, so that l> UpperCase("sólo", langSpanish) --> "SÓLO" l> UpperCase("solo", langSpanish) --> "SOLO" l> Otherwise, we may end up altering the meaning of the te

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Vincent Snijders
Michael Van Canneyt wrote: On Thu, 23 Oct 2008, Vincent Snijders wrote: Michael Van Canneyt wrote: And did you fix the 'TObject not found' with a short-term solution ? :-) Maybe svn up -r11887 (in fpc/trunk) home: >svn log -r 11887 .

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread listmember
> DM> Example: In Dutch uppercase characters generally do not get > tremas: Daniël becomes DANIEL. Should an uppercase routine worry? > No, this is a spelling convention, the correct uppercase of ë is > Ë, we should not confuse spelling with uppercasing. No. This is not a spelling convention. It

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Michael Van Canneyt
On Thu, 23 Oct 2008, Vincent Snijders wrote: > Michael Van Canneyt wrote: > > > > And did you fix the 'TObject not found' with a short-term solution ? :-) > > Maybe svn up -r11887 (in fpc/trunk) home: >svn log -r 11887 . --

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Vincent Snijders
Michael Van Canneyt wrote: And did you fix the 'TObject not found' with a short-term solution ? :-) Maybe svn up -r11887 (in fpc/trunk) Vincent ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/mailman/listinfo/

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Michael Van Canneyt
On Thu, 23 Oct 2008, Mattias Gaertner wrote: > On Thu, 23 Oct 2008 08:53:27 +0200 (CEST) > "Peter Vreman" <[EMAIL PROTECTED]> wrote: > > > > On Wed, 22 Oct 2008 10:32:36 +0200 (CEST) > > > "Peter Vreman" <[EMAIL PROTECTED]> wrote: > > > > > >> > As of version 2.3.1, the compiler by itself indic

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Mattias Gaertner
On Thu, 23 Oct 2008 08:53:27 +0200 (CEST) "Peter Vreman" <[EMAIL PROTECTED]> wrote: > > On Wed, 22 Oct 2008 10:32:36 +0200 (CEST) > > "Peter Vreman" <[EMAIL PROTECTED]> wrote: > > > >> > As of version 2.3.1, the compiler by itself indicates all the > >> > various features it supports with FPC_HAS_

Re[3]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Daniël Mantione
On Thu, 23 Oct 2008, JoshyFun wrote: Hello Daniël, Thursday, October 23, 2008, 5:34:59 PM, you wrote: DM> Don't overexaggerate, this is true with plain ASCII as well. Non-English DM> software exists already for over 5 decades and nothing has stopped us to DM> write code that performs the fu

Re[3]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread JoshyFun
Hello Daniël, Thursday, October 23, 2008, 5:34:59 PM, you wrote: DM> Don't overexaggerate, this is true with plain ASCII as well. Non-English DM> software exists already for over 5 decades and nothing has stopped us to DM> write code that performs the functions you name. I'm not overexaggerating,

Re[2]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Daniël Mantione
On Thu, 23 Oct 2008, JoshyFun wrote: Hello Michael, Thursday, October 23, 2008, 1:46:48 PM, you wrote: More importantly, most of such routines will be implicitly tied to a certain language or language group already. MS> Which kind of UCS2 based function do you think are tied to a MS>

Re[2]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread JoshyFun
Hello Michael, Thursday, October 23, 2008, 1:46:48 PM, you wrote: >> More importantly, most of such routines will be implicitly tied to a >> certain language or language group already. >> MS> Which kind of UCS2 based function do you think are tied to a MS> language(group) ? UpperCase, Lowe

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
http://www.unicode.org/reports/tr9/ Thanks. I see. (In fact I even did do embedded software for a display that can show Hebrew text. But this was with ANSI code.) -Michael

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Marc Weustink
Michael Schnell wrote: Since it converts the UTF8 file internally to UCS2 on read before editing. Seems really silly to me. No it's not. This way you have internally only to support 2 editors. One with bytechars and one with wordchars (ignoring surrogates and other stuff) But the file len

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
I doubt that you will never need to support decomposed characters (such as ä being encoded as basically "a¨"). It's not that uncommon. This is the nasty old stuff Unicode should help us get rid of. -Michael
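To make the point about decomposed characters concrete, here is a small Python sketch (not FPC code) showing that "ä" has two valid Unicode spellings that compare unequal until both are normalized. The helper name `same_text` is hypothetical.

```python
import unicodedata

composed = "\u00e4"     # ä as a single precomposed code point
decomposed = "a\u0308"  # a followed by COMBINING DIAERESIS


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

A plain comparison of `composed` and `decomposed` reports a difference, while `same_text` treats them as the same text. Any routine that counts or indexes "characters" has to decide which of the two forms it expects.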

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Martin Schreiber
On Thursday 23 October 2008 13.58:04 Michael Schnell wrote: > > Bidi stuff? You are aware of the fact that unicode strings can contain > > e.g. bidi markers? > > Sorry, never heard of bidi :( > Bidirectional text. Much more important than the hypothetical codepoints above the BMP. MSEgui does not

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
Since it converts the UTF8 file internally to UCS2 on read before editing. Seems really silly to me. But the file length really indicated that it's utf8 coded and when looking at the file with WinCommander's hex viewer it's utf-8. So I suppose that you are right and the nasty trick is Ultrae

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Michael Schnell wrote: > >> Bidi stuff? You are aware of the fact that unicode strings can contain >> e.g. bidi markers? > Sorry, never heard of bidi :( > http://www.unicode.org/reports/tr9/
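As a rough illustration of what the linked report is about: every code point carries a bidirectional class, and some code points exist only as direction markers. The Python sketch below (not FPC code; `bidi_classes` is a hypothetical helper) simply looks those classes up in the Unicode character database.

```python
import unicodedata


def bidi_classes(text: str) -> list:
    """Return the Unicode bidirectional class of each code point in `text`."""
    return [unicodedata.bidirectional(ch) for ch in text]


latin_then_hebrew = "a\u05d0"     # LATIN SMALL LETTER A, HEBREW LETTER ALEF
right_to_left_mark = "\u200f"     # invisible marker with strong R direction
right_to_left_override = "\u202e" # explicit override control character
```

Here `bidi_classes(latin_then_hebrew)` yields `['L', 'R']`, which is why a string routine that assumes left-to-right display order can be tied to a language group without saying so.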

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
Bidi stuff? You are aware of the fact that unicode strings can contain e.g. bidi markers? Sorry, never heard of bidi :( -Michael

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Marc Weustink
Michael Schnell wrote: Ultraedit might fool you here. It edits either ansi or ucs2. If you have a utf8 encoded file, it will show the contents in hex as being ucs2 That might be. But it would even virtually insert a BOM ?!?!?!? Why should it do this when using the hex editor ? Since it conv

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Martin Schreiber
On Thursday 23 October 2008 13.31:30 Florian Klaempfl wrote: > This is also a simplified view. > - firstly, which real world (!) task really requires to execute an > operation like this, mostly it's something like copy(s,pos(...),...); > - secondly, a properly coded utf-16 application shouldn't do

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Michael Schnell wrote: > >> More importantly, most of such routines will be implicitly tied to a >> certain language or language group already. >> > Which kind of UCS2 based function do you think are tied to a > language(group) ? Bidi stuff? You are aware of the fact that unicode strings ca

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Jonas Maebe
On 23 Oct 2008, at 13:41, Michael Schnell wrote: utf-16 application shouldn't do this either: it doesn't handle surrogates properly Right you are. For me WideString is UCS2 and not UTF16, as I regard it as a sequence of WideChar so that the Unicode user code can be done using WideChar and W

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
More importantly, most of such routines will be implicitly tied to a certain language or language group already. Which kind of UCS2 based function do you think are tied to a language(group) ? -Michael

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
Ultraedit might fool you here. It edits either ansi or ucs2. If you have a utf8 encoded file, it will show the contents in hex as being ucs2 That might be. But it would even virtually insert a BOM ?!?!?!? Why should it do this when using the hex editor ? -Michael

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Marco van de Voort
In our previous episode, Florian Klaempfl said: > > But if you use UTF8String you need to be aware that you can't do simple > > and totally normal things like s := copy(s, 3); to get the first three > > characters of a string. Really finding the first three characters of a > > string is an interest
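The concern about taking "the first characters" of a UTF-8 string can be shown with a short Python sketch (not FPC code). Slicing the raw bytes can cut a multi-byte sequence in half; the hypothetical helper `first_chars_utf8` decodes first and counts code points instead.

```python
def first_chars_utf8(data: bytes, count: int) -> bytes:
    """Return the UTF-8 bytes of the first `count` code points of `data`."""
    return data.decode("utf-8")[:count].encode("utf-8")


name = "Daniël".encode("utf-8")  # 6 code points, 7 bytes (ë takes two)

# Byte-wise slicing stops in the middle of ë and leaves invalid UTF-8.
broken = name[:5]

# Code-point-wise slicing keeps the ë intact.
intact = first_chars_utf8(name, 5)
```

Note that this counts code points, not user-perceived characters; decomposed sequences would still need extra care.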

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
utf-16 application shouldn't do this either: it doesn't handle surrogates properly Right you are. For me WideString is UCS2 and not UTF16, as I regard it as a sequence of WideChar so that the Unicode user code can be done using WideChar and WideString. WideChar only has 16 Bits. So this rest
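The difference between treating a 16-bit string as UCS-2 and as UTF-16 only shows up for code points above the BMP. The following Python sketch (not FPC code; `utf16_code_units` is a hypothetical helper) lists the 16-bit units a string occupies.

```python
def utf16_code_units(text: str) -> list:
    """Return the 16-bit code units of `text` when encoded as UTF-16."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


bmp_char = "\u00eb"           # ë, inside the BMP: one 16-bit unit
non_bmp_char = "\U0001d11e"   # MUSICAL SYMBOL G CLEF: needs a surrogate pair
```

`utf16_code_units(non_bmp_char)` gives two units, 0xD834 and 0xDD1E. Code that indexes a 16-bit string one element at a time sees those as two separate "characters", which is exactly the surrogate handling problem raised above.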

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Marc Weustink
Michael Schnell wrote: The compiler definitively eats no ucs-2 encoded sources. I did check several times: My source file looks like this when I open it with Ultra-Edit and tell to show it in Hex: FF FE 75 00 6E 00 69 00 74 00 20 00 55 00 6E 00 ..u.n.i.t. .U.n. Ultraedit might fool you h

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Michael Schnell wrote: > >> The conversion >> utf-8<->utf-16 is a very expensive operation and the compiler has to >> insert it all over the place and people would cry about the performance >> of their programs. > Of course I do agree. > > If you want to care about performance you need to know

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
If you want widestring, then maybe mseide is a better option for you. Again I do know this, and I in fact don't have a project that needs Unicode. But the cause why I started this thread is to help making Lazarus / FPC even more useful. -Michael

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Vincent Snijders
Michael Schnell wrote: The conversion utf-8<->utf-16 is a very expensive operation and the compiler has to insert it all over the place and people would cry about the performance of their programs. Of course I do agree. If you want to care about performance you need to know what to do: Eit

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
The conversion utf-8<->utf-16 is a very expensive operation and the compiler has to insert it all over the place and people would cry about the performance of their programs. Of course I do agree. If you want to care about performance you need to know what to do: Either use WideString "all ov

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
As has been said before: the compiler itself simply does not support UCS-2. Regardless of any BOM, compiler setting or Lazarus setting, it will not understand it. See my other post in this thread: Windows XP seems to play some tricks on us here so that Ultraedit sees the UCS2 coded file whil

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Daniël Mantione
On Thu, 23 Oct 2008, Michael Schnell wrote: The compiler definitively eats no ucs-2 encoded sources. I did check several times: My source file looks like this when I open it with Ultra-Edit and tell to show it in Hex: FF FE 75 00 6E 00 69 00 74 00 20 00 55 00 6E 00 ..u.n.i.t. .U.n. Now

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
The compiler definitively eats no ucs-2 encoded sources. I did check several times: My source file looks like this when I open it with Ultra-Edit and tell to show it in Hex: FF FE 75 00 6E 00 69 00 74 00 20 00 55 00 6E 00 ..u.n.i.t. .U.n. Now I created a Delphi program and read the file wi
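Checking the raw bytes outside the editor, as described here, can be sketched in a few lines of Python (not Delphi or FPC code). The sketch takes the sixteen bytes from the hex dump above, looks at the byte order mark, and decodes them; `sniff_bom` is a hypothetical helper that only knows the three BOMs relevant to this thread.

```python
import codecs

# The bytes quoted in the hex dump above.
dump = bytes.fromhex("FF FE 75 00 6E 00 69 00 74 00 20 00 55 00 6E 00")


def sniff_bom(data: bytes):
    """Guess the encoding from a leading byte order mark, or return None.

    UTF-32 is deliberately ignored here; its little-endian BOM also
    starts with FF FE, so a complete detector would check it first.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


# The "utf-16" codec consumes the BOM and picks the byte order from it.
text = dump.decode("utf-16")
```

If the file on disk really starts with FF FE, `sniff_bom` reports little-endian UTF-16 and `text` begins with `unit Un`; a UTF-8 file with a BOM would start with EF BB BF instead. Reading the file with a tool that does no conversion is what settles which of the two the editor actually saved.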

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Jonas Maebe
On 23 Oct 2008, at 12:20, Michael Schnell wrote: No no, a string with unicode characters is interpreted by the compiler as widestring constant, never as UTF-8 ansistring constant. If it does otherwise, the compiler probably does not interpret your source code as Unicode. The issue might be

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
No no, a string with unicode characters is interpreted by the compiler as widestring constant, never as UTF-8 ansistring constant. If it does otherwise, the compiler probably does not interpret your source code as Unicode. The issue might be the UCS-2 encoding of your source, perhaps try to

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Michael Schnell wrote: > > A decent system should be able to do the necessary conversions > automatically: This is a simplified view which ignores the resource wasting of this approach not visible in the academic example below. The conversion utf-8<->utf-16 is a very expensive operation and

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Daniël Mantione wrote: > The issue might be the UCS-2 encoding of your source, perhaps try to > feed the compiler UTF-8, I didn't even know the compiler accepts UCS-2, > it may not work correctly. > The compiler definitively eats no ucs-2 encoded sources.

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Daniël Mantione
On Thu, 23 Oct 2008, Michael Schnell wrote: Then you don't understand it yet, I think. Maybe. If the compiler knows your source file is UTF-8 (by BOM or directive), the compiler generates a widestring constant and no conversion function is called when assigning to a widestring. In m

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
Then you don't understand it yet, I think. Maybe. If the compiler knows your source file is UTF-8 (by BOM or directive), the compiler generates a widestring constant and no conversion function is called when assigning to a widestring. In my test the source code is not UTF8 but UCS2 and do

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Daniël Mantione
On Thu, 23 Oct 2008, Michael Schnell wrote: I suppose this might solve the constant assignment on the fly, but in fact I feel that the compiler should generate a WideString constant at compile time instead of calling a conversion function at run time. Then you don't understand it yet, I t

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
Please read the entire thread, and if you have more question afterwards, then ask them. In fact I don't have questions, but in this regard the way the compiler (in Lazarus with default settings) works is very dissatisfying. IMHO the only cure is to make the compiler aware of the UTF8Type,

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
AFAIK the compiler reads the source as non-utf8 (latin or some 8 bit encoding). This leads to other things too, like identifiers cannot contain utf8. This was discussed in the German Lazarus Forum. Here I got a funny result: when I right-click the Lazarus-Code-Editor I see that the file cod

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
If anybody say another thing "UTF8String" is just an alias for "ansistring" so they are exactly the same thing, but with different name which in my case I'm using to be clear in code where things are utf-8 encoded. I do know that in the current implementation "UTF8String" is just an alias fo