Re: [Lazarus] cwstring in arm-linux

Žilvinas Ledas Sat, 22 Oct 2011 02:54:58 -0700

Hi,

On 2011-10-22 00:48, Hans-Peter Diettrich wrote:

Žilvinas Ledas wrote:
Hello,
On 2011-10-21 10:43, Michael Schnell wrote:
Of course you are right, but "move" and friends are "near-hardware programming" for those who know what they are doing, whereas basic (legacy) string operations like "myChar := myString[i]" are "office-level programming" and thus should work as a dummy expects.
What if a file on the user's computer has a 4-byte [visible] character as its 8th character and you, for example, want to get an 8-character file name? In this case you split that 4-byte character and end up with garbage.
Then you (or your boss) didn't understand the meaning of "4 characters". (Logical) characters are different from physical Chars, in every MBCS codepage.
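
The split described above can be reproduced at the byte level. The following is a minimal sketch in Python (not FPC code), assuming the file name is held as UTF-8 bytes; the helper name `truncate_utf8` and the sample file name are made up for illustration. It only protects code point boundaries and does not try to keep combining sequences together.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut data to at most max_bytes without splitting a UTF-8 sequence."""
    if len(data) <= max_bytes:
        return data
    cut = max_bytes
    # Continuation bytes have the form 0b10xxxxxx. Back up until the byte
    # at the cut position is the first byte of a character.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]


# Hypothetical file name: 7 ASCII bytes, then U+1D11E (4 bytes in UTF-8).
name = "report_\U0001D11E.txt".encode("utf-8")

naive = name[:8]               # stops after the lead byte of the 4-byte character
safe = truncate_utf8(name, 8)  # stops just before that character
```

Here `naive` is no longer valid UTF-8, while `safe` is shorter than 8 bytes but still decodes cleanly.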

I know that logical characters are different from physical ones. I was trying to make the point that even when using UTF16 you MUST check any string coming from the outside world.
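
The same kind of check applies to UTF-16, where a character outside the Basic Multilingual Plane occupies two 16-bit code units (a surrogate pair). A minimal sketch in Python, assuming the string is handled as a list of code units; `utf16_units` and `split_utf16_safely` are illustrative names, not functions from any library.

```python
def utf16_units(text: str) -> list:
    """Return the UTF-16 code units of text as integers."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def split_utf16_safely(units: list, index: int) -> int:
    """Return a split position <= index that does not separate a surrogate pair."""
    if 0 < index < len(units):
        prev_is_high = 0xD800 <= units[index - 1] <= 0xDBFF
        next_is_low = 0xDC00 <= units[index] <= 0xDFFF
        if prev_is_high and next_is_low:
            return index - 1  # move the split in front of the pair
    return index


# 7 code points, but 8 code units: U+1D11E is stored as two surrogates.
units = utf16_units("abc\U0001D11Edef")
split_at = split_utf16_safely(units, 4)  # index 4 falls inside the pair
```

Indexing by code unit, as "myString[i]" does on a wide string, lands inside the pair at index 4, so the split position has to be moved back by one.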

What if the user inputs in your text field (or as a command line parameter, or anywhere else) a string containing a 4-byte character and you split that string on that character? (For example, when showing some kind of summary of his input.) Don't forget that the user can input characters by copy-pasting them from the web, not only by using his keyboard!
See above. With proportional fonts, counting characters is a bad idea; instead the width of the displayed string (in pixels) should be used. Then you can also deal with languages and character sets which use ligatures and the like. Even with monospaced fonts the "characters" (glyphs) can have different widths, in multiples of the basic width, e.g. for Chinese or other Eastern character sets.
So, if you want to write PROFESSIONAL software with any user input, you must handle 4-byte characters at every place you get user input.
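
The monospaced case can be approximated without a GUI. The sketch below, in Python, counts display cells using the Unicode East Asian Width property: wide and fullwidth characters count as two cells, combining marks as none. The function name `display_columns` is illustrative, and this is only an approximation; for proportional fonts the toolkit's own text measurement in pixels is still needed.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Approximate the width of text in monospaced cells."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn on the previous glyph
        # "W" (wide) and "F" (fullwidth) glyphs take two cells.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


latin = display_columns("abc")        # three narrow glyphs
cjk = display_columns("日本語")        # three wide glyphs
accent = display_columns("e\u0301")   # "e" plus a combining acute accent
```

The three CJK characters have the same character count as "abc" but need twice the columns, which is why a width limit is better expressed in cells or pixels than in characters.
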
Counting characters then is a bad idea, see above.
Otherwise you leave a chance of getting garbage and showing it to the user. Is this really easier than using UTF8 everywhere?
My personal experience: I am maintaining (as a hobby project) a multi-language dictionary program (a screenshot: http://2.bp.blogspot.com/_3-IaodGIbVQ/TMHY-l9M4sI/AAAAAAAAAak/AbtShWq0ZUQ/s1600/KZod_screen_win7.png
Great :-)
) and it involves quite a bit of [multilingual] string manipulation. When I migrated from Delphi to Lazarus I didn't know about the requirement that all (GUI) strings must be UTF8, and I had no problems migrating! Yes, afterwards I tweaked some calls to RTL (mostly file handling) functions that expected ANSI encoding, but this is not a problem of UTF8, but of the RTL being (mostly) ANSI.
From which Delphi version did you migrate?
What encoding did you use in Delphi?

From Delphi 5.

Actually, I do not quite remember now what I was using :) I think it was a mix of ansi/wide/utf8 strings.



Regards,
Žilvinas Ledas

--
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus.freepascal.org
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
