On Fri, 30 Mar 2012 17:01:04 +0200
Ladislav Slezak <[email protected]> wrote:

> 
> Hi all,
> 
> This is just a note for you in case you come across a strange UTF-8 string
> problem in YCP (maybe you already know that but for me it was quite a
> surprise):
> 
> The YCP string operator [] takes a _byte_ index into the string, while
> size(string) returns the _number_of_characters_. The problem is that when you
> combine both functions, the result will probably be buggy, see
> https://bugzilla.novell.com/show_bug.cgi?id=728588
> 
> Example from the bug:
> 
>   size("áa") => 2,
> 
> but
> 
>   "áa"[1] is not "a" as expected but the second _byte_ of the string, which is
>   one half of the "á" UTF-8 character; if you remove it you'll get garbage in
>   the string...
> 
> 
> Keep this in mind when iterating over YCP strings...
> 
> (I'm not sure whether fixing YCPString::[] would be a good idea, it might
> break something else. Martin?)
> 

Compared with other languages, such as wstring in C++ or a Ruby string, I find
this behaviour really strange. operator[] is expected to return the element at
the given position. For a lot of programmers who don't have YCP as their first
language, a string is an array of characters (not bytes).
So I really vote for changing this behaviour.
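
To make the mismatch concrete, here is a minimal sketch in Python (not YCP), under the assumption that the YCP string is stored as UTF-8 and that [] indexes the raw bytes, as described in the bug report. A Python str stands in for a "string of characters" and the encoded bytes object stands in for what YCP's [] appears to index. The variable names are only illustrative.

```python
# Illustration only: Python, not YCP. "\u00e1" is the precomposed "á".
text = "\u00e1a"                 # the string "áa"
data = text.encode("utf-8")      # its UTF-8 bytes: b'\xc3\xa1a'

char_count = len(text)           # counts characters, like size(string)
byte_count = len(data)           # counts bytes

second_char = text[1]            # character index 1 is "a"
second_byte = data[1:2]          # byte index 1 is the second half of "á"

# Removing that single byte leaves a sequence that is no longer valid UTF-8.
broken = data[:1] + data[2:]
try:
    broken.decode("utf-8")
    is_valid = True
except UnicodeDecodeError:
    is_valid = False

print(char_count, byte_count, second_char, second_byte, is_valid)
```

The character count is 2 while the byte count is 3, so any loop that runs an index from 0 to size(string) - 1 and feeds it to a byte-based [] will walk through the wrong positions as soon as a multi-byte character appears.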
Josef

> 
> --
> 
> Ladislav Slezák
> Appliance department / YaST Developer
> Lihovarská 1060/12
> 190 00 Prague 9 / Czech Republic
> tel: +420 284 028 960
> [email protected]
> SUSE
