On Fri, 30 Mar 2012 17:01:04 +0200 Ladislav Slezak <[email protected]> wrote:
>
> Hi all,
>
> This is just a note for you in case you came across a strange UTF-8 string
> problem in YCP (maybe you already know that but for me it was quite a surprise):
>
> YCP string operator [] takes _byte_ index in the string, while size(string)
> returns _number_of_characters_. The problem is when you combine both functions,
> the result will be probably buggy, see
> https://bugzilla.novell.com/show_bug.cgi?id=728588
>
> Example from the bug:
>
> size("áa") => 2,
>
> but
>
> "áa"[1] is not "a" as expected but the second _byte_ of the string which is
> one half of the "á" UTF-8 character, if you remove it you'll get garbage in the
> string...
>
>
> Keep this in your mind when iterating over YCP strings...
>
> (I'm not sure whether fixing YCPString::[] would be a good idea, it might
> break something else. Martin?)
>

Compared with other languages, such as wstring in C++ or strings in Ruby, I
find this behaviour really strange. operator[] is expected to return the
element at the given position. For a lot of programmers who don't have YCP as
their first language, a string is an array of characters (not bytes).

So I really vote for changing this behaviour.

Josef

>
> --
>
> Ladislav Slezák
> Appliance department / YaST Developer
> Lihovarská 1060/12
> 190 00 Prague 9 / Czech Republic
> tel: +420 284 028 960
> [email protected]
> SUSE
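
To make the mismatch between size() and operator [] concrete, here is a small sketch in Python rather than YCP (the choice of Python and all variable names are assumptions for illustration, this is not YCP code). It takes the string from the bug report, encodes it as UTF-8, and compares indexing by character with indexing by byte:

```python
# Illustration only: Python stands in for YCP here.
# "\u00e1" is the precomposed character "á"; in UTF-8 it takes two bytes (C3 A1).
s = "\u00e1a"
data = s.encode("utf-8")

char_count = len(s)       # number of characters, like size(string)
byte_count = len(data)    # number of bytes in the UTF-8 encoding

second_char = s[1]        # character index 1: the letter "a"
second_byte = data[1:2]   # byte index 1: the second half of "á"

# Removing the byte at index 1 leaves a lone lead byte followed by "a",
# which is no longer valid UTF-8.
broken = data[:1] + data[2:]
try:
    broken.decode("utf-8")
    is_valid = True
except UnicodeDecodeError:
    is_valid = False

print(char_count, byte_count, second_char, second_byte, is_valid)
```

A loop that runs an index from 0 to the character count but indexes by byte would stop one byte short on this string and would split the "á" along the way, which matches the kind of breakage described in the bug report.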
