On Thu, 5 Apr 2012 10:56:34 +0200 Arvin Schnell <[email protected]> wrote:
> On Tue, Apr 03, 2012 at 10:06:18AM +0200, Ladislav Slezak wrote:
> > On 3.4.2012 09:37, Johannes Meixner wrote:
> >
> > >Could a YCP expert show the correct way (example code)
> > >how to remove an UTF8 sub-string from an UTF8 string?
> > >I.e. how to remove "Bar" from "FooBarBaz" in an UTF8-safe way?
> > >
> > >More generally:
> > >Is there documentation how to work on UTF8 strings with YCP in
> > >general?
> >
> > Um, it seems that the documentation does not mention the []
> > behavior for string
> > (http://doc.opensuse.org/projects/YaST/openSUSE11.3/tdg/bracket.html
> > nor
> > http://doc.opensuse.org/projects/YaST/openSUSE11.3/tdg/id_ycp_data_string.html).
> >
> > I fixed that particular bug by using regexpsub() instead of
> > iterating over the string and using [] operator. So I guess the
> > regexp*() functions are UTF-8 safe.
> >
> > I'm not sure what the correct solution is. Maybe the correct way is
> > to fix the [] operator after all... That's why I have opened this
> > discussion, because maybe someone is relying on the current "buggy"
> > behavior... (I don't expect that but I'd like to avoid regressions
> > if possible.)
>
> Unfortunately this can be the case since most other string
> functions also do not respect UTF-8. E.g. splitting a string at a
> space by using search and substring works correctly with
> substring since search is also byte-oriented.
>
> Program:
>
>   string s = "schöner Würfel";
>   integer i = search(s, " ");
>
>   y2milestone("substring '%1' '%2'", substring(s, 0, i),
>               substring(s, i + 1));
>   y2milestone("lsubstring '%1' '%2'", lsubstring(s, 0, i),
>               lsubstring(s, i + 1));
>
> Output:
>
>   test1.ycp:6 substring 'schöner' 'Würfel'
>   test1.ycp:7 lsubstring 'schöner ' 'ürfel'
>
> So, if substring is fixed all other functions must also.
>

I still think the main problem is that we want Unicode strings in YaST, but we use plain string in the backend.
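
To illustrate the mismatch, here is a minimal C++ sketch. It is not YaST code: utf8_to_wide, byte_offset and char_offset are hypothetical helpers made up for this mail, and the decoder assumes well-formed UTF-8 and a wchar_t wide enough for the code points involved (as on Linux).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of YaST: decode UTF-8 into a wide string.
// Assumes well-formed input; invalid sequences are not detected.
std::wstring utf8_to_wide(const std::string& in) {
    std::wstring out;
    for (std::size_t i = 0; i < in.size();) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        std::size_t len = 1;      // number of bytes in this sequence
        unsigned long cp = c;     // code point being assembled
        if (c >= 0xF0)      { len = 4; cp = c & 0x07; }
        else if (c >= 0xE0) { len = 3; cp = c & 0x0F; }
        else if (c >= 0xC0) { len = 2; cp = c & 0x1F; }
        // Fold in the 6 payload bits of each continuation byte.
        for (std::size_t k = 1; k < len && i + k < in.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        out.push_back(static_cast<wchar_t>(cp));
        i += len;
    }
    return out;
}

// Position of the first space counted in bytes (what std::string gives us).
std::size_t byte_offset(const std::string& s) {
    return s.find(' ');
}

// Position of the first space counted in characters (after decoding).
std::size_t char_offset(const std::string& s) {
    return utf8_to_wide(s).find(L' ');
}
```

For the string from Arvin's example, byte_offset() and char_offset() differ by one because "ö" takes two bytes in UTF-8. That is the same offset mix-up as feeding the result of search() to lsubstring().
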
I think now is the right time to try switching the string implementation to wstring and to be ready for such a change. I can create a set of patches and test them if someone is interested.

Josef

> Regards,
> Arvin
>
