On Thu, 5 Apr 2012 10:56:34 +0200
Arvin Schnell <[email protected]> wrote:

> On Tue, Apr 03, 2012 at 10:06:18AM +0200, Ladislav Slezak wrote:
> > On 3.4.2012 09:37, Johannes Meixner wrote:
> > 
> > >Could a YCP expert show the correct way (example code)
> > >to remove a UTF-8 substring from a UTF-8 string?
> > >I.e. how to remove "Bar" from "FooBarBaz" in a UTF-8-safe way?
> > >
> > >More generally:
> > >Is there documentation on how to work with UTF-8 strings in YCP
> > >in general?
> > 
> > Um, it seems that the documentation does not mention the []
> > behavior for strings (neither
> > http://doc.opensuse.org/projects/YaST/openSUSE11.3/tdg/bracket.html
> > nor
> > http://doc.opensuse.org/projects/YaST/openSUSE11.3/tdg/id_ycp_data_string.html).
> > 
> > I fixed that particular bug by using regexpsub() instead of
> > iterating over the string
> > and using [] operator. So I guess the regexp*() functions are UTF-8
> > safe.
> > 
> > I'm not sure what the correct solution is. Maybe the correct way is
> > to fix the [] operator after all... That's why I have opened this
> > discussion, because maybe someone is relying on the current "buggy"
> > behavior... (I don't expect that, but I'd like to avoid regressions
> > if possible.)
> 
> Unfortunately this can be the case since most other string
> functions also do not respect UTF-8. E.g. splitting a string at a
> space by using search and substring works correctly with
> substring since search is also byte-oriented.
> 
> Program:
> 
>     string s = "schöner Würfel";
>     integer i = search(s, " ");
> 
>     y2milestone("substring '%1' '%2'", substring(s, 0, i), substring(s, i + 1));
>     y2milestone("lsubstring '%1' '%2'", lsubstring(s, 0, i), lsubstring(s, i + 1));
> 
> Output:
> 
>     test1.ycp:6 substring 'schöner' 'Würfel'
>     test1.ycp:7 lsubstring 'schöner ' 'ürfel'
> 
> So if substring is fixed, all the other functions must be fixed as well.
> 

I still think that the main problem is that we want Unicode strings in
YaST, but we use string in the backend. I think that now is the right
time to try to switch the string implementation to wstring and to be
ready for such a change. I can create a set of patches if someone is
interested in it and tests it.

Josef

> Regards,
>   Arvin
> 

