Re: [yast-devel] YCP String operator [] and UTF-8

Arvin Schnell Thu, 05 Apr 2012 01:56:45 -0700

On Tue, Apr 03, 2012 at 10:06:18AM +0200, Ladislav Slezak wrote:
> On 3.4.2012 09:37, Johannes Meixner wrote:
> 
> >Could a YCP expert show the correct way (example code)
> >to remove a UTF-8 substring from a UTF-8 string?
> >I.e. how to remove "Bar" from "FooBarBaz" in a UTF-8-safe way?
> >
> >More generally:
> >Is there documentation on how to work with UTF-8 strings in YCP?
> 
> Um, it seems that the documentation does not mention the [] behavior
> for strings (neither
> http://doc.opensuse.org/projects/YaST/openSUSE11.3/tdg/bracket.html nor
> http://doc.opensuse.org/projects/YaST/openSUSE11.3/tdg/id_ycp_data_string.html).
> 
> I fixed that particular bug by using regexpsub() instead of iterating
> over the string and using the [] operator. So I guess the regexp*()
> functions are UTF-8 safe.
> 
> I'm not sure what the correct solution is. Maybe the correct way is to
> fix the [] operator after all... That's why I have opened this
> discussion, because maybe someone is relying on the current "buggy"
> behavior... (I don't expect that, but I'd like to avoid regressions if
> possible.)


Unfortunately this can be the case, since most other string
functions also do not respect UTF-8. E.g. splitting a string at a
space by using search and substring works correctly only because
search is also byte-oriented: both functions count bytes, so the
offset returned by search fits substring but not lsubstring.

Program:

    string s = "schöner Würfel";
    integer i = search(s, " ");

    y2milestone("substring '%1' '%2'", substring(s, 0, i), substring(s, i + 1));
    y2milestone("lsubstring '%1' '%2'", lsubstring(s, 0, i), lsubstring(s, i + 1));

Output:

    test1.ycp:6 substring 'schöner' 'Würfel'
    test1.ycp:7 lsubstring 'schöner ' 'ürfel'

So if substring is fixed, all the other string functions must be fixed
as well.
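
To make the mismatch concrete outside YCP, here is a small Python
sketch. Python is used only as an illustration; the variable names are
mine and nothing in it is YCP API. It assumes that search() and
substring() count bytes of the UTF-8 encoding while lsubstring() counts
characters, which is what the output above suggests.

```python
# Illustration in Python (not YCP): byte offsets vs. character offsets.
s = "schöner Würfel"
raw = s.encode("utf-8")  # the UTF-8 bytes; "ö" and "ü" take two bytes each

# Byte-oriented search (assumed to mirror YCP's search()).
byte_pos = raw.find(b" ")
# Character-oriented search, for comparison.
char_pos = s.find(" ")

# Consistent pairing: byte offset used with byte slicing
# (assumed to mirror search() + substring()).
bytes_split = (raw[:byte_pos].decode("utf-8"),
               raw[byte_pos + 1:].decode("utf-8"))

# Inconsistent pairing: byte offset used with character slicing
# (assumed to mirror search() + lsubstring()).
mixed_split = (s[:byte_pos], s[byte_pos + 1:])

print(byte_pos, char_pos)
print(bytes_split)
print(mixed_split)
```

The byte offset is one larger than the character offset because of the
two-byte "ö", so the mixed pairing keeps the space in the first part
and drops the "W" from the second, just like the lsubstring line in the
log.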

Regards,
  Arvin

-- 
Arvin Schnell, <[email protected]>
Senior Software Engineer, Research & Development
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
HRB 16746 (AG Nürnberg)
Maxfeldstraße 5
90409 Nürnberg
Germany
