Re: [yast-devel] Re: YCP substring() Was: YCP String operator [] and UTF-8

Arvin Schnell Tue, 03 Apr 2012 02:58:23 -0700

On Tue, Apr 03, 2012 at 11:33:09AM +0200, Klaus Kaempf wrote:
> * Ladislav Slezak <lsle...@suse.cz> [Apr 03. 2012 11:10]:


> > I used substring() to get one character. So the problematic call is actually:
> > 
> >   substring("áa", 1, 1);
> > 
> > which returns "\0xF1" instead of "a" as I expected.
> > 
> > The documentation does not tell whether the substring() argument units
> > are in bytes or characters.
> > http://doc.opensuse.org/projects/YaST/openSUSE11.3/tdg/substring-rest.html
> > 
> > So any opinions on changing this call? Is the UTF-8 assumption also
> > valid here?
> 
> Yes. sub_string_ is operating on strings and strings are defined to be
> UTF-8 encoded.

Generally I agree that strings in YCP are UTF-8 encoded and
functions should respect this.
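
To make the byte-versus-character distinction concrete, here is a minimal
C++ sketch of a substring() whose position and length count UTF-8 code
points rather than bytes. It is an illustration only, not the YCP
implementation: the names next_code_point and utf8_substring are
hypothetical, and the input is assumed to be valid UTF-8 (no validation).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given the byte index of the start of a code point,
// return the byte index of the start of the next one (or s.size()).
// Continuation bytes in UTF-8 have the bit pattern 10xxxxxx.
static std::size_t next_code_point(const std::string& s, std::size_t i) {
    ++i;  // step over the lead byte
    while (i < s.size() &&
           (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) {
        ++i;  // step over continuation bytes
    }
    return i;
}

// Hypothetical character-based substring: `pos` and `len` are measured in
// code points. Assumes `s` is valid UTF-8.
std::string utf8_substring(const std::string& s,
                           std::size_t pos, std::size_t len) {
    std::size_t i = 0;
    // Skip the first `pos` code points.
    for (std::size_t k = 0; k < pos && i < s.size(); ++k) {
        i = next_code_point(s, i);
    }
    const std::size_t begin = i;
    // Take up to `len` code points.
    for (std::size_t k = 0; k < len && i < s.size(); ++k) {
        i = next_code_point(s, i);
    }
    return s.substr(begin, i - begin);
}
```

With "áa" stored as the bytes C3 A1 61, utf8_substring(s, 1, 1) yields
"a", whereas the byte-based s.substr(1, 1) yields a lone continuation
byte. The scan avoids a full conversion but is still linear in the
number of bytes skipped.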

But simply fixing the functions might require converting from
UTF-8 to wstring and back in every function, and that sounds very
costly. E.g. the size function in YCP converts the string to
wstring. When I noticed that and saw how many times
size(string) == 0 is used, I added an isempty function in YCP.
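
The cost difference can be sketched in C++ as follows. This is an
illustration under assumptions, not the YCP source: utf8_length and
utf8_is_empty are hypothetical stand-ins for size() and isempty(), and
the counting loop replaces the wstring conversion with a plain scan,
which is still linear in the string length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a character-counting size(): counts UTF-8 code
// points by counting every byte that is not a continuation byte (10xxxxxx).
// Assumes valid UTF-8. The whole string is visited, so the cost is O(n).
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if ((b & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Hypothetical stand-in for isempty(): a string has zero characters exactly
// when it has zero bytes, so no scan or conversion is needed.
bool utf8_is_empty(const std::string& s) {
    return s.empty();
}
```

A check such as utf8_length(s) == 0 gives the same answer as
utf8_is_empty(s), but the latter does constant work regardless of the
string's length.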

It could be that using wstring internally in YCPString is the better
solution.

Regards,
  Arvin

-- 
Arvin Schnell, <aschn...@suse.de>
Senior Software Engineer, Research & Development
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)
Maxfeldstraße 5
90409 Nürnberg
Germany
