Thanks. I didn't know about char_len. `unicode_str.as_slice().char_len()`
gives the number of code points.

Sorry for the confusion; I was using "character" to mean code point in my
mail. char_len gives the correct output for my requirement. I have written
a JavaScript script to convert from string length to grapheme cluster length
for the Kannada language.
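
For reference, here is a minimal sketch of the difference between the two
counts. It assumes a current Rust toolchain, where char_len is no longer
available and chars().count() is the equivalent; the helper names byte_len
and code_point_len are only for illustration and are not part of any library.

```rust
/// Number of UTF-8 bytes. O(1), since the length is stored with the string.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of Unicode code points. O(n), since the UTF-8 data must be scanned.
/// (Assumed modern equivalent of the char_len method discussed in this thread.)
fn code_point_len(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let kannada = "ಅ"; // U+0C85, encoded as three UTF-8 bytes
    let ascii = "a"; // U+0061, encoded as one UTF-8 byte
    let combining = "a\u{0301}o"; // 'a' + combining acute accent + 'o'

    for s in [kannada, ascii, combining] {
        println!(
            "{:?}: {} bytes, {} code points",
            s,
            byte_len(s),
            code_point_len(s)
        );
    }
}
```

Neither number is a grapheme cluster count: the last string is two grapheme
clusters but three code points, so that case still needs separate handling.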


On Wed, May 28, 2014 at 3:01 PM, Simon Sapin <[email protected]> wrote:

> On 28/05/2014 10:10, Aravinda VK wrote:
>
>> Hi,
>>
>> How to find number of characters in a string?
>>
>> Following example returns byte count instead of number of characters.
>>
>>      use std::string::String;
>>
>>      fn main() {
>>          let unicode_str = String::from_str("ಅ");
>>          let ascii_str = String::from_str("a");
>>          println!("unicode str: {}, ascii str: {}",
>>                   unicode_str.len(), ascii_str.len());
>>      }
>>
>
> It depends on what you call a "character". As you noted, the .len() method
> returns the number of UTF-8 bytes. Since strings are represented as UTF-8
> internally, .len() takes O(1) time.
>
> There is also the .char_len() method, which counts the number of Unicode
> code points in O(n) time.
>
> http://static.rust-lang.org/doc/master/std/str/trait.StrSlice.html#tymethod.char_len
>
> However, what users perceive as a single "character" may be more than a
> single code point. These are sometimes "grapheme clusters". For example,
> "áo" (which renders incorrectly in my email client…) is two grapheme
> clusters, but is made of three code points: U+0061, U+0301, and U+006F.
>
> Rust’s standard libraries do not currently have a method for counting
> grapheme clusters, as far as I can tell. However, except for very specific
> cases (such as handling text selection in an editor), you generally don’t
> need to deal with grapheme clusters. Twitter also has a very specific idea
> of what "140 characters" means:
>
> https://dev.twitter.com/docs/counting-characters
>
> --
> Simon Sapin
>



-- 
Regards
Aravinda | ಅರವಿಂದ
http://aravindavk.in
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
