Thanks. I didn't know about char_len. `unicode_str.as_slice().char_len()` gives the number of code points.
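For anyone trying this on a newer toolchain, here is a minimal sketch of the byte count versus code point count difference. It assumes current stable Rust, where `char_len()` is no longer available and `chars().count()` plays the same role. The helper names `byte_len` and `code_point_len` are made up for illustration and are not part of the standard library.

```rust
// Assumption: current stable Rust. The 2014 `char_len()` method is gone;
// `chars().count()` is used here as its equivalent.

/// Number of UTF-8 bytes. O(1): the length is stored alongside the string data.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of Unicode code points. O(n): the whole string has to be decoded.
fn code_point_len(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let unicode_str = "ಅ"; // U+0C85: three UTF-8 bytes, one code point
    let ascii_str = "a"; // one byte, one code point
    let combining = "a\u{301}o"; // 'a' + combining acute accent + 'o'

    for s in [unicode_str, ascii_str, combining] {
        println!(
            "{:?}: {} bytes, {} code points",
            s,
            byte_len(s),
            code_point_len(s)
        );
    }
}
```

Neither count is a grapheme cluster count. As far as I know that still needs something outside the standard library.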
Sorry for the confusion, I was referring to a code point as a character in my mail. char_len gives the correct output for my requirement. I have written a JavaScript script to convert from string length to grapheme cluster length for the Kannada language.

On Wed, May 28, 2014 at 3:01 PM, Simon Sapin <[email protected]> wrote:

> On 28/05/2014 10:10, Aravinda VK wrote:
>
>> Hi,
>>
>> How to find number of characters in a string?
>>
>> Following example returns byte count instead of number of characters.
>>
>> use std::string::String;
>>
>> fn main() {
>>     let unicode_str = String::from_str("ಅ");
>>     let ascii_str = String::from_str("a");
>>     println!("unicode str: {}, ascii str: {}", unicode_str.len(),
>>              ascii_str.len());
>> }
>>
>
> It depends on what you call a "character". As you noted, the .len() method
> returns the number of UTF-8 bytes. Since strings are represented as UTF-8
> internally, .len() takes O(1) time.
>
> There is also the .char_len() method, which counts the number of Unicode
> code points in O(n) time.
>
> http://static.rust-lang.org/doc/master/std/str/trait.StrSlice.html#tymethod.char_len
>
> However, what users perceive as a single "character" may be more than a
> single code point. These are sometimes called "grapheme clusters". For
> example, "áo" (which renders incorrectly in my email client…) is two
> grapheme clusters, but is made of three code points U+0061, U+0301, and
> U+006F.
>
> Rust’s standard libraries do not currently have a method for counting
> grapheme clusters, as far as I can tell. However, except for very specific
> cases (such as handling text selection in an editor), you generally don’t
> need to deal with grapheme clusters. Twitter also has a very specific idea
> of what "140 characters" means:
>
> https://dev.twitter.com/docs/counting-characters
>
> --
> Simon Sapin
>

--
Regards
Aravinda | ಅರವಿಂದ
http://aravindavk.in
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
