Except that in C++, std::basic_string::size and std::basic_string::length are synonymous (both return the number of CharT elements, which for std::string is also the number of bytes).
Thus I am unsure whether this would end up helping C++ developers. It might help others, though.

On Fri, May 30, 2014 at 2:12 PM, Nathan Myers <[email protected]> wrote:

> A good name would be size(). That would avoid any confusion over various
> length definitions, and just indicate how much address space it occupies.
>
> Nathan Myers
>
>
> On May 29, 2014 8:11:47 PM Palmer Cox <[email protected]> wrote:
>
>> Thinking about it more, units() is a bad name. I think a renaming could
>> make sense, but only if something better than len() can be found.
>>
>> -Palmer Cox
>>
>>
>> On Thu, May 29, 2014 at 10:55 PM, Palmer Cox <[email protected]> wrote:
>>
>> > What about renaming len() to units()?
>> >
>> > I don't see len() as a problem, but maybe as a potential source of
>> > confusion. I also strongly believe that no one reads documentation if
>> > they *think* they understand what the code is doing. Different people
>> > will see len(), assume that it does whatever they want to do at the
>> > moment, and for a significant portion of strings that they encounter
>> > it will seem like their interpretation, whatever it is, is correct.
>> > So, why not rename len() to something like units()? It's more explicit
>> > about the value that it's actually producing than len() and it's not
>> > all that much longer to type. As stated, exactly what a string is
>> > varies greatly between languages, so I don't think that lacking a
>> > function named len() is bad. Granted, I would expect that many people
>> > expect that a string will have a method named len() (or length()), and
>> > when they don't find one, they will go to the documentation and find
>> > units(). I think this is a good thing since the documentation can then
>> > explain exactly what it does.
>> >
>> > I much prefer len() to byte_len(), though. byte_len() seems like a bit
>> > much to type and it seems like all the other methods on strings should
>> > then be renamed with the byte_ prefix, which seems unpleasant.
>> >
>> > -Palmer Cox
>> >
>> >
>> > On Thu, May 29, 2014 at 3:39 AM, Masklinn <[email protected]> wrote:
>> >
>> >>
>> >> On 2014-05-29, at 08:37 , Aravinda VK <[email protected]> wrote:
>> >>
>> >> > I think returning the length of a string in bytes is just fine. Not
>> >> > knowing about the availability of char_len in Rust caused this
>> >> > confusion.
>> >> >
>> >> > Python 2.7 returns the length of a string in bytes; Python 3
>> >> > returns the number of codepoints.
>> >>
>> >> Nope, depends on the string type *and* on compilation options.
>> >>
>> >> * Python 2's `str` and Python 3's `bytes` are byte sequences, their
>> >>   len() returns their byte counts.
>> >> * Python 2's `unicode` and Python 3's `str` before 3.3 return a code
>> >>   unit count which may be UCS2 or UCS4 (depending on whether the
>> >>   interpreter was compiled with `--enable-unicode=ucs2`, the default,
>> >>   or `--enable-unicode=ucs4`). Only the latter case is a true code
>> >>   point count.
>> >> * Python 3.3's `str` switched to the Flexible String Representation,
>> >>   the build-time option disappeared and len() always returns the
>> >>   number of codepoints.
>> >>
>> >> Note that in no case do len() operations take normalisation or visual
>> >> composition into account.
>> >>
>> >> > JS returns number of codepoints.
>> >>
>> >> JS returns the number of UCS2 code units, which is twice the number of
>> >> code points for those in astral planes.
>> >> _______________________________________________
>> >> Rust-dev mailing list
>> >> [email protected]
>> >> https://mail.mozilla.org/listinfo/rust-dev
>> >>
>> >
>> >
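
To make the distinction discussed in this thread concrete, here is a minimal sketch in present-day Rust. It is only an illustration under assumptions: `string_lengths` is a hypothetical helper name, and it uses `chars().count()` and `encode_utf16()` from the current standard library rather than the 2014-era `char_len` mentioned above. It reports the three counts people tend to mean by "length": UTF-8 bytes (what len() returns), code points, and UTF-16 code units (roughly what a JS-style length counts).

```rust
/// Hypothetical helper: returns (UTF-8 bytes, code points, UTF-16 code units).
fn string_lengths(s: &str) -> (usize, usize, usize) {
    let bytes = s.len(); // len() counts UTF-8 bytes
    let code_points = s.chars().count(); // Unicode scalar values
    let utf16_units = s.encode_utf16().count(); // UTF-16 code units
    (bytes, code_points, utf16_units)
}

fn main() {
    let samples = [
        "a",         // ASCII: all three counts agree
        "\u{e9}",    // U+00E9, precomposed e-acute: 2 bytes in UTF-8
        "e\u{301}",  // "e" plus combining acute accent: two code points
        "\u{1D11E}", // astral-plane character: a surrogate pair in UTF-16
    ];
    for s in samples.iter() {
        let (bytes, code_points, utf16_units) = string_lengths(s);
        println!(
            "{:?}: bytes={} code_points={} utf16_units={}",
            s, bytes, code_points, utf16_units
        );
    }
}
```

None of the three counts treats the precomposed and the combining form of the accented letter as the same length, which is the normalisation point made above: no single len()-style number matches what a reader sees as "characters".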
