On May 28, 2014, at 6:00 PM, Benjamin Striegel <[email protected]> wrote:
> To reiterate, it simply doesn't make sense to ask what the length of a string
> is. You may as well ask what color the string is, or where the string went to
> high school, or how many times the string rode the roller coaster that one
> year on the first day of summer vacation when the string's parents took the
> string to that amusement park and the weather said that it was going to rain
> so there were almost no crowds that day but then it didn't rain and all the
> rides were open with absolutely no lines whatsoever.

As amusing as this imagery is, you're still arguing from a faulty premise, which is that the concept of a "string" has not been well-defined. The nebulous "string", as it applies to the general category of programming languages, does indeed not have a well-defined length. But Rust's strings (both String and str) are very explicitly defined as a UTF-8 encoded sequence. And when dealing with a sequence in a precise encoding, the natural unit to work with is the code unit (and this has precedent in other languages, such as JavaScript, Obj-C, and Go).

---

My interpretation of your arguments is that your real objection is that you think that calling it len() will mean people won't even think about the fact that there's a difference between byte length and character length, because they'll be too used to working with ASCII data, and that they'll write code that breaks when forced to confront the difference. This is true regardless of how len() is defined (whether it's in bytes, in UTF-16 code units, in Unicode scalar values, etc).

My assertion is that calling the method .byte_len() will not force anyone to deal with non-ASCII data if they don't want to; it will only annoy everyone by being overly verbose, even more so when you rename .slice() to .byte_slice(), etc.

I also believe that renaming .slice() to .byte_slice() is unambiguously wrong, as the name implies that it returns &[u8] when it doesn't. And similarly, that renaming just .len() to .byte_len() without renaming .slice() to .byte_slice() is also wrong. This means you cannot rename .len() to .byte_len() without introducing unambiguously wrong naming elsewhere.

---

Does this accurately represent your argument? And do you have any rebuttal to my argument that hasn't already been said? If the answers are "yes" and "no" respectively, then I agree, we will have to simply live with being in disagreement.

> Oh and while we're belligerently bikeshedding, we should rename `to_str` to
> `to_string` once we rename `StrBuf` to `String`. :)

We've already renamed StrBuf to String, but I agree that .to_str() makes more sense as .to_string(). I was assuming that would eventually get renamed, although I just realized that it would then conflict with StrAllocating's .to_string() method, which is rather unfortunate.

-Kevin
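
The distinction argued over in this message can be seen in a minimal sketch, assuming present-day Rust rather than the 2014 API: the `.slice()` method under discussion corresponds roughly to today's range indexing and `str::get`, and the helper names `byte_len`, `scalar_len`, `utf16_len`, and `byte_range` are hypothetical, introduced only for illustration.

```rust
// Length in UTF-8 code units (bytes); this is what str::len() reports.
fn byte_len(s: &str) -> usize {
    s.len()
}

// Length in Unicode scalar values.
fn scalar_len(s: &str) -> usize {
    s.chars().count()
}

// Length in UTF-16 code units, for comparison with other encodings.
fn utf16_len(s: &str) -> usize {
    s.encode_utf16().count()
}

// Slicing takes byte offsets but yields &str, not &[u8].
// Returns None if an offset is out of range or falls inside a
// multi-byte sequence.
fn byte_range(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    let s = "h\u{e9}llo"; // U+00E9 takes two bytes in UTF-8
    println!("bytes: {}", byte_len(s));
    println!("scalar values: {}", scalar_len(s));
    println!("UTF-16 code units: {}", utf16_len(s));
    println!("bytes 0..3: {:?}", byte_range(s, 0, 3));
    println!("bytes 0..2: {:?}", byte_range(s, 0, 2));

    // The raw bytes are available only through an explicit conversion.
    let raw: &[u8] = s.as_bytes();
    println!("raw byte count: {}", raw.len());
}
```

Here the three notions of length agree only for ASCII input, and a byte-offset slice is still a string slice, which is the naming point made above about `.byte_slice()`.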
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
