On 2014-05-29 05:36, Kevin Ballard wrote:
[--snip--]
> And when dealing with a sequence in a precise encoding, the natural unit to
> work with is the code unit (and this has precedence in other languages,
> such as JavaScript, Obj-C, and Go).
>
JavaScript:
$ node
> var s = "hï"; // Note the accent
undefined
> s.length;
2
Rust:
$ cat hello.rs
fn main() {
    let l = "hï".len(); // Note the accent
    println!("{:u}", l);
}
$ rustc hello.rs
$ ./hello
3
No matter how defective the notion of "length" may be, personally I
think that people will expect the former, but will be very surprised by
the latter. There are certainly cases where the JavaScript version is
wrong, but I conjecture that it "works" for the vast majority of cases
that people and programs are likely to encounter.
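
For anyone who wants to see where the 2 and the 3 come from, here is a rough sketch that counts the same string three ways. The helper name `counts` is mine, and I'm assuming the "ï" is the precomposed U+00EF (written as an escape so the editor can't change it). The G clef is just one example I picked of a case where the UTF-16 count stops matching what a person would call the length.

```rust
/// Returns (UTF-8 bytes, Unicode scalar values, UTF-16 code units) for `s`.
/// `counts` is a hypothetical helper for this sketch, not a library function.
fn counts(s: &str) -> (usize, usize, usize) {
    (s.len(), s.chars().count(), s.encode_utf16().count())
}

fn main() {
    let samples = [
        "hi",        // plain ASCII: all three counts are 2
        "h\u{ef}",   // "hï" with precomposed U+00EF: 3 bytes, 2 scalars, 2 UTF-16 units
        "\u{1D11E}", // G clef, outside the BMP: 4 bytes, 1 scalar, 2 UTF-16 units
    ];
    for s in samples.iter() {
        let (bytes, scalars, utf16) = counts(s);
        println!(
            "{:?}: {} bytes, {} scalar values, {} UTF-16 code units",
            s, bytes, scalars, utf16
        );
    }
}
```

The UTF-16 column is what JavaScript's `length` reports, and the byte column is what `len()` reports in Rust. None of the three counts user-perceived characters in general.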
IMO expecting people to read docs is a poor substitute for being
explicit in a method name about what the method does, especially when it
costs only 5 characters. The Principle of Least Astonishment and all that.
As a rule people don't read docs until they've encountered a "bug" in
their expectations vs. what the language/library actually does -- at
which point they're already annoyed and don't need to be further annoyed
by the realization that "it does something completely non-intuitive"
(from their perspective).
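
To make the naming point concrete, here is a minimal sketch of what an explicit spelling could look like. `ExplicitLen`, `byte_len` and `char_count` are names I made up for illustration; nothing like them exists in the standard library as far as this sketch is concerned, and this is not a proposal for the exact API.

```rust
/// Hypothetical extension trait, for illustration only.
trait ExplicitLen {
    /// Number of UTF-8 bytes (what `len()` returns on a string slice).
    fn byte_len(&self) -> usize;
    /// Number of Unicode scalar values.
    fn char_count(&self) -> usize;
}

impl ExplicitLen for str {
    fn byte_len(&self) -> usize {
        self.len()
    }

    fn char_count(&self) -> usize {
        self.chars().count()
    }
}

fn main() {
    let s = "h\u{ef}"; // "hï" with precomposed U+00EF (assumed)
    // The unit being counted is visible at the call site.
    println!("byte_len = {}, char_count = {}", s.byte_len(), s.char_count());
}
```

With names like these, the reader of `s.byte_len()` doesn't have to open the docs to learn which unit is being counted.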
Thankfully the programming world has become more aware of i18n issues,
but for people who still predominantly use ASCII, such bugs may lie
dormant for a long time before anyone discovers them.
Just my €0.02.
Regards,
_______________________________________________
Rust-dev mailing list
https://mail.mozilla.org/listinfo/rust-dev