On Wed, Sep 4, 2013 at 4:58 PM, Brendan Eich <[email protected]> wrote:
> String.fromCodePoint, rather.
Oops. Any reason this is not just String.from(), btw? Give the better method a nice short name?

>> I'm not sure I'm a big fan of having all three concepts around.
>
> You can't avoid it: UTF-8 is a transfer format that can be observed via
> serialization.

Yes, but it cannot encode lone surrogates. It can only deal in Unicode scalar values.

> String.prototype.charCodeAt and String.fromCharCode are
> required for backward compatibility. And ES6 wants to expose code points as
> well, so three.

Unicode scalar values are code points sans surrogates, i.e. completely compatible with what a UTF-8 encoder/decoder pair can handle. Why do you want to expose surrogates?

> Sorry, I missed this: how else (other than the charCodeAt/fromCharCode
> legacy) are lone surrogates exposed?

"\udfff".codePointAt(0) == 0xDFFF

It seems better if that returned 0xFFFD (U+FFFD), as you'd get with UTF-8 (assuming the encoder accepts code points as input rather than just Unicode scalar values, in which case it'd throw).

The indexing of codePointAt() is also kind of sad: the index counts code units, just as it does for charCodeAt(), which means for any serious usage you need to use the iterator anyway. What's the reason codePointAt() exists?

--
http://annevankesteren.nl/
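
For reference, a minimal sketch of the behaviour discussed above, assuming an engine that implements the ES6 string additions (codePointAt(), the string iterator, and "\u{...}" escapes). The helper scalarValueAt() is hypothetical and only illustrates the "return U+FFFD for surrogates" alternative; it is not a proposed or existing API.

```javascript
// A lone trail surrogate: a code point, but not a Unicode scalar value.
const lone = "\uDFFF";

// codePointAt() returns a number and passes the lone surrogate through.
const cp = lone.codePointAt(0); // 0xDFFF

// The index counts UTF-16 code units, so index 1 lands in the middle of a pair.
const astral = "\u{1F600}"; // stored as the pair D83D DE00
const first = astral.codePointAt(0); // 0x1F600
const second = astral.codePointAt(1); // 0xDE00, the trail surrogate on its own

// The string iterator walks by code point, which is why it is needed
// for anything beyond index 0.
const points = Array.from("a\u{1F600}b", (ch) => ch.codePointAt(0));

// Hypothetical helper (an assumption, not part of any spec): expose only
// Unicode scalar values by mapping any surrogate to U+FFFD.
function scalarValueAt(str, index) {
  const c = str.codePointAt(index);
  if (c === undefined) return undefined; // index out of range
  return c >= 0xD800 && c <= 0xDFFF ? 0xFFFD : c;
}
```

With this sketch, scalarValueAt(lone, 0) gives 0xFFFD where codePointAt() gives 0xDFFF. Throwing instead of substituting would be the other option mentioned above.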

