On Thu, Sep 12, 2013 at 6:42 PM, Brendan Eich <[email protected]> wrote:
> Iterators forward and (if needed backward) over Unicode characters (scalar
> values; I'm allowed to call those "characters", no?) would be good. Github
> beats TC39 as usual, prollyfill FTW.
No, there are noncharacters that are Unicode scalar values and can
(therefore) be expressed using UTF-8, such as U+FFFF.
This should do what you asked for, although it's late. It's not an
iterator, as those don't really work in browsers yet, but it should be
easy enough to convert:
function toUnicode(str) {
  var output = ""
  for (var i = 0, l = str.length; i < l; i++) {
    var c = str.charCodeAt(i)
    if (0xD800 <= c && c <= 0xDBFF) {
      // Lead surrogate: keep it only if a trail surrogate follows.
      // At the end of the string charCodeAt() returns NaN, which fails the test.
      var nextC = str.charCodeAt(i + 1)
      if (0xDC00 <= nextC && nextC <= 0xDFFF) {
        output += str[i] + str[++i]
      } else {
        output += "\uFFFD"
      }
    } else if (0xDC00 <= c && c <= 0xDFFF) {
      // Lone trail surrogate
      output += "\uFFFD"
    } else {
      output += str[i]
    }
  }
  return output
}
toUnicode("\ud800a")      // "\uFFFDa"
toUnicode("\ud800\udc01") // "\ud800\udc01" (valid pair, unchanged)
toUnicode("\udc00a")      // "\uFFFDa"
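For what the conversion to an iterator might look like once generators
are available, here is a rough sketch. It assumes ES6 generator syntax
(function*), and the name codePoints is made up for illustration, not
anything proposed. It yields one string per scalar value and U+FFFD for
each lone surrogate, which is the same policy as toUnicode():

```javascript
// Sketch only: assumes an engine with ES6 generators.
// "codePoints" is a hypothetical name used for illustration.
function* codePoints(str) {
  for (var i = 0; i < str.length; i++) {
    var c = str.charCodeAt(i)
    if (0xD800 <= c && c <= 0xDBFF) {
      // Lead surrogate: look ahead for a trail surrogate.
      // charCodeAt() returns NaN past the end, which fails the range test.
      var next = str.charCodeAt(i + 1)
      if (0xDC00 <= next && next <= 0xDFFF) {
        yield str.charAt(i) + str.charAt(i + 1)
        i++ // skip the trail surrogate we just consumed
      } else {
        yield "\uFFFD"
      }
    } else if (0xDC00 <= c && c <= 0xDFFF) {
      // Lone trail surrogate
      yield "\uFFFD"
    } else {
      yield str.charAt(i)
    }
  }
}
```

Where for-of is supported, for (var ch of codePoints(s)) { ... } walks
the string forward. A backward iterator would need the mirror-image
check (trail surrogate first, then look behind for a lead).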
--
http://annevankesteren.nl/
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss