RE: UTF-16 vs UTF-32

2011-05-16 Thread Phillips, Addison
> > Personally, I think UTF16 is more prone to error than either UTF8 or > > UTF32 -- in UTF32 there is a one-to-one correspondence > > One-to-one correspondence between string code units and Unicode codepoints. > > Unfortunately, "Unicode codepoint" is only a useful concept for some > scripts..

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 7:18 PM, Brendan Eich wrote: > On May 16, 2011, at 5:18 PM, Allen Wirfs-Brock wrote: > >> On May 16, 2011, at 5:06 PM, Brendan Eich wrote: >> >>> On May 16, 2011, at 2:07 PM, Boris Zbarsky wrote: >>> That said, defining JS strings and DOMString differently seems like a

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 7:53 PM, Mike Samuel wrote: > 2011/5/16 Allen Wirfs-Brock : >> It already isn't the case that eval(x)===JSON.parse(x). >> See http://timelessrepo.com/json-isnt-a-javascript-subset > > I'm aware of that hole. That doesn't mean that we should break the > relationship for code

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 7:22 PM, Boris Zbarsky wrote: > On 5/16/11 10:20 PM, Allen Wirfs-Brock wrote: >>> That seems like it'll make it very easy to introduce strings that are a mix >>> of the two via concatenation >> >> Some implementations already use tree structures to represent strings that

Re: arrow syntax unnecessary and the idea that "function" is too long

2011-05-16 Thread Peter Michaux
On Mon, May 9, 2011 at 6:02 PM, Brendan Eich wrote: > Yes, and we could add call/cc to make (some) compiler writers even happier. > But users would shoot off all their toes with this footgun, and some > implementors would be hard-pressed to support it. The point is *not* to do > any one chang

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Allen Wirfs-Brock : > It already isn't the case that eval(x)===JSON.parse(x). > See http://timelessrepo.com/json-isnt-a-javascript-subset I'm aware of that hole. That doesn't mean that we should break the relationship for code that doesn't error out in either.

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
It already isn't the case that eval(x)===JSON.parse(x). See http://timelessrepo.com/json-isnt-a-javascript-subset On May 16, 2011, at 6:51 PM, Mike Samuel wrote: > 2011/5/16 Allen Wirfs-Brock : >> If the string is written as "\ud800\udc00\u0061" the 'a' will be at offset >> 2, even in the new

Re: Full Unicode strings strawman

2011-05-16 Thread Boris Zbarsky
On 5/16/11 10:20 PM, Allen Wirfs-Brock wrote: That seems like it'll make it very easy to introduce strings that are a mix of the two via concatenation Some implementations already use tree structures to represent strings that are built via concatenation. It would be straightforward to h

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 6:41 PM, Boris Zbarsky wrote: > On 5/16/11 6:18 PM, Allen Wirfs-Brock wrote: >> If the string is written as "\ud800\udc00\u0061" the 'a' will be at >> offset 2, even in the new proposal. It would only be at offset 1 if it >> was written as "\u+01\u+61" (using the litera

Re: Full Unicode strings strawman

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 5:18 PM, Allen Wirfs-Brock wrote: > On May 16, 2011, at 5:06 PM, Brendan Eich wrote: > >> On May 16, 2011, at 2:07 PM, Boris Zbarsky wrote: >> >>> That said, defining JS strings and DOMString differently seems like a >>> recipe for serious author confusion (e.g. actually usi

Re: UTF-16 vs UTF-32

2011-05-16 Thread Boris Zbarsky
On 5/16/11 9:07 PM, John Tamplin wrote: Personally, I think UTF16 is more prone to error than either UTF8 or UTF32 -- in UTF32 there is a one-to-one correspondence One-to-one correspondence between string code units and Unicode codepoints. Unfortunately, "Unicode codepoint" is only a useful

Re: Full Unicode strings strawman

2011-05-16 Thread Boris Zbarsky
On 5/16/11 7:21 PM, Shawn Steele wrote: In other words I don’t think you can get the engine to be completely UTF-32. At least not without declaring a page as being UTF-32. For what it's worth, HTML5 does not support declaring a page as UTF-32 at all. We're removing our existing support for th

Re: Full Unicode strings strawman

2011-05-16 Thread Boris Zbarsky
On 5/16/11 7:05 PM, Allen Wirfs-Brock wrote: If you allow storage of such, then you're allowing mixing Unicode strings and "something else" (whatever the something else is), with most likely bad results. Most simply, assigning a DOMString containing surrogates to a JS string should collap

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Allen Wirfs-Brock : > If the string is written as "\ud800\udc00\u0061" the 'a' will be at offset > 2, even in the new proposal. It would only be at offset 1 if it was written > as "\u+01\u+61" (using the literal notation from the proposal). Under this scheme, eval(' "\\

Re: Full Unicode strings strawman

2011-05-16 Thread Boris Zbarsky
On 5/16/11 6:18 PM, Allen Wirfs-Brock wrote: If the string is written as "\ud800\udc00\u0061" the 'a' will be at offset 2, even in the new proposal. It would only be at offset 1 if it was written as "\u+01\u+61" (using the literal notation from the proposal). Ah, so in the proposal strin

RE: UTF-16 vs UTF-32

2011-05-16 Thread Shawn Steele
Allen said: > One reason is that none of the built-in string methods understand surrogate > pairs. If you want to do any string processing that recognizes such pairs you > have to either handle such pairs as multi-character sequences or do your own > character by character processing. (Which I

RE: UTF-16 vs UTF-32

2011-05-16 Thread Shawn Steele
And why do you care to iterate by code unit? I mean, sure it seems useful, but how useful? Do you want to stop in the middle of Ä? You probably don't stop in the middle of Ä. I have no problem with regular expressions or APIs that use 32bit (well, 21bit) values, but I think existing apps al

Re: UTF-16 vs UTF-32

2011-05-16 Thread John Tamplin
On Mon, May 16, 2011 at 8:42 PM, Shawn Steele wrote: > It's clear why we want to support the full Unicode range, but it's less > clear to me why UTF-32 would be desirable internally. (Sure, it'd be nice > for conversion types). > > What UTF-32 has that UTF-16 doesn't is the ability to walk a stri

Re: UTF-16 vs UTF-32

2011-05-16 Thread Mike Samuel
2011/5/16 Shawn Steele : > It's clear why we want to support the full Unicode range, but it's less clear > to me why UTF-32 would be desirable internally.  (Sure, it'd be nice for > conversion types). I don't think anyone says that UTF-32 is desirable *internally*. We're talking about the API of

Re: UTF-16 vs UTF-32

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 5:42 PM, Shawn Steele wrote: > It's clear why we want to support the full Unicode range, but it's less clear > to me why UTF-32 would be desirable internally. (Sure, it'd be nice for > conversion types). > > What UTF-32 has that UTF-16 doesn't is the ability to walk a stri

UTF-16 vs UTF-32

2011-05-16 Thread Shawn Steele
It's clear why we want to support the full Unicode range, but it's less clear to me why UTF-32 would be desirable internally. (Sure, it'd be nice for conversion types). What UTF-32 has that UTF-16 doesn't is the ability to walk a string without accidentally chopping up a surrogate pair. Howev
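The point about walking a string without chopping up a surrogate pair can be sketched in plain JavaScript. This is only an illustration, not code from the thread: `codePointsOf` is a hypothetical helper name, and it assumes that lone surrogates should simply be passed through as their own values.

```javascript
// Sketch: walk a UTF-16 string by code point rather than by code unit.
// codePointsOf is a hypothetical helper, not a built-in.
function codePointsOf(str) {
  var result = [];
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    // A high surrogate followed by a low surrogate forms one code point.
    if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < str.length) {
      var next = str.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        result.push(0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00));
        i++; // skip the low surrogate we just consumed
        continue;
      }
    }
    // BMP code unit, or an unpaired surrogate passed through unchanged.
    result.push(unit);
  }
  return result;
}

console.log(codePointsOf("\ud800\udc00a")); // [ 65536, 97 ]
```

The loop never splits a well-formed pair, which is the property attributed to UTF-32 in the message above; the cost is that positions in the result no longer match code-unit offsets in the original string.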

Re: Source file encoding

2011-05-16 Thread Mike Samuel
2011/5/16 Allen Wirfs-Brock : > On May 16, 2011, at 5:11 PM, Mike Samuel wrote: > >> 2011/5/16 Allen Wirfs-Brock : >> >>> The actual program might be encoded in EBCDIC or Hollerith card codes as >>> long as there is a mapping of the characters actually used in that encoding >>> to Unicode charact

RE: Full Unicode strings strawman

2011-05-16 Thread Shawn Steele
> I think you'll find that the actual JS engines are currently UCS-2 centric. > The surrounding browser environments are doing the UTF-16 interpretation. > That's why you see 𐀀 instead of �� in browser generated display output. There’s no difference. I wouldn’t call Windows C++ WCHAR “UCS-2”, howev

Re: Source file encoding

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 5:11 PM, Mike Samuel wrote: > 2011/5/16 Allen Wirfs-Brock : > >> The actual program might be encoded in EBCDIC or Hollerith card codes as >> long as there is a mapping of the characters actually used in that encoding >> to Unicode characters. > > For ES next, why not mandat

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 5:06 PM, Brendan Eich wrote: > On May 16, 2011, at 2:07 PM, Boris Zbarsky wrote: > >> That said, defining JS strings and DOMString differently seems like a recipe >> for serious author confusion (e.g. actually using JS strings as the >> DOMString binding in ES might be loss

Source file encoding

2011-05-16 Thread Mike Samuel
2011/5/16 Allen Wirfs-Brock : > The actual program might be encoded in EBCDIC or Hollerith card codes as long > as there is a mapping of the characters actually used in that encoding to > Unicode characters. For ES next, why not mandate that all ES harmony source files not embedded in another l

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 4:21 PM, Shawn Steele wrote: > > Not in my proposal! "\ud800\udc00"=== "\u+01" is false in my proposal. > > That’s exactly my problem. I think the engines (or at least the > applications written in JavaScript) are still UTF-16-centric and that they’ll > have d800,

Re: Full Unicode strings strawman

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 2:07 PM, Boris Zbarsky wrote: > That said, defining JS strings and DOMString differently seems like a recipe > for serious author confusion (e.g. actually using JS strings as the DOMString > binding in ES might be lossy, assigning from JS strings to DOMString might be > loss

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 3:36 PM, Mark Davis ☕ wrote: > > all defined Unicode characters. > > That would also not be correct. The defined characters are only about 109K > (more if you consider private use); nowhere near the number of code points, > because there are over 800K code points that are r

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 3:33 PM, Mike Samuel wrote: >> 2011/5/16 Allen Wirfs-Brock : > > Really? > > There is existing code out there that uses particular implementations > for strings. > Should the cost of migrating existing implementations be taken into > account when considering this strawman?

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Allen Wirfs-Brock : > No. That would be a breaking change in the context of the browser.  Programs > creating surrogate that want to be updated to not use surrogate pairs are the > only ones that need to retool.  More likely we are talking about new code > that can be written without h

RE: Full Unicode strings strawman

2011-05-16 Thread Shawn Steele
> Not in my proposal! "\ud800\udc00"=== "\u+01" is false in my proposal. That’s exactly my problem. I think the engines (or at least the applications written in JavaScript) are still UTF-16-centric and that they’ll have d800, dc00 === 1. For example, if they were different, then d80

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 3:22 PM, Shawn Steele wrote: > The problem is that “\UD800\UDC00” === “\U+01”. And if the internal > representation is UTF-32, then they’d have to continue to be the same. And > it’s really hard for them to have the same length if one’s 2 code points and > the other’s

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 2:42 PM, Boris Zbarsky wrote: > On 5/16/11 4:38 PM, Wes Garland wrote: >> Two great things about strings composed of Unicode code points: > ... >> Even though this is a breaking change from ES-5, I support it >> whole-heartedly but I expect breakage to be very limited. Provi

Re: Use cases for WeakMap

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 2:46 PM, Hudson, Rick wrote: > This is all a bit off topic but performance does matter and folks seem to be > underestimating the wealth of community knowledge that exists in this area. Who underestimates? A bunch of us are aware of all this. Allen certainly knows all about

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 2:23 PM, Shawn Steele wrote: > I’m having some (ok, a great deal of) confusion between the DOM Encoding and > the JavaScript encoding and whatever. I’d assumed that if I had a web page > in some encoding, that it was converted to UTF-16 (well, UCS-2), and that’s > what the

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis ☕
Mark *— Il meglio è l’inimico del bene —* On Mon, May 16, 2011 at 15:27, Allen Wirfs-Brock wrote: > See the section of the proposal about String.prototype.charCodeAt > > On May 16, 2011, at 2:20 PM, Mike Samuel wrote: > > > Allen, could you clarify something. > > > > When the strawman says with

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
> 2011/5/16 Allen Wirfs-Brock : > I think you have an extra 0 at a couple of places above... Yep. Sorry. The 0x1 really is supposed to be five digits though. > A DOMString is defined by the DOM spec. to consist of 16-bit elements that > are to be interpreted as a UTF-16 encoding of Unicod

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
See the section of the proposal about String.prototype.charCodeAt On May 16, 2011, at 2:20 PM, Mike Samuel wrote: > Allen, could you clarify something. > > When the strawman says without mentioning "codepoint" > > "The String type is the set of all finite ordered sequences of zero or > more 16-

Re: Full Unicode strings strawman

2011-05-16 Thread Gillam, Richard
Allen-- I tried to post a pointer to this strawman on this list a few weeks ago, but apparently it didn't reach the list for some reason. Feedback would be appreciated: http://wiki.ecmascript.org/doku.php?id=strawman:support_full_unicode_in_strings I was actually on the committee when the lan

RE: Full Unicode strings strawman

2011-05-16 Thread Shawn Steele
The problem is that “\UD800\UDC00” === “\U+01”. And if the internal representation is UTF-32, then they’d have to continue to be the same. And it’s really hard for them to have the same length if one’s 2 code points and the other’s 1 code point. -Shawn From: es-discuss-boun...@mozilla.or

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 2:19 PM, Mark Davis ☕ wrote: > I'm quite sympathetic to the goal, but the proposal does represent a > significant breaking change. The problem, as Shawn points out, is with > indexing. Before, the strings were defined as UTF16. Not by the ECMAScript specification > > Take

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 2:16 PM, Mike Samuel wrote: > 2011/5/16 Boris Zbarsky : >> On 5/16/11 4:37 PM, Mike Samuel wrote: >>> >>> > >> There is no Unicode codepoint U+D800 or U+DC00. See >> http://www.unicode.org/charts/PDF/UD800.pdf and >> http://www.unicode.org/charts/PDF/UDC00.pdf which clearl

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis ☕
In practice, the supplemental code points don't really cause problems in Unicode strings. Most implementations just treat them as if they were unassigned. The only important issue is that *when* they are converted to UTF-xx for storage or transmission, they need to be handled; typically by converti

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis ☕
A correction. U+D800 is indeed a code point: http://www.unicode.org/glossary/#Code_Point. It is defined for usage in Unicode Strings (see http://www.unicode.org/glossary/#Unicode_String) because often it is useful for implementations to be able to allow it in processing. It does, however, have a

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 1:38 PM, Wes Garland wrote: > Allen; > > Thanks for putting this together. We use Unicode data extensively in both > our web and server-side applications, and being forced to deal with UTF-16 > surrogate pairs directly -- rather than letting the String implementation deal

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 1:37 PM, Mike Samuel wrote: > 2011/5/16 Allen Wirfs-Brock : >> >> ... >> >>> How would >>> >>>var oneSupplemental = "\U0001"; > >> I don't think I understand your literal notation. \U is a 32-bit character >> value? In whose implementation? > > Sorry, please read t

Re: Full Unicode strings strawman

2011-05-16 Thread Boris Zbarsky
On 5/16/11 5:23 PM, Shawn Steele wrote: I’m having some (ok, a great deal of) confusion between the DOM Encoding and the JavaScript encoding and whatever. I’d assumed that if I had a web page in some encoding, that it was converted to UTF-16 (well, UCS-2), and that’s what the JavaScript engine di

RE: Full Unicode strings strawman

2011-05-16 Thread Shawn Steele
I’d go further and also say there isn’t really a very big practical difference between: · A UCS-2 implementation whose data is rendered by a completely Unicode aware rendering engine, and · A UTF-16 implementation. In fact I’m unaware of any UCS-2/UTF-16 conversion functionalit

Re: Full Unicode strings strawman

2011-05-16 Thread Boris Zbarsky
On 5/16/11 5:16 PM, Mike Samuel wrote: The strawman says "The String type is the set of all finite ordered sequences of zero or more 21-bit unsigned integer values (“elements”)." Yeah, that's not the same thing as an actual Unicode string, and requires handling of all sorts of "what if someon

RE: Use cases for WeakMap

2011-05-16 Thread Hudson, Rick
The other hack is to make the hash value itself opaque as I believe weak maps could. That way you can use the address as the hash at the cost of having to lazily rehash after a GC. Typically the Cheney scan of a hash table would randomly reorder objects in memory and this hurt memory performan

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis ☕
In terms of implementation capabilities, there isn't really a significant practical difference between - a UCS-2 implementation, and - a UTF-16 implementation that doesn't have supplemental characters in its supported repertoire. Mark *— Il meglio è l’inimico del bene —* On Mon, May

Re: Full Unicode strings strawman

2011-05-16 Thread Boris Zbarsky
On 5/16/11 4:38 PM, Wes Garland wrote: Two great things about strings composed of Unicode code points: ... Even though this is a breaking change from ES-5, I support it whole-heartedly but I expect breakage to be very limited. Provided that the implementation does not restrict the storage of

RE: Full Unicode strings strawman

2011-05-16 Thread Shawn Steele
I think the problem isn’t so much that the spec used UCS-2, but rather that some implementations used UTF-16 instead as that is more convenient in many cases. To the application developer, it’s difficult to tell the difference between UCS-2 and UTF-16 if I can use a regular expression to find D

Re: Full Unicode strings strawman

2011-05-16 Thread 신정식, 申政湜
On Mon, May 16, 2011 at 2:19 PM, Mark Davis ☕ wrote: > I'm quite sympathetic to the goal, but the proposal does represent a > significant breaking change. The problem, as Shawn points out, is with > indexing. Before, the strings were defined as UTF16. I agree with what Mark wrote except that the pre

RE: Full Unicode strings strawman

2011-05-16 Thread Shawn Steele
I'm having some (ok, a great deal of) confusion between the DOM Encoding and the JavaScript encoding and whatever. I'd assumed that if I had a web page in some encoding, that it was converted to UTF-16 (well, UCS-2), and that's what the JavaScript engine did its work on. I confess to not havi

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
Allen, could you clarify something. When the strawman says without mentioning "codepoint" "The String type is the set of all finite ordered sequences of zero or more 16-bit\b\b\b\b\b\b 21-bit unsigned integer values (“elements”)." does that mean that String.charCodeAt(...) can return any value i

Re: Full Unicode strings strawman

2011-05-16 Thread Mark Davis ☕
I'm quite sympathetic to the goal, but the proposal does represent a significant breaking change. The problem, as Shawn points out, is with indexing. Before, the strings were defined as UTF16. Take a sample string "\ud800\udc00\u0061" = "\u{1}\u{61}". Right now, the 'a' (the \u{61}) is at offs
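The indexing concern can be made concrete with the sample string from this message. The sketch below assumes a modern engine where `Array.from` iterates a string by code point (an ES2015 feature that postdates this thread); the variable names are illustrative only.

```javascript
// Sketch: the same string indexed by UTF-16 code unit and by code point.
var s = "\ud800\udc00\u0061";

// Today's behavior: three 16-bit code units, so 'a' sits at index 2.
console.log(s.length);       // 3
console.log(s.indexOf("a")); // 2

// Indexed by code point, the surrogate pair counts once, so 'a' moves to 1.
var byCodePoint = Array.from(s);
console.log(byCodePoint.length);       // 2
console.log(byCodePoint.indexOf("a")); // 1
```

Any existing code that stored the offset 2 would point past the end of the two-element view, which is the kind of breakage being described.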

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Boris Zbarsky : > On 5/16/11 4:37 PM, Mike Samuel wrote: >> >> You might have.  If you reject my assertion about option 2 above, then >> to clarify, >> The UTF-16 representation of codepoint U+1 is the code-unit pair >> U+D8000 U+DC000. > > No.  The UTF-16 representation of codepoint

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Wes Garland : > Mike Samuel, can you explain why you are en/decoding UTF-16 when > round-tripping through the DOM? I was UTF-16 encoding it because there will be host objects in browsers that assume a UTF-16 encoding and so a possibility for orphaned surrogates in internal representation

Re: Full Unicode strings strawman

2011-05-16 Thread Boris Zbarsky
On 5/16/11 4:37 PM, Mike Samuel wrote: You might have. If you reject my assertion about option 2 above, then to clarify, The UTF-16 representation of codepoint U+1 is the code-unit pair U+D8000 U+DC000. No. The UTF-16 representation of codepoint U+1 is the code-unit pair 0xD800 0xDC0
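For reference, the arithmetic behind a UTF-16 surrogate pair can be sketched as follows. The helper name `toSurrogatePair` is hypothetical, and U+10000 is used as the example value because it is the code point that the pair \ud800\udc00, quoted elsewhere in this thread, decodes to.

```javascript
// Sketch: split a supplementary code point into its UTF-16 surrogate pair.
// toSurrogatePair is a hypothetical helper name, not part of any API.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError("not a supplementary code point");
  }
  var offset = codePoint - 0x10000;    // 20-bit value
  var high = 0xD800 + (offset >> 10);  // top 10 bits go in the high surrogate
  var low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits go in the low surrogate
  return [high, low];
}

var pair = toSurrogatePair(0x10000);
console.log(pair.map(function (u) {
  return "0x" + u.toString(16).toUpperCase();
}).join(" "));
// prints "0xD800 0xDC00"
```

Note that the results are 16-bit code units, not code points, which is why they are written with 0x rather than U+ notation here.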

Re: Use cases for WeakMap

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 11:37 AM, Oliver Hunt wrote: > The same logic applies to object hashcodes -- an object must always produce > the same hashcode which means it will need to store it -- having a secondary > map doesn't help you, because that map will itself require a hashcode. > However makin

Re: Full Unicode strings strawman

2011-05-16 Thread Wes Garland
Allen; Thanks for putting this together. We use Unicode data extensively in both our web and server-side applications, and being forced to deal with UTF-16 surrogate pairs directly -- rather than letting the String implementation deal with them -- is a constant source of mild pain. At first blush

Re: Non-method functions and this

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 1:36 PM, Axel Rauschmayer wrote: > Thanks for explaining the details. With "temporary result", I meant what you > call "object reference". Similar to Common Lisp places (setf etc.), I > suppose. If you use the conditional operator, you can force the dereferencing > of an obj

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Allen Wirfs-Brock : > > On May 16, 2011, at 12:28 PM, Mike Samuel wrote: > > > DOMString is defined at > > http://www.w3.org/TR/DOM-Level-2-Core/core.html#ID-C74D1578 thus > > > >    Type Definition DOMString > >    A DOMString is a sequence of 16-bit units. > > > > so how would round tri

Re: Non-method functions and this

2011-05-16 Thread Axel Rauschmayer
> This breaks the web, so regardless of whether it's worth it, it's just not > possible. Harmony and non-Harmony share a heap with the same built-in > globals. If you change Function.prototype.call and Function.prototype.apply, > you break the web. I suspect that a new type for the new-style fu

Re: Use cases for WeakMap

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 1:15 PM, Oliver Hunt wrote: > On May 16, 2011, at 1:07 PM, Brendan Eich wrote: >> WeakMaps satisfy this more general non-enumerable object-keyed cache >> use-case well, too -- at least as far as I can tell. We'll be building on >> the Firefox nightly implementation to find ou

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 12:28 PM, Mike Samuel wrote: > > DOMString is defined at > http://www.w3.org/TR/DOM-Level-2-Core/core.html#ID-C74D1578 thus > >Type Definition DOMString >A DOMString is a sequence of 16-bit units. > > so how would round tripping a JS string through a DOM string work

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 11:34 AM, Shawn Steele wrote: > Thanks for making a strawman (see my very last sentence below as it may impact the interpretation of some of the rest of these responses) > > Unicode Escape Sequences > Is it possible for U+ to accept either 4, 5, or 6 digit sequences? >

Re: Use cases for WeakMap

2011-05-16 Thread Oliver Hunt
On May 16, 2011, at 1:07 PM, Brendan Eich wrote: > WeakMaps satisfy this more general non-enumerable object-keyed cache use-case > well, too -- at least as far as I can tell. We'll be building on the Firefox > nightly implementation to find out more. I think my problem with this is that WeakMap

Re: Use cases for WeakMap

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 11:30 AM, Erik Corry wrote: > 2011/5/16 Andreas Gal : >> >> Even if you want to store weak-map specific meta data per object, nobody >> would store that directly in the object. That's needlessly cruel on the cache >> since >>99.9% of objects never end up in a weak map. Instea

Re: I noted some open issues on "Classes with Trait Composition"

2011-05-16 Thread Dmitry A. Soshnikov
On 16.05.2011 19:02, Brendan Eich wrote: On May 16, 2011, at 4:54 AM, Dmitry A. Soshnikov wrote: On 16.05.2011 10:49, Brendan Eich wrote: On May 15, 2011, at 10:01 PM, Brendan Eich wrote: http://wiki.ecmascript.org/doku.php?id=strawman:classes_with_trait_composition#open_issues This looks

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Shawn Steele : >> > myString.replace( /[\ud800-\udbff](?![\udc00-\u])/g, "\ufffd") >> >    .replace( /(^|[^\ud800-\udbff])([\udc00-\ud])/g, "\ufffd") My example code has typos. It should have read myString.replace( /[\ud800-\udbff](?![\udc00-\udfff])/g, "\ufffd")    .r
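The goal of the snippet above, replacing orphaned surrogates with U+FFFD, can also be sketched in a single pass. This is a different technique from the two chained `replace` calls quoted in the thread, offered only as an illustration; `replaceOrphanSurrogates` is a hypothetical name.

```javascript
// Sketch: replace unpaired surrogates with U+FFFD in one pass.
// replaceOrphanSurrogates is a hypothetical helper, not code from the thread.
function replaceOrphanSurrogates(str) {
  // The first alternative consumes a well-formed high+low pair;
  // the second matches any surrogate left over, which must be unpaired.
  var pattern = /[\ud800-\udbff][\udc00-\udfff]|[\ud800-\udfff]/g;
  return str.replace(pattern, function (match) {
    return match.length === 2 ? match : "\ufffd";
  });
}

console.log(replaceOrphanSurrogates("a\ud800b") === "a\ufffdb"); // true
```

Because well-formed pairs are matched first and returned unchanged, consecutive orphans such as two low surrogates in a row are each replaced without needing lookahead or a captured prefix.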

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Allen Wirfs-Brock : > > On May 16, 2011, at 11:30 AM, Mike Samuel wrote: > >> 2011/5/16 Allen Wirfs-Brock : >>> I tried to post a pointer to this strawman on this list a few weeks ago, but >>> apparently it didn't reach the list for some reason. >>> Feedback would be appreciated: >>> htt

RE: Full Unicode strings strawman

2011-05-16 Thread Shawn Steele
> > myString.replace( /[\ud800-\udbff](?![\udc00-\u])/g, "\ufffd") > >.replace( /(^|[^\ud800-\udbff])([\udc00-\ud])/g, "\ufffd") > Exactly as it currently does, assuming it was applied to a string that didn't > contain any codepoints greater than \u. > If the string contained any

Re: Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
On May 16, 2011, at 11:30 AM, Mike Samuel wrote: > 2011/5/16 Allen Wirfs-Brock : >> I tried to post a pointer to this strawman on this list a few weeks ago, but >> apparently it didn't reach the list for some reason. >> Feedback would be appreciated: >> http://wiki.ecmascript.org/doku.php?id=str

Re: Use cases for WeakMap

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 11:40 AM, Brendan Eich wrote: > Frozen objects are not observably extensible. I argued that we want an > implementation option to put their shallowly-frozen state (values at least) > in read-only memory. I have not seen anything like an argument why this > option should be f

Re: Use cases for WeakMap

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 11:37 AM, Oliver Hunt wrote: > On May 16, 2011, at 11:30 AM, Erik Corry wrote: > >> 2011/5/16 Andreas Gal : >>> >>> Even if you want to store weak-map specific meta data per object, nobody >>> would store that directly in the object. That's needlessly cruel on the >>> cache

Re: Use cases for WeakMap

2011-05-16 Thread Oliver Hunt
On May 16, 2011, at 11:30 AM, Erik Corry wrote: > 2011/5/16 Andreas Gal : >> >> Even if you want to store weak-map specific meta data per object, nobody >> would store that directly in the object. That's needlessly cruel on the cache >> since >>99.9% of objects never end up in a weak map. Inste

Re: Use cases for WeakMap

2011-05-16 Thread Erik Corry
On May 16, 2011 7:30 PM, "Brendan Eich" wrote: > > On May 16, 2011, at 12:43 AM, Erik Corry wrote: >> Perhaps we have a nomenclature problem with 'value'. Do you regard >> things that have typeof(o) == 'object' as values? > > > Certainly. This is the source of our confusion. I was meaning 'pri

RE: Full Unicode strings strawman

2011-05-16 Thread Shawn Steele
Thanks for making a strawman Unicode Escape Sequences Is it possible for U+ to accept either 4, 5, or 6 digit sequences? Typically when I encounter U+ notation the leading zero is omitted, and I see BMP characters quite often. Obviously BMP could use the U notation, however it seems like it'

Re: Use cases for WeakMap

2011-05-16 Thread Erik Corry
2011/5/16 Andreas Gal : > > Even if you want to store weak-map specific meta data per object, nobody > would store that directly in the object. That's needlessly cruel on the cache > since >>99.9% of objects never end up in a weak map. Instead one would locate > that meta data outside the object

Re: Full Unicode strings strawman

2011-05-16 Thread Mike Samuel
2011/5/16 Allen Wirfs-Brock : > I tried to post a pointer to this strawman on this list a few weeks ago, but > apparently it didn't reach the list for some reason. > Feedback would be appreciated: > http://wiki.ecmascript.org/doku.php?id=strawman:support_full_unicode_in_strings Will this change t

Re: Use cases for WeakMap

2011-05-16 Thread Boris Zbarsky
On 5/16/11 1:48 PM, Brendan Eich wrote: For the use case mentioned by Boris in this thread, where a FF extension needs to attach metadata to an object it doesn't seem likely that the mapping will get lost and need to be GCed before the objects that have the metadata attached. The add-on easily

Full Unicode strings strawman

2011-05-16 Thread Allen Wirfs-Brock
I tried to post a pointer to this strawman on this list a few weeks ago, but apparently it didn't reach the list for some reason. Feedback would be appreciated: http://wiki.ecmascript.org/doku.php?id=strawman:support_full_unicode_in_strings Allen

Re: Use cases for WeakMap

2011-05-16 Thread Andreas Gal
Even if you want to store weak-map specific meta data per object, nobody would store that directly in the object. That's needlessly cruel on the cache since >>99.9% of objects never end up in a weak map. Instead one would locate that meta data outside the object in a direct mapped dense area (li

Re: Use cases for WeakMap

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 1:52 AM, Erik Corry wrote: > My question was, do the use cases have both the GC of the map and the > key triggering the GC of the value or is the GC of the key the > important one and GC of the map not that common/important etc. I'm not sure, we will have to measure to give a

Re: Use cases for WeakMap

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 12:47 AM, Erik Corry wrote: > I think the objects used as keys in weak maps need to be somehow > annotated with this information so that the GC can clean up the weak > maps when the keys die. This means that if you take an object that is > frozen and use it as a key in a weak

Re: Use cases for WeakMap

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 12:43 AM, Erik Corry wrote: >>> Elided for clarity :-) It can be implemented with private names or >>> WeakMaps. >> >> Oh, ok -- you wrote "normal JS object" and that seemed to preclude new stuff. >> >> Yes, we can make a value -> value map as a library abstraction, but it's

Re: Non-method functions and this

2011-05-16 Thread Kyle Simpson
d) At runtime, a function invoked as a non-method gets a dynamic ReferenceError if it tries to refer to |this|. This would just be kind of obnoxious, I think, since if the function wants to test whether it's been called as a non-method it has to do something like let nonMethod = false; t

Re: Strict mode eval

2011-05-16 Thread Andreas Rossberg
Sure, sounds good. I will look into it. Thanks, /Andreas On 14 May 2011 03:18, Mark S. Miller wrote: > I think this is the kind of incremental refinement of the details of > existing features that we can legitimately consider after May without > setting a bad precedent. Would you be interested

Re: Non-method functions and this

2011-05-16 Thread David Herman
> - "self" is just another parameter, it can be called anything. > - Function.prototype.call() and Function.prototype.apply() would have one > parameter less. > - IIRC, this is more or less how Python works. > - Probably not worth it, migration-cost-wise. This breaks the web, so regardless of whe

Re: Non-method functions and this

2011-05-16 Thread Axel Rauschmayer
Yeah, sorry, I was wrong about the need for two constructs. Example code: > var obj = { > id: "ac3fb", > method: function(self, msgs) { > msgs.forEach(function(msg) { > console.log(self.id+": "+msg); > }); > } > }; > > // Invoke as method > obj.method([ "h

Re: I noted some open issues on "Classes with Trait Composition"

2011-05-16 Thread Brendan Eich
On May 16, 2011, at 4:54 AM, Dmitry A. Soshnikov wrote: > On 16.05.2011 10:49, Brendan Eich wrote: >> >> On May 15, 2011, at 10:01 PM, Brendan Eich wrote: >> >>> http://wiki.ecmascript.org/doku.php?id=strawman:classes_with_trait_composition#open_issues >>> >>> This looks pretty good at a glance

Re: Non-method functions and this

2011-05-16 Thread Sam Tobin-Hochstadt
On Mon, May 16, 2011 at 10:15 AM, David Herman wrote: > > d) At runtime, a function invoked as a non-method gets a dynamic > ReferenceError if it tries to refer to |this|. This would just be kind of > obnoxious, I think, since if the function wants to test whether it's been > called as a non-me

Re: Non-method functions and this

2011-05-16 Thread David Herman
>> How do you define "non-method"? > > A function that is not invoked as method. Right now, the same kind of > construct is used for both true functions and methods. I’m proposing a new > construct (similar to the distinction that Python makes): a function that > does not have an implicit |this

Re: I noted some open issues on "Classes with Trait Composition"

2011-05-16 Thread Dmitry A. Soshnikov
On 16.05.2011 10:49, Brendan Eich wrote: On May 15, 2011, at 10:01 PM, Brendan Eich wrote: http://wiki.ecmascript.org/doku.php?id=strawman:classes_with_trait_composition#open_issues This looks pretty good at a glance, but it's a /lot/, and it's new. Looking closer, I have to say something

Re: Use cases for WeakMap

2011-05-16 Thread Erik Corry
2011/5/16 Brendan Eich : > On May 16, 2011, at 12:11 AM, Erik Corry wrote: > >> 2011/5/15 Brendan Eich : >>> Besides attaching metadata, weak maps are important for remembering the >>> wrapper or membrane for a given (frozen or not, built-in or "host", not to >>> be mutated) object identity. Mark a

Re: arrow syntax unnecessary and the idea that "function" is too long

2011-05-16 Thread Dmitry A. Soshnikov
On 16.05.2011 1:47, Brendan Eich wrote: On May 15, 2011, at 1:54 PM, Dmitry A. Soshnikov wrote: See last reply for more on joining. It occurs to me you thought scope chain varying in the context of a pure hash-rocket such as #->42 means that function cannot be joined, but since it is pure, it

Re: Use cases for WeakMap

2011-05-16 Thread Erik Corry
2011/5/16 Brendan Eich : > On May 16, 2011, at 12:18 AM, Erik Corry wrote: > >> 2011/5/16 Brendan Eich : >>> On May 16, 2011, at 12:01 AM, Brendan Eich wrote: >>> On May 15, 2011, at 11:55 PM, Erik Corry wrote: > 2011/5/16 Brendan Eich : >> Not if the object is frozen. > >
