[dev-servo] WTF-8 encoding for DOM strings and HTML parsing

2014-10-05 Thread Simon Sapin
We’ve discussed using UTF-8 internally for strings in Servo, but well-formed UTF-8 cannot represent surrogate code points. JavaScript strings, however, can. (They are effectively potentially ill-formed UTF-16.) It’s possible (?) that the Web depends on these surrogates being preserved. So
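
A minimal sketch of the problem described above, assuming Rust (Servo's implementation language) and only standard-library calls: a lone surrogate is legal in a JavaScript-style UTF-16 string but cannot be carried into well-formed UTF-8 without loss. The helper name `is_well_formed_utf16` is hypothetical and chosen only for illustration; it is not taken from Servo.

```rust
/// Hypothetical helper: true if `units` contains no unpaired surrogates.
fn is_well_formed_utf16(units: &[u16]) -> bool {
    char::decode_utf16(units.iter().copied()).all(|r| r.is_ok())
}

fn main() {
    // "a", a lone lead surrogate U+D800, then "b": fine as a JS string.
    let ill_formed: [u16; 3] = [0x0061, 0xD800, 0x0062];
    // U+1F4A9 encoded as a proper surrogate pair.
    let well_formed: [u16; 2] = [0xD83D, 0xDCA9];

    // A surrogate code point is not a Unicode scalar value, so it is not a `char`.
    assert!(char::from_u32(0xD800).is_none());

    // Strict conversion to a UTF-8 `String` fails on the lone surrogate.
    assert!(String::from_utf16(&ill_formed).is_err());
    assert!(String::from_utf16(&well_formed).is_ok());

    // Lossy conversion substitutes U+FFFD, so the surrogate is not preserved.
    assert_eq!(String::from_utf16_lossy(&ill_formed), "a\u{FFFD}b");

    // The bytes that the UTF-8 bit pattern would give U+D800 are rejected as UTF-8.
    assert!(std::str::from_utf8(&[0xED, 0xA0, 0x80]).is_err());

    assert!(!is_well_formed_utf16(&ill_formed));
    assert!(is_well_formed_utf16(&well_formed));
}
```

The lossy path is the one that would change what a page observes, which is presumably why an encoding that can carry unpaired surrogates is being considered here.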

Re: [dev-servo] WTF-8 encoding for DOM strings and HTML parsing

2014-10-05 Thread Cameron Zwarich
If JS can’t handle WTF-8 natively, then what’s the benefit of using it? I am opposed to anything that requires string copies between the DOM and JS, unless there’s some really great overriding reason.
Cameron
On Oct 5, 2014, at 9:26 AM, Simon Sapin simon.sa...@exyr.org wrote:
> We’ve discussed

Re: [dev-servo] WTF-8 encoding for DOM strings and HTML parsing

2014-10-05 Thread Ms2ger
On 10/05/2014 08:27 PM, Cameron Zwarich wrote:
> If JS can’t handle WTF-8 natively, then what’s the benefit of using it? I am opposed to anything that requires string copies between the DOM and JS, unless there’s some really great overriding reason.

Re: [dev-servo] WTF-8 encoding for DOM strings and HTML parsing

2014-10-05 Thread Boris Zbarsky
On 10/5/14, 2:27 PM, Cameron Zwarich wrote:
> I am opposed to anything that requires string copies between the DOM and JS
The only way to do that with SpiderMonkey in its current state is to use JSString for your string type. You cannot safely grab the chars from a SpiderMonkey string and

Re: [dev-servo] WTF-8 encoding for DOM strings and HTML parsing

2014-10-05 Thread Patrick Walton
On 10/5/14 3:08 PM, Boris Zbarsky wrote:
> On 10/5/14, 2:27 PM, Cameron Zwarich wrote:
>> I am opposed to anything that requires string copies between the DOM and JS
> The only way to do that with SpiderMonkey in its current state is to use JSString for your string type. You cannot safely grab the

Re: [dev-servo] WTF-8 encoding for DOM strings and HTML parsing

2014-10-05 Thread Cameron Zwarich
On Oct 5, 2014, at 3:13 PM, Patrick Walton pcwal...@mozilla.com wrote:
> On 10/5/14 3:08 PM, Boris Zbarsky wrote:
>> On 10/5/14, 2:27 PM, Cameron Zwarich wrote:
>>> I am opposed to anything that requires string copies between the DOM and JS
>> The only way to do that with SpiderMonkey in its current

Re: [dev-servo] WTF-8 encoding for DOM strings and HTML parsing

2014-10-05 Thread Cameron Zwarich
On Oct 5, 2014, at 3:08 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 10/5/14, 2:27 PM, Cameron Zwarich wrote:
>> I am opposed to anything that requires string copies between the DOM and JS
> The only way to do that with SpiderMonkey in its current state is to use JSString for your string type.