On Oct 6, 2014, at 3:49 PM, Boris Zbarsky <bzbar...@mit.edu> wrote:

>> Is there any particular place where you feel there is tension between the 
>> goals of memory usage and performance?
> 
> I don't know yet.  I mean, for charAt, sure.  ;)

JS engines have been using ropes for quite some time now, which means that 
charAt can be O(log n) without anyone noticing.

You could imagine a JS string representation that converts something like WTF-8 
to UTF-16 / Latin1 upon character indexing. Or you could do something fancier 
(and probably not worth it) by ropifying the original WTF-8 string, reusing 
pointers into the same immutable buffer but adding character index range info 
on rope nodes. Either way, this doesn’t sound like a change that is going to 
happen in SpiderMonkey because of Servo.

>> One annoying problem that comes up with using UTF-8 in Servo and UTF-16 / 
>> Latin1 in SpiderMonkey is that repeated accesses of a DOM property would 
>> cause repeated copying of the same string, so you would need to cache the 
>> copies to ensure that JS programs have the expected time / space complexity.
> 
> You need such a cache anyway for dumb performance reasons: it's hard to be 
> competitive on benchmarks otherwise.  Gecko has a one-slot cache. WebKit+JSC 
> had a full-on hashtable last I checked.  Gecko only does this for the 
> shareable string case, for various reasons, but it could be done in general.

The cache gives WebKit and Gecko a constant-factor improvement, but even without 
the caching, programs would still have the same asymptotic complexity, at least 
as I understand your description of how this works in Gecko. With UTF-8 
conversions on the Servo / JS boundary, we would need to cache the conversion of 
every live string to preserve the expected complexity.

Cameron
_______________________________________________
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo
