On 4/28/15 2:59 AM, Glen Huang wrote:
> Looking at the microbenchmark again, for Gecko, getElementById is around
> 300x faster than querySelector('#id'), and even getElementsByClassName is
> faster than it.
This is why one should not write microbenchmarks. ;) Or at least if
one does, examine the results very carefully.
The numbers you see for the getElementById benchmark there are on the
order of 2e9 operations per second, yes? And modern desktop/laptop CPUs
are clocked somewhere in the 2-4 GHz range. So what you're seeing is
that the benchmark claims the operation is performed in 1-2 clock
cycles. This should seem unlikely if you think the operation involves a
hashtable lookup!
What's happening there is that Gecko happens to know at JIT compile time
in this microbenchmark:
1) The bareword lookup is going to end up at the global, because there
is nothing on the scope chain that would shadow the "document" name.
2) The global has an own property named "document" whose getter is
side-effect-free.
3) The return value of the "document" property has only been observed
to be a Document.
4) Looking up "getElementById" on the return value of the "document"
property has consistently found it on Document.prototype.
5) Document.prototype.getElementById is known to be side-effect-free.
6) The return value of getElementById is not used (assigned to a
function-local variable that is then not used).
The upshot of all that is that with a few guards both the
"getElementById" call and the "document" get can be dead-code
eliminated here. And even if you stored the value somewhere persistent
they could both still be loop-hoisted in this case. So what this
getElementById benchmark measures is how fast a loop counter can be
decremented from some starting value to 0. It happens that this can be
done in about 1-2 clock cycles per loop iteration.
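To make the shape of this concrete, here is a sketch of what such a benchmark loop looks like and what it effectively becomes once the elimination happens. The "document" here is a hypothetical Map-backed stub (so the snippet runs outside a browser); in a real page the JIT observes the same pattern on the real document:

```javascript
// Hypothetical stand-in for the browser's id hashtable and document object.
const idTable = new Map([["list", { id: "list" }]]);
const document = {
  getElementById(id) {
    return idTable.get(id) ?? null; // the hashtable lookup
  },
};

// The microbenchmark shape: the result is assigned and never used.
function bench(iterations) {
  for (let i = 0; i < iterations; i++) {
    const el = document.getElementById("list"); // dead value
  }
}

// Once the six conditions above hold, the call and the "document" get can be
// dead-code eliminated, so the loop the CPU actually runs is just a counter:
function benchAfterDce(iterations) {
  for (let i = iterations; i > 0; i--) {} // ~1-2 cycles per iteration
}
```

So the measured "2e9 operations per second" is really the throughput of benchAfterDce, not of any hashtable lookup.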
OK, so what about querySelector("#id") vs getElementsByClassName?
In the former case, loop-hoisting and dead code elimination are
disallowed because querySelector can throw. That means that you can't
eliminate it, and you can't move it past other things that might have
observable side effects (like the counter increment). Arguably this is
a misfeature in the design of querySelector.
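The throwing behavior is easy to see: per spec, querySelector must throw on a selector string that fails to parse. A minimal stub of that contract (hypothetical; a real implementation parses full CSS selectors, and querySelectorStub is an invented name for illustration):

```javascript
// Accept only a simple "#id" selector here; anything else counts as invalid
// for this sketch, mirroring how an unparseable selector throws in a browser.
function querySelectorStub(selector) {
  if (!/^#[A-Za-z][\w-]*$/.test(selector)) {
    throw new SyntaxError("'" + selector + "' is not a valid selector");
  }
  return null; // no match in our empty "document"
}

let threw = false;
try {
  querySelectorStub("##oops"); // invalid selector
} catch (e) {
  threw = e instanceof SyntaxError;
}
```

Because any given call might take that throwing path, the engine cannot prove the call is side-effect-free, which is exactly what rules out eliminating or hoisting it.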
In the latter case, loop-hoisting or dead code elimination can't happen
because Gecko doesn't know enough about what [0] will do so assumes the
worst: that it can have side-effects that can affect what the "document"
getter returns as well as what the getElementsByClassName() call returns.
So there are no shortcuts here; you have to actually do the calls. What
do those calls do?
querySelector does a hashtable lookup for the selector to find a parsed
selector. Then it sets up some state that's needed for selector
matching. Then it detects that the selector's right-hand-most bit has a
simple ID selector and does a fast path that involves looking up that id
in the hashtable and then comparing the selector to the elements that
are returned until one of them matches.
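Roughly, the steps above look like this. Everything here is a toy stand-in for Gecko internals (parsedCache, idTable, and the matcher are all hypothetical names and simplifications):

```javascript
const parsedCache = new Map(); // selector text -> parsed selector
const idTable = new Map();     // id -> array of elements with that id

// Stand-in for full selector matching; a real matcher walks combinators,
// classes, attributes, etc. Here we just accept the candidate.
function fullSelectorMatches(el, selectorText) {
  return true;
}

function querySelectorFastPath(selectorText) {
  // 1) Hashtable lookup for an already-parsed selector.
  let parsed = parsedCache.get(selectorText);
  if (!parsed) {
    // Toy parse: just record whether the rightmost bit is a simple id selector.
    const m = selectorText.match(/#([\w-]+)$/);
    parsed = { text: selectorText, rightmostId: m ? m[1] : null };
    parsedCache.set(selectorText, parsed);
  }
  // 2) Rightmost bit is a simple id: look the id up in the hashtable instead
  //    of walking the whole tree (this also covers "#foo > #bar"-style cases).
  if (parsed.rightmostId !== null) {
    const candidates = idTable.get(parsed.rightmostId) || [];
    // 3) Compare the full selector against each candidate until one matches.
    for (const el of candidates) {
      if (fullSelectorMatches(el, parsed.text)) return el;
    }
    return null;
  }
  return null; // otherwise: fall back to a full tree walk (not sketched)
}

// Hypothetical usage:
idTable.set("list", [{ id: "list" }]);
```

The point of the sketch is the two hashtable lookups (parsed-selector cache, then id table) plus the candidate-matching loop; those are the pieces whose costs show up in the numbers below.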
getElementsByClassName has to do a hashtable lookup on the class name,
then return the result. Then it has to do the [0] (which is actually
surprisingly expensive, by the way, because of the proxy machinery
involved on the JS engine side).
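A toy version of those two steps, for contrast (hypothetical; a real HTMLCollection is a live, proxy-backed object, which is where the surprising cost of the [0] comes from):

```javascript
// Stand-in for the class-name hashtable.
const classTable = new Map([["item", [{ className: "item" }]]]);

function getElementsByClassNameStub(name) {
  return classTable.get(name) || []; // one hashtable lookup, result returned as-is
}

// The "[0]": in this sketch it's a plain array index, but on a real live
// collection it goes through the engine's proxy machinery.
const first = getElementsByClassNameStub("item")[0];
```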
So we _could_ make querySelector faster here by adding another special
case for selectors that are _just_ an id as opposed to the existing
optimization (which works for "#foo > #bar" and similar as well). And
of course the new special case would only work the way you want for
document.querySelector, not element.querySelector; the latter needs to
check for your result being a descendant of the element anyway. It's a
tradeoff between complexity of implementation (which has its own
maintenance _and_ performance costs) and real-life use cases.
Lastly, I'd like to put numbers to this. On this particular testcase,
the querySelector("#list") call takes about 100ns on my hardware: about
300 CPU cycles. We could add that other set of special-casing and get
it down to 70ns (I just checked by implementing it, so this is not a
random guess). At that point you've got two hashtable lookups (which we
could try to make faster, perhaps), the logic to detect that the
optimization can be done at all (which is not that trivial; our selector
representation requires a bunch of checks to ensure that it's just an id
selector), and whatever work is involved in the binding layer. In this
case, those all seem to have about the same cost; about 17-18ns (50 CPU
cycles) each.
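Sanity-checking those numbers, assuming a ~3 GHz clock (an assumption on my part; the figures above only pin the range at 2-4 GHz), so 1 ns is about 3 cycles:

```javascript
const GHZ = 3; // assumed clock speed, in cycles per nanosecond
const nsToCycles = (ns) => ns * GHZ;

// 100 ns for the unoptimized querySelector call -> 300 cycles, as stated.
const originalCycles = nsToCycles(100);

// Four roughly equal pieces (two hashtable lookups, the fast-path detection
// logic, and the binding layer) at ~17.5 ns each account for the 70 ns left
// after the extra special-casing:
const optimizedNs = 4 * 17.5;
```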
So is your use case one where the difference between querySelector
costing 100ns and it costing 70ns actually makes a difference?
> It doesn't look like it benefits much from an eagerly populated hash table?
It benefits a good bit for non-toy documents where avoiding walking the
entire DOM is the important part of the optimization. Again,
microbenchmarks mostly serve to highlight the costs of constant-time
overhead, which is a useful thing to do, as long as you know that's what
you're doing. But for real-life testcases algorithmic complexity can
often be much more important.
-Boris
P.S. Other engines make different complexity/speed tradeoffs here; for
example Safari's querySelector with ".foo" and "#list" is about 60ns on
my hardware. I know they have extra fast paths for both those cases.