With the new opcode statistics added by Cameron, we can see that many of the bytecode instructions executed on both SunSpider and real sites are "load" instructions. On SunSpider, it's around 19%, and on many real sites the proportion is 10-20%.
The "load" opcode's only purpose is to move a constant pool item into a VM register, so that other instructions can operate on it. We could save the dispatch cost for these instructions, and an extra memory access, by having opcodes that can operate on inputs from the constant pool directly. In a way, this merges the "load" instructions with the instructions that will actually use the value. Here is a rough outline for how to make this work. 1) Introduce a new class, OperandID, which has all the functionality of RegisterID, plus a flag indicating whether the operand is a constant pool index or register index. RegisterID remains as a subclass, and a new ConstantID subclass is added. All CodeGenerator::emit...() functions take OperandID parameters for input operands and RegisterID for output operands. 2) The "load" instruction is renamed to "mov_k", this will match the systematic naming scheme for instruction variants later. 3) AST nodes that produce a constant won't emit a load/mov_k instruction any more, instead they produce a constant pool index OperandID (unless they are passed in a dst in which case they emit a mov_k, same as anything else would via moveToDestinationIfNeeded()). 4) A data table lists the opcodes and what variants exist, plus some other properties. For example, to introduce variants of and where either the first or second input operand can be a constant: add inputs:2 rk kr This relates add_rk and add_kr to add. (The places in the code that manually list all the opcodes would also be autogenerated, but not the main interpreter loop). For bitand, we know that when one operand is constant there can be no observable difference to doing the operation with either operand order, so we only need one variant and an indication that the opcode is commutative: bitand inputs:2 rk commutative 5) Each CodeGenerator::emit...() function, on entry, passes its OperandID arguments by reference to an emitOpcode() function, which chooses the best variant opcode, emits constant loads if necessary (because no available format can accept the right k operands), and substitutes the OperandID arguments. Thus, loads are generated at the latest possible point. Note, in theory if all inputs are k-operands we can in most cases do constant folding in step 5, as part of codegen. The only trick here is that at emitSomeOp() time we no longer know if the destination is mandatory, so at best we can fold down to a mov_k, but perhaps we can pass in some indication of whether dst is mandatory. This scheme should be complementary with Cameron's plans for a pattern- matching peephole optimizer. We considered the possibility of a tile- matching code generator (which could handle the equivalent of k- operands and various peephole optimizations) but we decided that it sounded too complicated for our needs. Regards, Maciej _______________________________________________ webkit-dev mailing list [email protected] http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev

