Joe Wilson <[EMAIL PROTECTED]> wrote:
> --- "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:
> > Sorry for the confusion.
>
> No problem.
>
> For what it's worth, I am also curious as to the final form of the
> VM opcode transformation. The number of opcodes generated by the various
> SQL statements seems to be roughly the same as the old scheme. At this
> point without sub-expression elimination are you seeing any speed
> improvement?
>

I have not even looked at performance yet. I'm assuming that
performance will drop during the conversion process and that
we will have to fight to get it back up to previous levels
after the conversion is complete.

But consider what can be done with a register machine that
we couldn't do with the old stack machine. In a statement
like this:

    SELECT * FROM a NATURAL JOIN b;

Suppose tables a and b have column c in common and unique
columns a1, a2, a3 and b1, b2, b3. With the stack machine,
the algorithm is roughly this:

    foreach entry in a:
      foreach entry in b with b.c==a.c:
        push a.c
        push a.a1
        push a.a2
        push a.a3
        push b.b1
        push b.b2
        push b.b3
        return one row of result
      endforeach
    endforeach

For each result row, all columns had to be pushed onto the
stack. Then the OP_Callback opcode would fire, causing
sqlite3_step() to return SQLITE_ROW. The result columns
would then be available to the sqlite3_column_xxx() routines,
which read those results off of the stack. When sqlite3_step()
is called again, all result columns are popped from the stack
and execution continues with the first operation after the
OP_Callback.

In the register VM, result columns are stored in a consecutive
sequence of registers. It is no longer necessary to pop the
stack of prior results at the start of each sqlite3_step().
So the code can look more like this:

    foreach entry in a:
      r1 = a.c
      r2 = a.a1
      r3 = a.a2
      r4 = a.a3
      foreach entry in b where c=r1:
        r5 = b.b1
        r6 = b.b2
        r7 = b.b3
        return one row of result
      endforeach
    endforeach

When results are stored in registers, the computation of the
first four columns of the result set can be factored out of
the inner loop. If there are 10 matching rows in b for every
row in a, this might result in a significant performance boost.

Do not look for this improvement right away, though. The
first order of business is to get the VM converted over into
a register machine.
Only after that is successfully accomplished will we look
into implementing optimizations such as the above.

--
D. Richard Hipp <[EMAIL PROTECTED]>
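
As a rough illustration of the hoisting described in the message, here is a small Python model of the two loop shapes. It is only a sketch and not SQLite code: the table contents, the function names (make_tables, stack_style_join, register_style_join) and the "loads" counter are all assumptions made up for the example. It simply counts how many column values are fetched for the result rows in each scheme, and it does not count the b.c comparison, which stands in for the index lookup.

```python
# Toy model only: counts result-column loads, not real SQLite VM opcodes.

def make_tables(rows_in_a, matches_per_row):
    """Build hypothetical tables a(c, a1, a2, a3) and b(c, b1, b2, b3)."""
    a = [{"c": i, "a1": f"a1-{i}", "a2": f"a2-{i}", "a3": f"a3-{i}"}
         for i in range(rows_in_a)]
    b = [{"c": i, "b1": f"b1-{i}-{j}", "b2": f"b2-{i}-{j}", "b3": f"b3-{i}-{j}"}
         for i in range(rows_in_a) for j in range(matches_per_row)]
    return a, b


def stack_style_join(a, b):
    """Load all seven result columns inside the inner loop."""
    loads = 0
    rows = []
    for ra in a:
        for rb in b:
            if rb["c"] != ra["c"]:
                continue
            # Every result row pushes the a-columns again.
            stack = [ra["c"], ra["a1"], ra["a2"], ra["a3"],
                     rb["b1"], rb["b2"], rb["b3"]]
            loads += len(stack)
            rows.append(tuple(stack))
    return rows, loads


def register_style_join(a, b):
    """Load the four a-columns once per outer row, then three per match."""
    loads = 0
    rows = []
    for ra in a:
        # Hoisted out of the inner loop: r1..r4 stay valid across matches.
        r1, r2, r3, r4 = ra["c"], ra["a1"], ra["a2"], ra["a3"]
        loads += 4
        for rb in b:
            if rb["c"] != r1:
                continue
            r5, r6, r7 = rb["b1"], rb["b2"], rb["b3"]
            loads += 3
            rows.append((r1, r2, r3, r4, r5, r6, r7))
    return rows, loads


if __name__ == "__main__":
    a, b = make_tables(rows_in_a=3, matches_per_row=10)
    stack_rows, stack_loads = stack_style_join(a, b)
    reg_rows, reg_loads = register_style_join(a, b)
    print(stack_rows == reg_rows, stack_loads, reg_loads)
```

With 3 rows in a and 10 matches each, both functions return the same 30 rows, but the stack-style loop performs 7 loads per result row while the register-style loop performs 4 per outer row plus 3 per result row. Whether that difference shows up as real speed in the actual VM is exactly the open question the message leaves for later.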