I meant of course the following -- "result" is really ef in the where clause ...
with t(rowid, ef) as ( select distinct rowid, expensive_function(?99, vdata) from some_table ), u(rowid, vdist, oid, flags) as ( select some_table.rowid, ef as vdist, ?1 as oid, flags | (case when ef < ?2 then ?3 else 0 end) from some_table join t on t.rowid == some_table.rowid where (flags & ?3) == 0 and (oid is null and ef < 0.6) or ef < vdist ) update some_table set (vdist, oid, flags) = (select vdist, oid, flags from u where u.rowid = some_table.rowid) where some_table.rowid in (select rowid from u) example, which materializes u, but requires more than one SQL statement, but still only requires you provide parameters to one statement: create temporary table u(rowid integer primary key, vdist, oid, flags); insert into temp.u with t(rowid, ef) as ( select distinct rowid, expensive_function(?99, vdata) from some_table ), u(rowid, vdist, oid, flags) as ( select t.rowid, ef as vdist, ?1 as oid, flags | (case when ef < ?2 then ?3 else 0 end) from some_table join t on t.rowid == some_table.rowid where (flags & ?3) == 0 and (oid is null and ef < 0.6) or ef < vdist ) select rowid, vdist, oid, flags from u; update some_table set (vdist, oid, flags) = (select vdist, oid, flags from temp.u where u.rowid = some_table.rowid) where some_table.rowid in (select rowid from temp.u) drop table temp.u; -- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. >-----Original Message----- >From: Keith Medcalf <kmedc...@dessus.com> >Sent: Friday, 8 November, 2019 13:21 >To: 'SQLite mailing list' <sqlite-users@mailinglists.sqlite.org> >Subject: RE: [sqlite] SQLITE_DETERMINISTIC and custom function >optimization > > >SQLITE_DETERMINISTIC does not mean that the function is only called once >for each unique set of arguments, only that when called with a unique set >of arguments that it returns the same result. This means that if it is a >constant it can be factored out of being called more than once. In your >case, the parameters are not constant so it must be called once per row. > >The optimizer is prone to calculating things more often than it needs to, >and is difficult to force to "materialize" things. Since your expensive >function needs to be calculated for every row of the table anyway, it >would be better to just create a table that has it calculated once, then >compute the updates table, then perform the update, then get rid of the >extra tables. You could do this in one statement if anyone could figure >out how to get the optimizer to "materialize" table u in the following >query (table t is materialized by the distinct, which does nothing other >than force t to be materialized -- u is not so easy). > >with t(rowid, ef) > as ( > select distinct rowid, > expensive_function(?99, vdata) > from some_table > ), > u(rowid, vdist, oid, flags) > as ( > select some_table.rowid, > ef as vdist, > ?1 as oid, > flags | (case when ef < ?2 then ?3 else 0 end) > from some_table > join t > on t.rowid == some_table.rowid > where (flags & ?3) == 0 > and (oid is null and ef < 0.6) > or result < vdist > ) >update some_table > set (vdist, oid, flags) = (select vdist, oid, flags > from u > where u.rowid = some_table.rowid) > where some_table.rowid in (select rowid from u) > > >example, which materializes u, but requires more than one SQL statement, >but still only requires you provide parameters to one statement: > >create temporary table u(rowid integer primary key, vdist, oid, flags); >insert into temp.u >with t(rowid, ef) > as ( > select distinct rowid, > expensive_function(?99, vdata) > from some_table > ), > u(rowid, vdist, oid, flags) > as ( > select t.rowid, > ef as vdist, > ?1 as oid, > flags | (case when ef < ?2 then ?3 else 0 end) > from some_table > join t > on t.rowid == some_table.rowid > where (flags & ?3) == 0 > and (oid is null and ef < 0.6) > or result < vdist > ) >select rowid, vdist, oid, flags > from u; >update some_table > set (vdist, oid, flags) = (select vdist, oid, flags > from temp.u > where u.rowid = some_table.rowid) > where some_table.rowid in (select rowid from temp.u) >drop table temp.u; > >This will calculate expensive_function only once per row in some_table. >It assumes and requires that "some_table" is a rowid table. > >-- >The fact that there's a Highway to Hell but only a Stairway to Heaven >says a lot about anticipated traffic volume. > >>-----Original Message----- >>From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On >>Behalf Of Mario M. Westphal >>Sent: Friday, 8 November, 2019 12:08 >>To: sqlite-users@mailinglists.sqlite.org >>Subject: [sqlite] SQLITE_DETERMINISTIC and custom function optimization >> >>Hi all, >> >> >> >>I have a table with matrices stored as blobs. Each matrix has about 800 >>bytes. >> >>This table has between 20,000 and 500,000 rows. >> >> >> >>I use a custom function "EXPENSIVE_FUNCTION" which performs a >calculation >>using a matrix supplied via sqlite3_bind_pointer() as ?99 and the matrix >>in >>the vdata column. >> >>I create the function using the SQLITE_DETERMINISTIC flag. My hope was >>that >>the EXPENSIVE_FUNCTION is called only once per row. But that's not the >>case. >> >> >> >>The query looks like this: >> >> >> >>UPDATE some_table >> >> >> >>SET >> >>vdist = EXPENSIVE_FUNCTION(?99,vdata), >> >>oid = ?1, >> >>flags = flags | (CASE WHEN EXPENSIVE_FUNCTION(?99,vdata) < ?2 THEN ?3 >>ELSE 0 >>END) >> >> >> >>WHERE >> >>(flags & ?3) = 0 AND >> >>(oid IS NULL AND EXPENSIVE_FUNCTION(?99,vdata) < 0.6) OR >> >>(EXPENSIVE_FUNCTION(?99,vdata) < vdist) >> >> >> >>The EXPENSIVE_FUNCTION function is referred multiple times in the update >>statement. But it always returns the same result (for any given row). >> >> >> >>My stats report that SQLite calls EXPENSIVE_FUNCTION 91,806 times for a >>table with 45,775 rows. >> >>256 rows are modified. This takes (only) 3.3 seconds. >> >> >> >>The profiler tells me that sqlite3VdbeExec() spends 47% in >>vdbeMemFromBtreeResize and 36% in EXPENSIVE_FUNCTION. >> >> >> >>Can I change something so SQLite calls EXPENSIVE_FUNCTION only once per >>row? >> >> >> >>Thanks in advance. >> >> >> >>-- Mario >> >>_______________________________________________ >>sqlite-users mailing list >>sqlite-users@mailinglists.sqlite.org >>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users