SQLITE_DETERMINISTIC does not mean that the function is only called once for 
each unique set of arguments, only that when called with a unique set of 
arguments that it returns the same result.  This means that if it is a constant 
it can be factored out of being called more than once.  In your case, the 
parameters are not constant so it must be called once per row.

The optimizer is prone to calculating things more often than it needs to, and 
is difficult to force to "materialize" things.  Since your expensive function 
needs to be calculated for every row of the table anyway, it would be better to 
just create a table that has it calculated once, then compute the updates 
table, then perform the update, then get rid of the extra tables.  You could do 
this in one statement if anyone could figure out how to get the optimizer to 
"materialize" table u in the following query (table t is materialized by the 
distinct, which does nothing other than force t to be materialized -- u is not 
so easy).

with t(rowid, ef)
  as (
      select distinct rowid,
                      expensive_function(?99, vdata)
        from some_table
     ),
     u(rowid, vdist, oid, flags)
  as (
      select some_table.rowid,
             ef as vdist,
             ?1 as oid,
             flags | (case when ef < ?2 then ?3 else 0 end)
        from some_table
        join t 
          on t.rowid == some_table.rowid
       where (flags & ?3) == 0
         and (oid is null and ef < 0.6)
          or result < vdist
     )
update some_table
   set (vdist, oid, flags) = (select vdist, oid, flags
                                from u
                               where u.rowid = some_table.rowid)
  where some_table.rowid in (select rowid from u)


example, which materializes u, but requires more than one SQL statement, but 
still only requires you provide parameters to one statement:

create temporary table u(rowid integer primary key, vdist, oid, flags);
insert into temp.u
with t(rowid, ef)
  as (
      select distinct rowid,
                      expensive_function(?99, vdata)
        from some_table
     ),
     u(rowid, vdist, oid, flags)
  as (
      select t.rowid,
             ef as vdist,
             ?1 as oid,
             flags | (case when ef < ?2 then ?3 else 0 end)
        from some_table
        join t 
          on t.rowid == some_table.rowid
       where (flags & ?3) == 0
         and (oid is null and ef < 0.6)
          or result < vdist
     )
select rowid, vdist, oid, flags
  from u;
update some_table
   set (vdist, oid, flags) = (select vdist, oid, flags
                                from temp.u
                               where u.rowid = some_table.rowid)
  where some_table.rowid in (select rowid from temp.u)
drop table temp.u;

This will calculate expensive_function only once per row in some_table.  It 
assumes and requires that "some_table" is a rowid table.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On
>Behalf Of Mario M. Westphal
>Sent: Friday, 8 November, 2019 12:08
>To: sqlite-users@mailinglists.sqlite.org
>Subject: [sqlite] SQLITE_DETERMINISTIC and custom function optimization
>
>Hi all,
>
>
>
>I have a table with matrices stored as blobs. Each matrix has about 800
>bytes.
>
>This table has between 20,000 and 500,000 rows.
>
>
>
>I use a custom function "EXPENSIVE_FUNCTION" which performs a calculation
>using a matrix supplied via sqlite3_bind_pointer() as ?99 and the matrix
>in
>the vdata column.
>
>I create the function using the SQLITE_DETERMINISTIC flag. My hope was
>that
>the EXPENSIVE_FUNCTION is called only once per row. But that's not the
>case.
>
>
>
>The query looks like this:
>
>
>
>UPDATE some_table
>
>
>
>SET
>
>vdist = EXPENSIVE_FUNCTION(?99,vdata),
>
>oid = ?1,
>
>flags = flags | (CASE WHEN EXPENSIVE_FUNCTION(?99,vdata) < ?2 THEN ?3
>ELSE 0
>END)
>
>
>
>WHERE
>
>(flags & ?3) = 0  AND
>
>(oid IS NULL AND EXPENSIVE_FUNCTION(?99,vdata) < 0.6) OR
>
>(EXPENSIVE_FUNCTION(?99,vdata) < vdist)
>
>
>
>The EXPENSIVE_FUNCTION function is referred multiple times in the update
>statement. But it always returns the same result (for any given row).
>
>
>
>My stats report that SQLite calls EXPENSIVE_FUNCTION 91,806 times for a
>table with 45,775 rows.
>
>256 rows are modified. This takes (only) 3.3 seconds.
>
>
>
>The profiler tells me that sqlite3VdbeExec() spends 47% in
>vdbeMemFromBtreeResize and 36% in EXPENSIVE_FUNCTION.
>
>
>
>Can I change something so SQLite calls EXPENSIVE_FUNCTION only once per
>row?
>
>
>
>Thanks in advance.
>
>
>
>-- Mario
>
>_______________________________________________
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to