I meant of course the following -- "result" is really ef in the where clause ...
with t(rowid, ef)
as (
select distinct rowid,
expensive_function(?99, vdata)
from some_table
),
u(rowid, vdist, oid, flags)
as (
select some_table.rowid,
ef as vdist,
?1 as oid,
flags | (case when ef < ?2 then ?3 else 0 end)
from some_table
join t
on t.rowid == some_table.rowid
where (flags & ?3) == 0
and (oid is null and ef < 0.6)
or ef < vdist
)
update some_table
set (vdist, oid, flags) = (select vdist, oid, flags
from u
where u.rowid = some_table.rowid)
where some_table.rowid in (select rowid from u)
example, which materializes u, but requires more than one SQL statement, but
still only requires you provide parameters to one statement:
create temporary table u(rowid integer primary key, vdist, oid, flags);
insert into temp.u
with t(rowid, ef)
as (
select distinct rowid,
expensive_function(?99, vdata)
from some_table
),
u(rowid, vdist, oid, flags)
as (
select t.rowid,
ef as vdist,
?1 as oid,
flags | (case when ef < ?2 then ?3 else 0 end)
from some_table
join t
on t.rowid == some_table.rowid
where (flags & ?3) == 0
and (oid is null and ef < 0.6)
or ef < vdist
)
select rowid, vdist, oid, flags
from u;
update some_table
set (vdist, oid, flags) = (select vdist, oid, flags
from temp.u
where u.rowid = some_table.rowid)
where some_table.rowid in (select rowid from temp.u)
drop table temp.u;
--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a
lot about anticipated traffic volume.
>-----Original Message-----
>From: Keith Medcalf <[email protected]>
>Sent: Friday, 8 November, 2019 13:21
>To: 'SQLite mailing list' <[email protected]>
>Subject: RE: [sqlite] SQLITE_DETERMINISTIC and custom function
>optimization
>
>
>SQLITE_DETERMINISTIC does not mean that the function is only called once
>for each unique set of arguments, only that when called with a unique set
>of arguments that it returns the same result. This means that if it is a
>constant it can be factored out of being called more than once. In your
>case, the parameters are not constant so it must be called once per row.
>
>The optimizer is prone to calculating things more often than it needs to,
>and is difficult to force to "materialize" things. Since your expensive
>function needs to be calculated for every row of the table anyway, it
>would be better to just create a table that has it calculated once, then
>compute the updates table, then perform the update, then get rid of the
>extra tables. You could do this in one statement if anyone could figure
>out how to get the optimizer to "materialize" table u in the following
>query (table t is materialized by the distinct, which does nothing other
>than force t to be materialized -- u is not so easy).
>
>with t(rowid, ef)
> as (
> select distinct rowid,
> expensive_function(?99, vdata)
> from some_table
> ),
> u(rowid, vdist, oid, flags)
> as (
> select some_table.rowid,
> ef as vdist,
> ?1 as oid,
> flags | (case when ef < ?2 then ?3 else 0 end)
> from some_table
> join t
> on t.rowid == some_table.rowid
> where (flags & ?3) == 0
> and (oid is null and ef < 0.6)
> or result < vdist
> )
>update some_table
> set (vdist, oid, flags) = (select vdist, oid, flags
> from u
> where u.rowid = some_table.rowid)
> where some_table.rowid in (select rowid from u)
>
>
>example, which materializes u, but requires more than one SQL statement,
>but still only requires you provide parameters to one statement:
>
>create temporary table u(rowid integer primary key, vdist, oid, flags);
>insert into temp.u
>with t(rowid, ef)
> as (
> select distinct rowid,
> expensive_function(?99, vdata)
> from some_table
> ),
> u(rowid, vdist, oid, flags)
> as (
> select t.rowid,
> ef as vdist,
> ?1 as oid,
> flags | (case when ef < ?2 then ?3 else 0 end)
> from some_table
> join t
> on t.rowid == some_table.rowid
> where (flags & ?3) == 0
> and (oid is null and ef < 0.6)
> or result < vdist
> )
>select rowid, vdist, oid, flags
> from u;
>update some_table
> set (vdist, oid, flags) = (select vdist, oid, flags
> from temp.u
> where u.rowid = some_table.rowid)
> where some_table.rowid in (select rowid from temp.u)
>drop table temp.u;
>
>This will calculate expensive_function only once per row in some_table.
>It assumes and requires that "some_table" is a rowid table.
>
>--
>The fact that there's a Highway to Hell but only a Stairway to Heaven
>says a lot about anticipated traffic volume.
>
>>-----Original Message-----
>>From: sqlite-users <[email protected]> On
>>Behalf Of Mario M. Westphal
>>Sent: Friday, 8 November, 2019 12:08
>>To: [email protected]
>>Subject: [sqlite] SQLITE_DETERMINISTIC and custom function optimization
>>
>>Hi all,
>>
>>
>>
>>I have a table with matrices stored as blobs. Each matrix has about 800
>>bytes.
>>
>>This table has between 20,000 and 500,000 rows.
>>
>>
>>
>>I use a custom function "EXPENSIVE_FUNCTION" which performs a
>calculation
>>using a matrix supplied via sqlite3_bind_pointer() as ?99 and the matrix
>>in
>>the vdata column.
>>
>>I create the function using the SQLITE_DETERMINISTIC flag. My hope was
>>that
>>the EXPENSIVE_FUNCTION is called only once per row. But that's not the
>>case.
>>
>>
>>
>>The query looks like this:
>>
>>
>>
>>UPDATE some_table
>>
>>
>>
>>SET
>>
>>vdist = EXPENSIVE_FUNCTION(?99,vdata),
>>
>>oid = ?1,
>>
>>flags = flags | (CASE WHEN EXPENSIVE_FUNCTION(?99,vdata) < ?2 THEN ?3
>>ELSE 0
>>END)
>>
>>
>>
>>WHERE
>>
>>(flags & ?3) = 0 AND
>>
>>(oid IS NULL AND EXPENSIVE_FUNCTION(?99,vdata) < 0.6) OR
>>
>>(EXPENSIVE_FUNCTION(?99,vdata) < vdist)
>>
>>
>>
>>The EXPENSIVE_FUNCTION function is referred multiple times in the update
>>statement. But it always returns the same result (for any given row).
>>
>>
>>
>>My stats report that SQLite calls EXPENSIVE_FUNCTION 91,806 times for a
>>table with 45,775 rows.
>>
>>256 rows are modified. This takes (only) 3.3 seconds.
>>
>>
>>
>>The profiler tells me that sqlite3VdbeExec() spends 47% in
>>vdbeMemFromBtreeResize and 36% in EXPENSIVE_FUNCTION.
>>
>>
>>
>>Can I change something so SQLite calls EXPENSIVE_FUNCTION only once per
>>row?
>>
>>
>>
>>Thanks in advance.
>>
>>
>>
>>-- Mario
>>
>>_______________________________________________
>>sqlite-users mailing list
>>[email protected]
>>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users