On 19/06/2013 1:41 AM, jhnlmn wrote:
Thank you for your response

Simon Slavin <slavins@...> writes:
UPDATE T SET C1 = calculation(C2) WHERE C1 IS NULL AND rowid < 10000
This is the best solution when the table is freshly created
and max(rowid) == number of rows.
But after many deletes and inserts, many rowids will be unused;
max(rowid) may grow without bound, so the number of required updates is
also unbounded.

So, no perfect solution so far.
Why not use a trigger on T to update C1 whenever it gets set to NULL? Is it actually important for the NULL to persist until your next batch update comes along?

If a trigger would work, that's by far the cleanest solution, and will have almost no impact on concurrency. Otherwise...
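To make the trigger idea concrete, here is a minimal sketch in Python (matching the code below). `calculation` here is a stand-in for your application function, registered with create_function so SQL can call it; table and column names are taken from your example:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('create table T (C1, C2)')

# placeholder for your real calculation(); must be registered on every
# connection that writes to T, or the trigger will raise an error
conn.create_function('calculation', 1, lambda x: x * 2)

# fill C1 as soon as a row arrives with C1 NULL...
conn.execute('''
    create trigger T_c1_ins after insert on T
    when new.C1 is NULL
    begin
        update T set C1 = calculation(new.C2) where rowid = new.rowid;
    end''')
# ...and whenever C1 is later set back to NULL
conn.execute('''
    create trigger T_c1_upd after update of C1 on T
    when new.C1 is NULL
    begin
        update T set C1 = calculation(new.C2) where rowid = new.rowid;
    end''')

conn.execute('insert into T (C2) values (21)')
print(conn.execute('select C1 from T').fetchone()[0])  # 42
```

One caveat: because the trigger lives in the schema but `calculation` lives in the application, any other tool that writes to T without registering the function will get an error.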

Assuming you want to process between 10k and 20k rows at a time, you might try something like this (written in Python; you should be able to translate it easily to the language you actually use):

import sqlite3

conn = sqlite3.connect('my-database.db')
histo = list(conn.execute(
    'select min(rowid) lo, max(rowid) hi, count(*) n '
    'from T group by rowid/10000 order by lo'))

# merge adjacent 10k-wide rowid slices until each bucket holds >= 10000 rows
buckets = [(0, 0, 0)]
for lo, hi, n in histo:
    a, b, m = buckets[-1]
    # an empty bucket starts at the current slice's low bound
    buckets[-1] = (lo if m == 0 else a, hi, m + n)
    if m + n >= 10000:
        buckets.append((0, 0, 0))

for lo, hi, n in buckets:
    if n > 0:
        conn.execute('update T set C1=calculation(C2) '
                     'where C1 is NULL and rowid between ? and ?',
                     (lo, hi))
conn.commit()

Translated into plain English: count the number of rows in each slice of 10k, being sure to return slices in order. That query will be fast because it should read from your rowid index and has small output size (even a billion-row input will only produce 100k rows). Merge too-small slices so that each contains somewhere between 10k and 20k-1 rows, then run your update query, passing the lower and upper bound of each slice to limit where it looks. Again, the index on rowid will enforce the range limit without a table scan (but double-check the output of "explain query plan").
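Checking the plan is cheap to do from the same script. A minimal sketch (in-memory database and a throwaway table standing in for yours; the update's scalar function is omitted so the statement parses without it):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('create table T (C1, C2)')

# ask SQLite how it would execute the range-limited scan; the detail
# column should say SEARCH (rowid b-tree lookup), not SCAN (full table)
plan = list(conn.execute(
    'explain query plan '
    'select * from T where C1 is NULL and rowid between ? and ?',
    (1, 10000)))
for row in plan:
    print(row[-1])
```

If the last column reports a SCAN instead of a SEARCH on the rowid, the range constraint is not being used and each batch would walk the whole table.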

NOTE: by splitting the transaction, you risk the database changing out from under you before the last slice is done. Somebody *could* delete a whole chunk of the rowid space, for example, and throw off your carefully computed slice sizes. Or they could add rows after you make the slices, and those rows would be ignored. Or they could set C1=NULL on rows you already looked at. Most likely, you can squint and claim that all those kinds of things just happened "after" the batch update, but whether that's allowed is application dependent.

Ryan

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
