On 21/06/2013 2:55 AM, jhnlmn wrote:
Ryan Johnson <ryan.johnson@...> writes:
histo = list(conn.execute('select min(rowid) lo, max(rowid) hi, count(*)
n from lineitem group by rowid/10000' order by lo))
...
     a,b,n = buckets[-1]

Thank you for your reply.
You code is not very correct ("a" always remain 0),
but I got your idea (and tested it).
Bucket creation adds about 50% to the time of update, which is acceptable.

By the way, I also posted my question at
http://stackoverflow.com/questions/17099491/sqlite3-how-to-interrupt-a-long-running-update-without-roll-back

So, if you care, you may wish to post your "bucket" solution there as well,
in case somebody will need it.

Why not use a trigger on T to update C1 whenever it gets set to NULL?
My case is actually simpler: C1 is NULL only on newly inserted rows
(I should have said this from the beginning).

So, I just realized that if I enable AUTOINCREMENT feature,
then I do not need to process entire rowid range,
but only range of rows added after the last update.
This range should not have many holes and I can process this range
10,000 rowids at a time.
This will work until rowid will reach the largest possible integer.
This sounds like a good time to re-examine what end goal you're really after here (rather than how to implement this particular way of achieving that goal). Some questions worth asking yourself might include:

Q1: Is C1 *always* NULL in a newly-inserted row, or does the application sometimes insert some arbitrary value?

Q2: Does the transition from NULL to calculation(C2) mean something special to the application?

Scenario 1: C1=calculation(C2) is a constraint.
Solution: mediate everything through a view:

create table T_data(..., C2, ...); -- C1 is conspicuously absent here
create view T as select *, calculation(C2) C1 from T_data;
create trigger T_update instead of update on T begin update T_data set ..., C2=new.C2, ... where rowid = new.rowid;
end; -- need similar "instead of insert" as well

Scenario 2: C1=calculation(C2) is the default to use when the user fails to supply C1; the initial C1=NULL is a workaround for inability to express "DEFAULT calculation(C1)". Alternatively, C1 is a constraint, but calculation() is expensive enough we don't want to recompute it.
Solution: use an after-insert trigger:

create trigger tc1 after insert on T when new.c1 is NULL begin update T set c1=calculation(c2) where rowid=new.rowid; end;

Scenario 3: C1=NULL has a specific meaning in your application, and it matters when C1 transitions from NULL to calculation(C2). Solution: your AUTOINCREMENT + batch update is probably the best choice. The timing of the batch update almost certainly should depend on something other than number of rows inserted so far; otherwise a trigger could run the batch whenever a large enough rowid is created, which would imply that a normal after update trigger would work. I strongly suspect this is either *not* your actual scenario, or you have a lurking bug: if NULL C1 really matters, and it's important for the the transition away from NULL not to happen right away, that suggests something must happen in the system "promote" "old-enough" NULL C1 to non-NULL status. However, that batch, whatever the trigger, could catch a just-barely-inserted NULL C1 and change it to C2 before any reader notices it was even there. If that is OK, then there's a very good chance that you actually have Scenario 2.

Thoughts?
Ryan


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to