On 21/06/2013 2:55 AM, jhnlmn wrote:
Ryan Johnson <ryan.johnson@...> writes:
histo = list(conn.execute('select min(rowid) lo, max(rowid) hi, count(*)
n from lineitem group by rowid/10000' order by lo))
...
a,b,n = buckets[-1]
Thank you for your reply.
You code is not very correct ("a" always remain 0),
but I got your idea (and tested it).
Bucket creation adds about 50% to the time of update, which is acceptable.
By the way, I also posted my question at
http://stackoverflow.com/questions/17099491/sqlite3-how-to-interrupt-a-long-running-update-without-roll-back
So, if you care, you may wish to post your "bucket" solution there as well,
in case somebody will need it.
Why not use a trigger on T to update C1 whenever it gets set to NULL?
My case is actually simpler: C1 is NULL only on newly inserted rows
(I should have said this from the beginning).
So, I just realized that if I enable AUTOINCREMENT feature,
then I do not need to process entire rowid range,
but only range of rows added after the last update.
This range should not have many holes and I can process this range
10,000 rowids at a time.
This will work until rowid will reach the largest possible integer.
This sounds like a good time to re-examine what end goal you're really
after here (rather than how to implement this particular way of
achieving that goal). Some questions worth asking yourself might include:
Q1: Is C1 *always* NULL in a newly-inserted row, or does the application
sometimes insert some arbitrary value?
Q2: Does the transition from NULL to calculation(C2) mean something
special to the application?
Scenario 1: C1=calculation(C2) is a constraint.
Solution: mediate everything through a view:
create table T_data(..., C2, ...); -- C1 is conspicuously absent here
create view T as select *, calculation(C2) C1 from T_data;
create trigger T_update instead of update on T begin update T_data
set ..., C2=new.C2, ... where rowid = new.rowid;
end; -- need similar "instead of insert" as well
Scenario 2: C1=calculation(C2) is the default to use when the user fails
to supply C1; the initial C1=NULL is a workaround for inability to
express "DEFAULT calculation(C1)". Alternatively, C1 is a constraint,
but calculation() is expensive enough we don't want to recompute it.
Solution: use an after-insert trigger:
create trigger tc1 after insert on T when new.c1 is NULL begin update T
set c1=calculation(c2) where rowid=new.rowid; end;
Scenario 3: C1=NULL has a specific meaning in your application, and it
matters when C1 transitions from NULL to calculation(C2).
Solution: your AUTOINCREMENT + batch update is probably the best choice.
The timing of the batch update almost certainly should depend on
something other than number of rows inserted so far; otherwise a trigger
could run the batch whenever a large enough rowid is created, which
would imply that a normal after update trigger would work. I strongly
suspect this is either *not* your actual scenario, or you have a lurking
bug: if NULL C1 really matters, and it's important for the the
transition away from NULL not to happen right away, that suggests
something must happen in the system "promote" "old-enough" NULL C1 to
non-NULL status. However, that batch, whatever the trigger, could catch
a just-barely-inserted NULL C1 and change it to C2 before any reader
notices it was even there. If that is OK, then there's a very good
chance that you actually have Scenario 2.
Thoughts?
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users