Maybe I'm confused, but perhaps you don't understand the pseudo-code, or I don't understand your problem. The idea is to process in batches instead of all at once. I don't see where you came up with 1,000,000 inserts, as though the pseudo-code were generating extra INSERT statements for you. If you made N=1 it would effectively be the same as running without BEGIN/COMMIT.

1 to 999 INSERTs with N=1000 would result in one BEGIN/COMMIT sequence.
1000 INSERTs would result in two BEGIN/COMMIT sequences (the 2nd BEGIN/COMMIT would actually have 0 records to commit, as you would be right on the boundary).
2000 INSERTs would have 3 BEGIN/COMMIT sequences.
3000 INSERTs would have 4 BEGIN/COMMIT sequences.

It doesn't matter how many inserts you do... the pseudo-code logic just does a new COMMIT/BEGIN on every 1000 inserts. You adjust N to make the system responsive to your users -- the smaller the N, the faster the response time for competing queries.

Am I missing something? Do you not know what BEGIN/COMMIT is? Let me rephrase using your pseudo-code --- this will commit every 1000 INSERTs:

    count=0
    batch=1000
    BEGIN
    while (have data) {
        process next piece of data
        count++
        if ((count % batch)==0) {
            COMMIT
            BEGIN
        }
    }
    COMMIT
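[Editor's note: the batching loop above can be sketched as a runnable Python program using the stdlib sqlite3 module. The table name `t` and the row source are hypothetical; `isolation_level=None` is used so BEGIN/COMMIT can be issued manually, matching the pseudo-code.]

```python
import sqlite3

def insert_batched(conn, rows, batch=1000):
    # Commit every `batch` INSERTs so the write lock is released
    # periodically and competing readers stay responsive.
    conn.execute("BEGIN")
    count = 0
    for row in rows:
        conn.execute("INSERT INTO t(x) VALUES (?)", (row,))
        count += 1
        if count % batch == 0:
            conn.execute("COMMIT")   # end of this batch
            conn.execute("BEGIN")    # start the next one
    conn.execute("COMMIT")           # flush the final (possibly empty) batch

# isolation_level=None disables the module's implicit transaction
# handling, giving us manual BEGIN/COMMIT control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t(x)")
insert_batched(conn, range(2500), batch=1000)
print(conn.execute("SELECT count(*) FROM t").fetchone()[0])  # 2500
```

With 2500 rows and batch=1000 this issues three COMMITs: two on the batch boundaries and one final flush, exactly as described above.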
Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

________________________________

From: sqlite-users-boun...@sqlite.org on behalf of Igor Sereda
Sent: Thu 5/27/2010 6:29 AM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Using cache stats to balance transaction size for optimal performance

Michael,

Thank you for your suggestion! The problem with this approach is that N would not be a constant that we could tune. As I mentioned, the amount of updates may vary depending on the data received. For example, one piece of data may lead to a single INSERT, so it would be safe and effective to have N=1000, for example. Another piece of data may lead to 1000 INSERTs. Now, if we still have N=1000, then we'll have 1,000,000 INSERTs in a single transaction. It's completely unpredictable. And when the amount of data changed in a single transaction is large enough, it would cause either a cache spill and an exclusive lock on the database, or growth of the cache and memory consumption.

Do you think this makes sense?

We could theoretically count the number of DML statements or steps, but this would imply changing the underlying architecture of the application, so that any plug-in or extension that accesses SQLite also reports how much data it changed. It's not very convenient.

Kind regards,
Igor

Black, Michael (IS) wrote:
>
> So only do N many records in one batch. That's the easiest thing. Forget
> about the cache and just use responsiveness to adjust how many records you
> allow at once.
>
> Pseudo-code:
> recnum=0
> BEGIN;
> while more records
>     INSERT....
>     recnum++
>     if (recnum % 1000 == 0)
>         COMMIT;
>         BEGIN;
> end
> COMMIT;
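[Editor's note: one way to bound transactions by actual row changes rather than by pieces of data -- without instrumenting every plug-in, which is Igor's objection above -- is SQLite's own change counter, sqlite3_total_changes(), exposed in Python as Connection.total_changes. This is a hedged sketch, not what the thread settled on; the table `t` and the "pieces" workload are hypothetical.]

```python
import sqlite3

def process_pieces(conn, pieces, max_changes=1000):
    # Commit whenever the connection has accumulated max_changes row
    # changes, no matter how many INSERTs each piece produced.
    # conn.total_changes is maintained by SQLite itself, so INSERTs
    # made by plug-in code sharing this connection are counted too.
    conn.execute("BEGIN")
    last = conn.total_changes
    for piece in pieces:
        for value in piece:          # a "piece" may yield 1 row or 1000
            conn.execute("INSERT INTO t(x) VALUES (?)", (value,))
        if conn.total_changes - last >= max_changes:
            conn.execute("COMMIT")
            conn.execute("BEGIN")
            last = conn.total_changes
    conn.execute("COMMIT")

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t(x)")
# Mixed workload: some pieces insert 1 row, others 1000 or 500.
pieces = [[1], list(range(1000)), [2], list(range(500))]
process_pieces(conn, pieces)
print(conn.execute("SELECT count(*) FROM t").fetchone()[0])  # 1502
```

The boundary check runs between pieces, so a single huge piece still lands in one transaction, but N=1000 pieces can no longer compound into a million-row transaction.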
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users