Re: [sqlite] quickly insert multiple rows into multiple table
W.-H. Gu wrote:

> For (1), after I wrapped the inserts in a transaction, I face an issue:
> the CPU usage is too high. I think the reason is that I use a prepared
> statement to insert one row at a time; it then calls sqlite3_step
> thousands of times every second.

If I'm reading this right, you're saying you're re-preparing the statement
on every row. Don't do that. Prepare it once, and only re-bind the
variables for each insert. Then finalize the statement after your loop.
I gather that statement preparation is a non-trivial percentage of the
compute time taken to insert a single row.

> Every second I insert 9184 rows,

Your optimal insertion rate will vary strongly depending on your hardware,
your operating system, the efficiency of data generation between calls to
the SQLite API, the amount of data you're inserting per row, the amount of
RAM SQLite is allowed to use, any indices, triggers, CHECK constraints,
UNIQUE constraints, primary keys, and foreign keys you have defined,
compile-time flags, and dozens of other variables. I'm no SQL or SQLite
expert myself -- the gurus could add many items to that list off the tops
of their heads. We have no way of knowing what number you should be
shooting for.

> which leads to cpu usage ~30%.
>
> Is there any good way to resolve this issue?

Obviously there are many variables here. Why do you think you want low CPU
usage? Generally, I think you *want* to be CPU-bound for the duration of
your insert batch. Anything else implies you're doing something besides
useful computation: waiting on disk I/O (due to a cache spill or one of
many other possible reasons), sitting scheduled behind some other process,
and so on. At commit time, SQLite writes the new rows back to disk, by
default blocking until the OS tells SQLite that the write is complete
(though this is configurable). During that window you expect CPU usage to
go down while disk usage goes up.
Another point is that you need to make sure the OS is really telling you
what you think it's telling you. E.g. a common mistake for a Linux user is
to think that the load average is the average CPU usage over some period
of time. It isn't.

Eric
--
Eric A. Smith

    One of my most productive days was throwing away 1000 lines of code.
        -- Ken Thompson

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] quickly insert multiple rows into multiple table
Hi,

For (1), after I wrapped the inserts in a transaction, I face an issue:
the CPU usage is too high. I think the reason is that I use a prepared
statement to insert one row at a time; it then calls sqlite3_step
thousands of times every second. Every second I insert 9184 rows, which
leads to CPU usage of ~30%. Is there any good way to resolve this issue?

Thanks,
WeiHsin

On Mon, Jul 12, 2010 at 10:48 AM, Kees Nuyt wrote:
> On Mon, 12 Jul 2010 09:29:35 -0700, "W.-H. Gu" wrote:
>
> > Hi,
> >
> > I have 5 tables. Every second I want to insert 1 row into the main
> > table, and multiple rows into every other table. What I do right now
> > is to prepare 5 statements for those 5 tables, where the statement for
> > multiple-row insertion is like
> >
> >   INSERT INTO MyTable (FirstCol, SecondCol)
> >   SELECT 'First', 1
> >   UNION ALL
> >   SELECT 'Second', 2
> >   UNION ALL
> >   SELECT 'Third', 3
> >   UNION ALL
> >   SELECT 'Fourth', 4
> >   UNION ALL
> >   SELECT 'Fifth', 5
> >
> > My questions are:
> >
> > (1) Is the above statement for multiple-row insertion efficient? What
> > is the best approach?
>
> That is a very complicated way to combine multiple insert statements.
>
> Just wrap the INSERTs (several thousands of them is not a problem) into
> a transaction:
>
>   BEGIN EXCLUSIVE TRANSACTION;
>   INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('First', 1);
>   INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('Second', 2);
>   INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('Third', 3);
>   INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('Fourth', 4);
>   INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('Fifth', 5);
>   COMMIT;
>
> > (2) Do I need to bind parameters every time when inserting? Or is
> > there a way to bind pointers so that I just need to do sqlite3_step
> > every time when I insert rows?
>
> I think you have to bind. You make it sound like a lot of work, but you
> only have to write the code once.
>
> > (3) Is there a way to speed up the whole process? For example, disable
> > index management first and enable it again after bulk insert, or one
> > prepared statement for multiple-row, multiple-table insertion, etc.?
>
> a) Normalize redundancy out of the schema.
> b) Wrap many inserts in a transaction.
> c) Before a bulk load, you can indeed DROP indexes and CREATE them again
>    later to get some speed advantage.
> d) Sort the data in primary key order before inserting.
> e) Give the database as much cache as you can (but not more than the
>    estimated database size).
> f) Fast hardware (especially disks).
>
> I think there are a few more suggestions in the wiki on
> http://www.sqlite.org
> --
> ( Kees Nuyt
> )
> c[_]
Re: [sqlite] quickly insert multiple rows into multiple table
On Mon, 12 Jul 2010 09:29:35 -0700, "W.-H. Gu" wrote:

> Hi,
>
> I have 5 tables. Every second I want to insert 1 row into the main
> table, and multiple rows into every other table. What I do right now is
> to prepare 5 statements for those 5 tables, where the statement for
> multiple-row insertion is like
>
>   INSERT INTO MyTable (FirstCol, SecondCol)
>   SELECT 'First', 1
>   UNION ALL
>   SELECT 'Second', 2
>   UNION ALL
>   SELECT 'Third', 3
>   UNION ALL
>   SELECT 'Fourth', 4
>   UNION ALL
>   SELECT 'Fifth', 5
>
> My questions are:
>
> (1) Is the above statement for multiple-row insertion efficient? What
> is the best approach?

That is a very complicated way to combine multiple insert statements.

Just wrap the INSERTs (several thousands of them is not a problem) into a
transaction:

  BEGIN EXCLUSIVE TRANSACTION;
  INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('First', 1);
  INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('Second', 2);
  INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('Third', 3);
  INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('Fourth', 4);
  INSERT INTO MyTable (FirstCol, SecondCol) VALUES ('Fifth', 5);
  COMMIT;

> (2) Do I need to bind parameters every time when inserting? Or is there
> a way to bind pointers so that I just need to do sqlite3_step every
> time when I insert rows?

I think you have to bind. You make it sound like a lot of work, but you
only have to write the code once.

> (3) Is there a way to speed up the whole process? For example, disable
> index management first and enable it again after bulk insert, or one
> prepared statement for multiple-row, multiple-table insertion, etc.?

a) Normalize redundancy out of the schema.
b) Wrap many inserts in a transaction.
c) Before a bulk load, you can indeed DROP indexes and CREATE them again
   later to get some speed advantage.
d) Sort the data in primary key order before inserting.
e) Give the database as much cache as you can (but not more than the
   estimated database size).
f) Fast hardware (especially disks).

I think there are a few more suggestions in the wiki on
http://www.sqlite.org
--
( Kees Nuyt
)
c[_]