Hi all, I have been trying to implement a couple of things in SQLite because I only need it for myself (so no concurrency issues here).
I am working on Arch Linux (uname: 2.6.28-ARCH #1 SMP PREEMPT Sun Mar 8 10:18:28 UTC 2009 i686, Intel(R) Xeon(TM) CPU 3.40GHz). I have a flat text file that I want to put into SQLite so that I can query it and also update fields if required. The raw text file is 7.2 GB, with roughly 11 million entries of 60 fields each. (Perhaps not the best design, but that shall not be the issue here.)

Getting it into SQLite was not as easy as I anticipated. At first I used the execute() method for every insert and committed every 10,000 rows; this run didn't finish overnight. Then I tried a commit every 100,000 rows: at around 2.5 million entries it slowed down so much that it was unacceptable. PRAGMA journal_mode = OFF improved things a bit, but not enough. The final rescue was to use executemany() for every 100,000 records, combined with a commit(). That put the whole file in the DB in approx. 17 minutes. Creating the required indices took another 40 minutes. The final database file is roughly 11 GB.

Now I have 3 spare fields in the table that I want to populate with data from another file. I loop over the file in question; from it I get the ID of the record to update and the info to populate the empty field with. Again I used executemany() every 100,000 records, assuming that it would work. But no. Even when I use executemany() every 10,000 rows, those 10,000 updates take around 5 minutes or so (I haven't properly timed it). That is also with PRAGMA journal_mode = OFF.

When I just take the IDs of the entries that I need to update, look them up in the table (select a,b,c from table where id in (...)), retrieve them with fetchall(), and do nothing with them (no printing, etc.), this works fine for the first 10,000 records. After that, again, it gets so slow that it's basically unusable. The ID field that I use for the lookups has an index on it. The fields that I am updating do not have indices on them.
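In code, the batched pattern I'm describing looks roughly like this. This is only a sketch: I'm assuming the Python sqlite3 module (since I call execute()/executemany()), and the table name, column names, and row counts here are invented for illustration; the real file is parsed line by line from disk.

```python
import sqlite3
from itertools import islice

BATCH = 100_000  # commit granularity that worked for the bulk load

def batches(rows, size):
    """Yield successive lists of up to `size` rows from an iterable."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk

# In-memory DB for illustration; the real 11 GB database lives on disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA journal_mode = OFF")  # skip the rollback journal
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT, spare1 TEXT)")

# Stand-in for parsing the flat file; field values are made up.
source = ((i, "data", None) for i in range(250_000))

# Phase 1: batched INSERTs, one commit per 100,000-row batch.
for chunk in batches(source, BATCH):
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", chunk)
    conn.commit()

# Phase 2: batched UPDATEs of a spare column, keyed on the indexed id.
updates = ((f"note-{i}", i) for i in range(0, 250_000, 2))
for chunk in batches(updates, 10_000):
    conn.executemany("UPDATE t SET spare1 = ? WHERE id = ?", chunk)
    conn.commit()
```

Phase 1 runs in the expected time; it's Phase 2 (and the equivalent SELECT ... WHERE id IN (...) lookups) that degrades on the real data.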
It's just text fields being populated with strings of up to approx. 150 characters. Why are the retrievals that slow, and why do they get slower the further I go? And why are the updates so slow? Is it because the database file is too big?

Any help or suggestions would be highly appreciated!

Regards,
Flo

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users