Re: [sqlite] SQLite performance for 10 Million Records
Joe Wilson wrote: I think some other factor is at play here. SQLite 2.x's memory databases are still twice as fast at batch inserts as either 3.x's disk-based databases or 2.x's disk-based databases when the DB size is less than physical machine memory. Joe, yes, there is another factor at work here. In version 3.x SQLite uses the same pager for both memory- and file-based databases. The file-based pager calls the OS for I/O to the disk file, while SQLite implements its own read/write routines to get and save pages in a memory-based database. In version 2.x the memory-based databases used a completely different pager than the file-based databases. This change was made to simplify the code and to ensure all features work the same regardless of where the database pages are stored, but it has definitely slowed down the memory-based databases. Memory-based databases used to be twice as fast as file-based databases, but they are now slightly slower. I suspect there are optimizations that could be made to the memory I/O routines to speed them up; they should at least be able to run slightly faster than file-based I/O. Dennis Cote
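Dennis's point above (same pager, different page I/O path) is easy to probe empirically. The following is a minimal sketch using Python's bundled sqlite3 module rather than the posters' C harness; the row count and schema are illustrative, and absolute timings will vary by platform and SQLite version:

```python
import os
import sqlite3
import tempfile
import time

def time_batch_insert(path, rows=200_000):
    """Time one batched INSERT transaction against the given database path."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE t (id INTEGER, t TEXT)")
    start = time.perf_counter()
    with conn:  # wrap the whole batch in a single transaction
        conn.executemany("INSERT INTO t VALUES (?, ?)",
                         ((i, "Hello") for i in range(rows)))
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed

mem = time_batch_insert(":memory:")

fd, tmp = tempfile.mkstemp(suffix=".db")
os.close(fd)
os.remove(tmp)          # let SQLite create the file itself
disk = time_batch_insert(tmp)
os.remove(tmp)

print(f"memory: {mem:.3f}s  disk: {disk:.3f}s")
```

On a modern OS the two figures are often surprisingly close, which is exactly the counterintuitive result being discussed in this thread: the "disk" writes mostly land in the OS page cache.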
Re: [sqlite] SQLite performance for 10 Million Records
Dennis Cote wrote: Joe Wilson wrote: I think some other factor is at play here. Yes there is another factor at work here. [...] I suspect there are optimizations that could be made to the memory I/O routines to speed them up; they should at least be able to run slightly faster than file-based I/O. Dennis Cote I still find it rather hard to accept that disk databases are as fast or faster than memory databases, so I ran a *simple* test on WinXP, and they are. It's so counterintuitive. :( I used the Task Manager to watch the sqlite shell run a simple bit of table-stuffing SQL, shown below. For :memory: databases, long periods were observed where the VM size crept up but I/O write bytes did not, followed by periods where I/O bytes increased. For disk I/O, VM size stayed constant at about 4MB and I/O bytes increased steadily throughout. The tests took about 3m5s and 3m47s, so the disk-based database was slower in this case, but not by much. On the face of it, it seems that :memory: databases cause sqlite to spend a lot of time mallocing large amounts of memory by requesting a large number of small chunks, but I wouldn't have expected this to be as slow as disk I/O. Has anyone done a proper profiling analysis? I don't really buy the disk caching suggestion, because if we have a large transaction, then surely we still have to wait an appreciable time after the commit for the data to be sync'd to the disk. If we extrapolate to extremes, are we saying that disk caching makes really slow hard disks and floppies as fast as memory I/O? Seems very odd. Where does the sync time go?
Martin

Noddy SQL follows:

begin;
select datetime('now');
create table t(id INTEGER, t TEXT);
insert into t values (1,'Hello');
insert into t select * from t;  -- repeat this line 24 times; each repeat doubles the table, ending at 2^24 (about 16.8 million) rows
end;
select datetime('now');
Re: [sqlite] SQLite performance for 10 Million Records
Hi, I am really thankful to all the members of this group. The discussion here was very helpful for me and for others. I was not as experienced as the other members who took part in this discussion, but I worked hard and spent a lot of time trying to find out why :memory: mode was slower for me, or sometimes showed no change in the results, or sometimes was only a little faster (maybe one second at most). Now I think I am in a position to stop considering SQLite's memory mode. I did some tests in the past and also showed the results here to you all, in which memory mode was slower than disk mode. Now I did some tests again: 1) disk mode, 2) memory mode, 3) ATTACH DB. These tests were run, as I mentioned earlier, on a Pentium(R) D CPU 3.00GHz, 2.00 GB of RAM, WinXP Professional. For all these tests I was using SQLite 3.3.5, but when I saw that version 3.3.6 has some improvement in :memory: mode I ran the tests again; the results are still not satisfactory. As Dennis Cote, Joe Wilson, and Martin mentioned, there is another factor at work here. I wish to participate in this discussion but I can't, because of my limited knowledge of SQLite's internals. But I hope we find a way to see a clear difference between disk and :memory: mode. Dear Dennis Cote, as you suggested, I changed my design, and it is much better now. Again, really thankful to all the members, but if you think there is still some way to improve, I will be happy to know about it. Regards, TAMIMY.
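TAMIMY mentions an "ATTACH DB" test mode without showing it. For readers unfamiliar with the idea, here is a hedged sketch (table and file names are hypothetical, not the poster's) of attaching a disk database to a :memory: connection and copying a table into RAM, using Python's bundled sqlite3 module:

```python
import os
import sqlite3
import tempfile

# Build a small disk database to stand in for the converted DBF data.
fd, path = tempfile.mkstemp(suffix=".db")
os.close(fd)
disk = sqlite3.connect(path)
disk.execute("CREATE TABLE t (id INTEGER)")
disk.execute("INSERT INTO t VALUES (1)")
disk.commit()
disk.close()

mem = sqlite3.connect(":memory:")
mem.execute("ATTACH DATABASE ? AS src", (path,))
# Copy the disk table into the in-memory database, then query it there.
mem.execute("CREATE TABLE t AS SELECT * FROM src.t")
mem.execute("DETACH DATABASE src")
n = mem.execute("SELECT count(*) FROM t").fetchone()[0]
print(n)  # 1
mem.close()
os.remove(path)
```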
Re: [sqlite] SQLite performance for 10 Million Records
--- Martin Jenkins [EMAIL PROTECTED] wrote: For :memory: databases, long periods were observed where the VM size crept up but I/O write bytes did not, followed by periods where I/O bytes increased. If you use PRAGMA temp_store=MEMORY with your :memory: database you will have no I/O whatsoever:

PRAGMA temp_store=MEMORY;
begin;
select datetime('now');
create table t(id INTEGER, t TEXT);
insert into t values (1,'Hello');
insert into t select * from t;  -- repeat this line 24 times; each repeat doubles the row count
end;
select datetime('now');
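The pragma Joe suggests can be confirmed from code. A small sketch with Python's bundled sqlite3 module (querying a pragma with no argument returns its current value; 2 means MEMORY):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Keep temporary tables and indices in RAM instead of temp files on disk.
conn.execute("PRAGMA temp_store=MEMORY")
mode = conn.execute("PRAGMA temp_store").fetchone()[0]
print(mode)  # 2 == MEMORY
conn.close()
```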
Re: [sqlite] SQLite performance for 10 Million Records
I think some other factor is at play here. SQLite 2.x's memory databases are still twice as fast at batch inserts as either 3.x's disk-based databases or 2.x's disk-based databases when the DB size is less than physical machine memory. I did some experimentation with an SQLite 2.8.17 :memory: database versus a 3.3.6 :memory: database (with PRAGMA temp_store=memory in both cases) and found something worth noting: SQLite 2.8.17 uses over four times as much in-process RAM to perform a batch insert of one million rows as 3.3.6 does:

version   in-process RAM
-------   --------------
2.8.17    268M
3.3.6      64M

So although 3.3.6 takes almost twice the time to populate a memory DB with a single transaction, it uses less than a quarter of the RAM that 2.8.17 required for the same operation. It appears to be a classic memory-for-speed tradeoff. I was not able to perform a single batch insert of 4 million rows into a 2.8.17 :memory: database on my machine due to lack of RAM. No such problem with 3.3.6.
Re: [sqlite] SQLite performance for 10 Million Records
Joe Wilson wrote: --- Martin Jenkins [EMAIL PROTECTED] wrote: For :memory: databases, long periods were observed where the VM size crept up but I/O write bytes did not, followed by periods where I/O bytes increased. If you use PRAGMA temp_store=MEMORY with your :memory: database you will have no I/O whatsoever: So it does. It smoothed the VM size counter rate, but didn't affect the overall run time. Martin
Re: [sqlite] SQLite performance for 10 Million Records
Manzoor Ilahi Tamimy wrote:

    sqlite3_exec(db, "create table t (a integer, b float, c text, d integer, e float, f text, g float, h text)", NULL, NULL, NULL);
    sqlite3_exec(db, "begin transaction", NULL, NULL, NULL);
    sqlite3_prepare(db, "insert into t values (?, ?, ?, ?, ?, ?, ?, ?)", -1, &insert, NULL);
    ..
    I really Need Help.

You certainly do. 1) Your test program is not testing what you want to test, because you have not defined any indexes on the table.

CREATE TABLE HVH (
    Field1 VARCHAR(8), IDC VARCHAR(4), Field3 VARCHAR(2), Field4 VARCHAR(4),
    Field5 VARCHAR(7), Field6 VARCHAR(8), Field7 VARCHAR(1), Field8 FLOAT);
CREATE TABLE ITM (
    IDC VARCHAR(4), ITEMNAME VARCHAR(20),
    COLUMN3 VARCHAR(1), COLUMN4 VARCHAR(1), COLUMN5 VARCHAR(1), COLUMN6 VARCHAR(1),
    COLUMN7 VARCHAR(1), COLUMN8 VARCHAR(1), COLUMN9 VARCHAR(1), COLUMN10 VARCHAR(1),
    COLUMN11 VARCHAR(1), COLUMN12 VARCHAR(1), COLUMN13 VARCHAR(1), COLUMN14 VARCHAR(1),
    COLUMN15 VARCHAR(1), COLUMN16 VARCHAR(1));
CREATE INDEX index1 ON ITM (IDC);
..
Select count(*) from itm, HVH where itm.IDC = HVH.IDC

2) Your original schema is always going to be slow, because you have not defined a suitable index on HVH(IDC). -- Nikki Locke, Trumphurst Ltd. PC & Unix consultancy & programming http://www.trumphurst.com/
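Nikki's point 2 (the missing index on HVH(IDC)) can be demonstrated with a toy version of the schema. A sketch using Python's bundled sqlite3 module; the row counts and values are made up for illustration, and the second index name is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HVH (Field1 TEXT, IDC TEXT);
    CREATE TABLE ITM (IDC TEXT, ITEMNAME TEXT);
    CREATE INDEX index1 ON ITM(IDC);
    -- The missing piece pointed out above: index the join column on HVH too.
    CREATE INDEX index2 ON HVH(IDC);
""")
# 1000 ITM rows: each of the 100 IDC values appears 10 times.
conn.executemany("INSERT INTO ITM VALUES (?, ?)",
                 ((str(i % 100), "item") for i in range(1000)))
# 500 HVH rows: each of the 100 IDC values appears 5 times.
conn.executemany("INSERT INTO HVH VALUES (?, ?)",
                 (("x", str(i % 100)) for i in range(500)))
n = conn.execute(
    "SELECT count(*) FROM ITM, HVH WHERE ITM.IDC = HVH.IDC").fetchone()[0]
print(n)  # 100 values * 10 * 5 matches = 5000
conn.close()
```

With an index on the join column of whichever table SQLite probes for matches, each lookup is a B-tree search instead of a full scan, which is where Nikki expects the dramatic speedup to come from.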
Re: [sqlite] SQLite performance for 10 Million Records
Manzoor Ilahi Tamimy wrote: I have tested my code with the following PRAGMAs and still do not see any change in time:

    sqlite3_exec(db, "PRAGMA temp_store=2", NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA synchronous=0", NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA count_changes=OFF", NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA default_cache_size=65536", NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA cache_size=8192", NULL, NULL, NULL);

Actually I have data in DBF files; each file contains 10, 30, or 50 million records. First I developed an application to convert the data from DBF to a SQLite DB for fast data access. The requirement of the project is like this because I will always receive the data as DBF: I will link my application with another application, first convert the data into a DB, and then run different SELECT queries. What I want is fast insert and fast select; I have to display this huge data in a grid. I am using MFC to develop this application. Now I am thinking of using SQLite in memory mode; maybe it will help me reduce the time. I don't know much about how to use memory mode, because the method I am using takes more time than disk mode. Here is a small application in which I was testing disk mode and memory mode for INSERT and SELECT. Can you guide me on how to use memory mode properly? I have tested this code on a P4, 3 GHz, 2 GB RAM, Windows XP Professional.
    #include "stdafx.h"
    #include <conio.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include "sqlite3.h"

    int main(int argc, char *argv[])
    {
        char *database = "backup.db";
        // char *database = ":memory:";
        sqlite3 *db;
        sqlite3_stmt *insert;
        int samples = 3000000;
        int i;
        time_t bgn, end;
        double t;

        remove(database);
        bgn = time(NULL);
        sqlite3_open(database, &db);
        sqlite3_exec(db, "PRAGMA temp_store=2", NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA synchronous=0", NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA count_changes=OFF", NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA default_cache_size=65536", NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA cache_size=8192", NULL, NULL, NULL);
        sqlite3_exec(db, "create table t (a integer, b float, c text, d integer, e float, f text, g float, h text)", NULL, NULL, NULL);
        sqlite3_exec(db, "begin transaction", NULL, NULL, NULL);
        sqlite3_prepare(db, "insert into t values (?, ?, ?, ?, ?, ?, ?, ?)", -1, &insert, NULL);
        for (i = 0; i < samples; i++) {
            sqlite3_bind_int(insert, 1, 44);
            sqlite3_bind_double(insert, 2, 444.4);
            sqlite3_bind_text(insert, 3, "two hundred", -1, SQLITE_STATIC);
            sqlite3_bind_int(insert, 4, 55);
            sqlite3_bind_double(insert, 5, 5.5);
            sqlite3_bind_text(insert, 6, "two hundred", -1, SQLITE_STATIC);
            sqlite3_bind_double(insert, 7, 66.6);
            sqlite3_bind_text(insert, 8, "two hundred", -1, SQLITE_STATIC);
            sqlite3_step(insert);
            sqlite3_reset(insert);
        }
        /* Insert time check */
        end = time(NULL);
        sqlite3_exec(db, "commit transaction", NULL, NULL, NULL);
        t = difftime(end, bgn);
        printf("Executed %d inserts in %.0f seconds, %.0f inserts/sec\n", samples, t, samples / t);
        getch();
        /* Select time check */
        bgn = time(NULL);
        sqlite3_exec(db, "Select * from t", NULL, NULL, NULL);
        end = time(NULL);
        t = difftime(end, bgn);
        printf("Select in %.0f seconds", t);
        getch();
        sqlite3_finalize(insert);
        sqlite3_close(db);
    }

DISK MODE: 3000000 INSERTS in 31 Seconds (96774 INSERTS/Sec); SELECT * from t: 5 Seconds.
MEMORY MODE: 3000000 INSERTS in 53 Seconds (56604 INSERTS/Sec); SELECT * from t: 5 Seconds.
Can I reduce the time of disk mode, or is this the limit? Why is the time in memory mode higher than disk mode for insertion? For memory mode I am just changing one statement; is this the way to use memory mode? I changed char *database = "backup.db"; into char *database = ":memory:";. I really need help. Thanks in advance. TAMIMY
RE: [sqlite] SQLite performance for 10 Million Records
This is a modified version of the test code I posted to show that there was a small but definite SLOWDOWN when using :memory: databases compared to a database in a file on disk. It seems strange, but it is true. Use a disk file for best speed. If true, this is crazy. Memory is thousands of times faster than disk. Why should SQLite be so sluggish when it comes to its use of memory? Cheers, David. -- *** The Royal Bank of Scotland plc. Registered in Scotland No 90312. Registered Office: 36 St Andrew Square, Edinburgh EH2 2YB. Authorized and regulated by the Financial Services Authority This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer. Internet e-mails are not necessarily secure. The Royal Bank of Scotland plc does not accept responsibility for changes made to this message after it was sent. Whilst all reasonable care has been taken to avoid the transmission of viruses, it is the responsibility of the recipient to ensure that the onward transmission, opening or use of this message and any attachments will not adversely affect its systems or data. No responsibility is accepted by The Royal Bank of Scotland plc in this regard and the recipient should carry out such virus and other checks as it considers appropriate. Visit our websites at: http://www.rbos.com http://www.rbsmarkets.com ***
Re: [sqlite] SQLite performance for 10 Million Records
At 03:09 23/06/2006, you wrote: [test program snipped] DISK MODE: 3000000 INSERTS 31 Seconds, 96774 INSERTS/Sec; SELECT * from t: 5 Seconds. MEMORY MODE: 3000000 INSERTS 53 Seconds, 56604 INSERTS/Sec; SELECT * from t: 5 Seconds. Can I reduce the TIME of DISK mode or is this the limit?

Here you should create an index for table t (and in your previous example, for hvh and itm).

I'm doing a little patch for my SQLite implementation, but I have not checked, stressed, benchmarked, or even compiled it!!! Whether this patch corrupts any data is unknown; I'll test it next Monday. In btree.c (version 1.324), change lines 4861 to 4880. The original text says:

    for(i=0; i<k-1; i++){
        int minV = pgnoNew[i];
        int minI = i;
        for(j=i+1; j<k; j++){
            if( pgnoNew[j]<(unsigned)minV ){
                minI = j;
                minV = pgnoNew[j];
            }
        }
        if( minI>i ){
            int t;
            MemPage *pT;
            t = pgnoNew[i];
            pT = apNew[i];
            pgnoNew[i] = pgnoNew[minI];
            apNew[i] = apNew[minI];
            pgnoNew[minI] = t;
            apNew[minI] = pT;
        }
    }

I just changed the insertion sort (O(n^2)) for a modified bubble sort (added 5 lines) called comb sort, which is O(n log n). It's used as the final step of more complex sort algorithms such as quicksort and radix sort. If you or anyone discovers a bug, or wants to benchmark or modify it, please say so here or mail me. I expect it to boost inserts and deletes. Copyright under Dr. Hwaci's license, unmodified.
My code is:

    int gap = k;                      /* LINE 4861 */
    for (i=0; i<k-1; i++) {
        int swapped = 0;
        gap = gap * 10 / 13;          /* Added line to bubble sort */
        if (gap < 1)                  /* Added line to bubble sort */
            gap = 1;                  /* Added line to bubble sort */
        if (gap == 9 || gap == 10)    /* Added line to bubble sort */
            gap = 11;                 /* Added line to bubble sort */
        minV = pgnoNew[i];
        for (j = 0; j < k - gap; j++) {  /* Changed line to bubble sort */
            if (pgnoNew[j] > minV) {
                int t;
                MemPage *pT;
                t = pgnoNew[i];
                pT = apNew[i];
                pgnoNew[i] = pgnoNew[minI];
                apNew[i] = apNew[minI];
                pgnoNew[minI] = t;
                apNew[minI] = pT;
                swapped = 1;
            }
        }
        if (gap == 1 && !swapped)
            break;
    }

If it's not enough for you, perhaps you need a bigger RDBMS. Antivirus warning: User detected. Please keep away from the computer or you will be eliminated. Thanks.
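For reference, here is the comb-sort idea from the patch as a clean, self-contained sketch (Python used for clarity; the 10/13 gap shrink factor and the "gap of 9 or 10 becomes 11" tweak mirror the patch above). Note that comb sort's worst case is still quadratic, though in practice it typically behaves close to n log n:

```python
def comb_sort(a):
    """Comb sort: bubble sort with a shrinking gap (shrink factor ~1.3)."""
    gap = len(a)
    swapped = True
    while gap > 1 or swapped:
        gap = max(1, gap * 10 // 13)   # shrink the gap, as in the patch
        if gap in (9, 10):             # "rule of 11" tweak from the patch
            gap = 11
        swapped = False
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:      # compare elements one gap apart
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a

print(comb_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

Once the gap reaches 1 the loop degenerates into plain bubble sort and runs until a pass makes no swaps, which is what guarantees a fully sorted result.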
Re: [sqlite] SQLite performance for 10 Million Records
Cacheing will drive you crazy. CARTER-HITCHIN, David, GBM wrote: This is a modified version of the test code I posted to show that there was a small but definite SLOWDOWN when using :memory: databases compared to a database in a file on disk. It seems strange, but it is true. Use a disk file for best speed. If true, this is crazy. Memory is thousands of times faster than disk. Why should SQLite be so sluggish when it comes to its use of memory? Cheers, David.
Re: [sqlite] SQLite performance for 10 Million Records
John Stanton wrote: Cacheing will drive you crazy. Very well put. Most of SQLite's disk I/O actually goes to the memory used for the operating system's disk cache, not directly to the disk. Hence its speed is not much different with a disk-based database than with a memory-based database. I'm still a little surprised that a disk-based database is actually slightly faster than a memory-based one, but that just means that SQLite's memory-based page I/O is less optimized than the OS's file I/O routines. Dennis Cote
Re: [sqlite] SQLite performance for 10 Million Records
On 23 Jun 2006, at 14:16, Dennis Cote wrote: John Stanton wrote: Cacheing will drive you crazy. Very well put. Most of SQLite's disk I/O is actually going to the memory used for the operating system's disk cache, not directly to the disk. Hence its speed is not much different when using a disk based database than a memory based database. I'm still a little surprised that a disk based database is actually slightly faster than a memory based one, but that just means that SQLite's memory based page I/O is less optimized than the OS's file I/O libraries. I did some experiments along these lines a few months ago, and was similarly surprised. I got zero improvement running a database in memory compared to the on-disk version of the same database. -Eric
Re: [sqlite] SQLite performance for 10 Million Records
--- Manzoor Ilahi Tamimy [EMAIL PROTECTED] wrote: DISK MODE 3000 000 INSERTS 31 Seconds 96774 INSERTS / Sec SELECT * from t 5 Seconds. MEMORY MODE 3000 000 INSERTS 53 Seconds 56604 INSERTS / Sec SELECT * from t 5 Seconds. Can I reduce the TIME of DISK mode or this is the limit. WHY the Time in MEMORY mode is higher than DISK in case of INSERTION. Are you running a version of SQLite older than 3.3.6? http://www.sqlite.org/cvstrac/tktview?tn=1790
Re: [sqlite] SQLite performance for 10 Million Records
On 23 Jun 2006, at 14:16, Dennis Cote wrote: Most of SQLite's disk I/O is actually going to the memory used for the operating system's disk cache, not directly to the disk. Hence its speed is not much different when using a disk based database than a memory based database. I'm still a little surprised that a disk based database is actually slightly faster than a memory based one, but that just means that SQLite's memory based page I/O is less optimized than the OS's file I/O libraries. I think some other factor is at play here. SQLite 2.x's memory databases are still twice as fast at batch inserts than either 3.x's disk-based databases or 2.x's disk-based databases when the DB size is less than physical machine memory. Your analysis: http://www.mail-archive.com/sqlite-users%40sqlite.org/msg14937.html corresponding ticket: http://www.sqlite.org/cvstrac/tktview?tn=1790
Re: [sqlite] SQLite performance for 10 Million Records
I have tested my code with the following PRAGMAs and still do not see any change in time:

    sqlite3_exec(db, "PRAGMA temp_store=2", NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA synchronous=0", NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA count_changes=OFF", NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA default_cache_size=65536", NULL, NULL, NULL);
    sqlite3_exec(db, "PRAGMA cache_size=8192", NULL, NULL, NULL);

Actually I have data in DBF files; each file contains 10, 30, or 50 million records. First I developed an application to convert the data from DBF to a SQLite DB for fast data access. The requirement of the project is like this because I will always receive the data as DBF: I will link my application with another application, first convert the data into a DB, and then run different SELECT queries. What I want is fast insert and fast select; I have to display this huge data in a grid. I am using MFC to develop this application. Now I am thinking of using SQLite in memory mode; maybe it will help me reduce the time. I don't know much about how to use memory mode, because the method I am using takes more time than disk mode. Here is a small application in which I was testing disk mode and memory mode for INSERT and SELECT. Can you guide me on how to use memory mode properly? I have tested this code on a P4, 3 GHz, 2 GB RAM, Windows XP Professional.
    #include "stdafx.h"
    #include <conio.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include "sqlite3.h"

    int main(int argc, char *argv[])
    {
        char *database = "backup.db";
        // char *database = ":memory:";
        sqlite3 *db;
        sqlite3_stmt *insert;
        int samples = 3000000;
        int i;
        time_t bgn, end;
        double t;

        remove(database);
        bgn = time(NULL);
        sqlite3_open(database, &db);
        sqlite3_exec(db, "PRAGMA temp_store=2", NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA synchronous=0", NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA count_changes=OFF", NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA default_cache_size=65536", NULL, NULL, NULL);
        sqlite3_exec(db, "PRAGMA cache_size=8192", NULL, NULL, NULL);
        sqlite3_exec(db, "create table t (a integer, b float, c text, d integer, e float, f text, g float, h text)", NULL, NULL, NULL);
        sqlite3_exec(db, "begin transaction", NULL, NULL, NULL);
        sqlite3_prepare(db, "insert into t values (?, ?, ?, ?, ?, ?, ?, ?)", -1, &insert, NULL);
        for (i = 0; i < samples; i++) {
            sqlite3_bind_int(insert, 1, 44);
            sqlite3_bind_double(insert, 2, 444.4);
            sqlite3_bind_text(insert, 3, "two hundred", -1, SQLITE_STATIC);
            sqlite3_bind_int(insert, 4, 55);
            sqlite3_bind_double(insert, 5, 5.5);
            sqlite3_bind_text(insert, 6, "two hundred", -1, SQLITE_STATIC);
            sqlite3_bind_double(insert, 7, 66.6);
            sqlite3_bind_text(insert, 8, "two hundred", -1, SQLITE_STATIC);
            sqlite3_step(insert);
            sqlite3_reset(insert);
        }
        /* Insert time check */
        end = time(NULL);
        sqlite3_exec(db, "commit transaction", NULL, NULL, NULL);
        t = difftime(end, bgn);
        printf("Executed %d inserts in %.0f seconds, %.0f inserts/sec\n", samples, t, samples / t);
        getch();
        /* Select time check */
        bgn = time(NULL);
        sqlite3_exec(db, "Select * from t", NULL, NULL, NULL);
        end = time(NULL);
        t = difftime(end, bgn);
        printf("Select in %.0f seconds", t);
        getch();
        sqlite3_finalize(insert);
        sqlite3_close(db);
    }

DISK MODE: 3000000 INSERTS in 31 Seconds (96774 INSERTS/Sec); SELECT * from t: 5 Seconds.
MEMORY MODE: 3000000 INSERTS in 53 Seconds (56604 INSERTS/Sec); SELECT * from t: 5 Seconds.
Can I reduce the time of disk mode, or is this the limit? Why is the time in memory mode higher than disk mode for insertion? For memory mode I am just changing one statement; is this the way to use memory mode? I changed char *database = "backup.db"; into char *database = ":memory:";. I really need help. Thanks in advance. TAMIMY
Re: [sqlite] SQLite performance for 10 Million Records
Manzoor Ilahi Tamimy [EMAIL PROTECTED] writes: Here Is The Schema For these Tables.

CREATE TABLE HVH (
    Field1 VARCHAR(8), IDC VARCHAR(4), Field3 VARCHAR(2), Field4 VARCHAR(4),
    Field5 VARCHAR(7), Field6 VARCHAR(8), Field7 VARCHAR(1), Field8 FLOAT);
CREATE TABLE ITM (
    IDC VARCHAR(4), ITEMNAME VARCHAR(20),
    COLUMN3 VARCHAR(1), COLUMN4 VARCHAR(1), COLUMN5 VARCHAR(1), COLUMN6 VARCHAR(1),
    COLUMN7 VARCHAR(1), COLUMN8 VARCHAR(1), COLUMN9 VARCHAR(1), COLUMN10 VARCHAR(1),
    COLUMN11 VARCHAR(1), COLUMN12 VARCHAR(1), COLUMN13 VARCHAR(1), COLUMN14 VARCHAR(1),
    COLUMN15 VARCHAR(1), COLUMN16 VARCHAR(1));
CREATE INDEX index1 ON ITM (IDC);

Ok, so at this point, you have one index, only on ITM(IDC).

// TEST 2 ( Disk DB ) Table Names: itm, HVH. Number of Records: itm: 5 Million, HVH: less than 10,000. QUERY: create index index1 on itm(IDC)

Now you've created another index *on the same column of the same table*. That doesn't help you any. Instead of that, do

CREATE INDEX index2 ON HVH(IDC);

and I expect you'll see dramatically lower select times. Derrell
Re: [sqlite] SQLite performance for 10 Million Records
At 03:30 20/06/2006, you wrote: Here Is The Schema For these Tables.

CREATE TABLE HVH (
    Field1 VARCHAR(8), IDC VARCHAR(4), Field3 VARCHAR(2), Field4 VARCHAR(4),
    Field5 VARCHAR(7), Field6 VARCHAR(8), Field7 VARCHAR(1), Field8 FLOAT);
CREATE TABLE ITM (
    IDC VARCHAR(4), ITEMNAME VARCHAR(20),
    COLUMN3 VARCHAR(1), COLUMN4 VARCHAR(1), COLUMN5 VARCHAR(1), COLUMN6 VARCHAR(1),
    COLUMN7 VARCHAR(1), COLUMN8 VARCHAR(1), COLUMN9 VARCHAR(1), COLUMN10 VARCHAR(1),
    COLUMN11 VARCHAR(1), COLUMN12 VARCHAR(1), COLUMN13 VARCHAR(1), COLUMN14 VARCHAR(1),
    COLUMN15 VARCHAR(1), COLUMN16 VARCHAR(1));
CREATE INDEX index1 ON ITM (IDC);

// TEST 1 ( Disk DB ) Table Names: itm, HVH. Number of Records: itm: 100,000, HVH: less than 10,000. QUERY: Select count(*) from itm, HVH where itm.IDC = HVH.IDC. Elapse Time: 54.359 (Sec)

// TEST 2 ( Disk DB ) Table Names: itm, HVH. Number of Records: itm: 5 Million, HVH: less than 10,000. QUERY: create index index1 on itm(IDC). Elapse Time: 0.356 (Sec). Select count(*) from itm, HVH where itm.IDC = HVH.IDC. Elapse Time: 64 (Sec)

// TEST 3 ( Disk DB ) Table Names: itm. Number of Records: itm: 20 Million. QUERY: Select count(*) from itm. Elapse Time: 133 (Sec)

// Where to change?

Add, before the create:

pragma default_cache_size = 65536 (65536 pages in the cache, each taking roughly 8.5 KBytes in memory, i.e. around 550 MB of memory; adjust as you need)
pragma page_size = 8192 (8 KBytes per page on disk)
pragma temp_store = 2 (use memory for temporary tables, temporary indices, etc.)

You can try, before the query: pragma synchronous = 0 (or OFF, but as it's a read, I doubt it can help). Also, you forgot to add an index to the HVH table. Also, you can check the www.sqlite.org main page; there is a direct link to the documentation. Remember: software is not a ring that can only be pushed, it has instructions and technical documentation, and in this case they are clear, concise, brief, and very good.
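The pragma suggestions above, translated into a runnable sketch (Python's bundled sqlite3 module; the specific values are the poster's and are illustrative, not tuned):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Apply the tuning pragmas suggested above.
conn.executescript("""
    PRAGMA cache_size = 65536;  -- number of pages kept in the page cache
    PRAGMA temp_store = 2;      -- keep temporary tables/indices in memory
    PRAGMA synchronous = 0;     -- OFF: skip fsync; faster, unsafe on power loss
""")
cache = conn.execute("PRAGMA cache_size").fetchone()[0]
sync = conn.execute("PRAGMA synchronous").fetchone()[0]
print(cache, sync)
conn.close()
```

Note that cache_size counts pages, so the RAM it consumes scales with the page size; PRAGMA page_size must be set before any tables are created to take effect.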
Re: [sqlite] SQLite performance for 10 Million Records
Manzoor Ilahi Tamimy wrote: Here Is The Schema For these Tables.

CREATE TABLE HVH (
    Field1 VARCHAR(8), IDC VARCHAR(4), Field3 VARCHAR(2), Field4 VARCHAR(4),
    Field5 VARCHAR(7), Field6 VARCHAR(8), Field7 VARCHAR(1), Field8 FLOAT);
CREATE TABLE ITM (
    IDC VARCHAR(4), ITEMNAME VARCHAR(20),
    COLUMN3 VARCHAR(1), COLUMN4 VARCHAR(1), COLUMN5 VARCHAR(1), COLUMN6 VARCHAR(1),
    COLUMN7 VARCHAR(1), COLUMN8 VARCHAR(1), COLUMN9 VARCHAR(1), COLUMN10 VARCHAR(1),
    COLUMN11 VARCHAR(1), COLUMN12 VARCHAR(1), COLUMN13 VARCHAR(1), COLUMN14 VARCHAR(1),
    COLUMN15 VARCHAR(1), COLUMN16 VARCHAR(1));
CREATE INDEX index1 ON ITM (IDC);

// TEST 1 ( Disk DB ) Table Names: itm, HVH. Number of Records: itm: 100,000, HVH: less than 10,000. QUERY: Select count(*) from itm, HVH where itm.IDC = HVH.IDC. Elapse Time: 54.359 (Sec)

On Mon, 19 Jun 2006 12:38:32 -0400, Derrell.Lipman wrote: The count() function is always slow, I think, because no index can be used; it must actually count the number of records. I suspect, though, that your problem is missing indexes. Is there an implicit or explicit index on itm.IDC? Is there an implicit or explicit index on HVH.IDC? With query times like you're seeing, I'd assume not. If you post your schema along with the queries you want to issue, many people here will likely provide suggestions for improvement.

I think that Derrell has already answered your question. What you are trying to do is inherently slow. You are doing a cross join of two large tables and counting the results. Depending upon which indexes you have defined (i.e. an index on IDC in either the HVH or ITM tables), SQLite needs to scan one table from start to finish, and for each row in that table it uses an index to locate the matching rows in the other table. You can look at the output of the "explain query plan" command to see how your tables and indexes are being scanned. I suspect that this is a contrived test case, and that it is not what you really need to get done.
Can you explain what you are trying to do in more detail? For example, it looks like IDC is, or should be, the primary key for the ITM table, and that IDC is a foreign key in the HVH table. Is that true? How many records in the HVH table match each record in the ITM table; one, a few, or many? What information do you need to get from these tables (i.e. Is a count of the results what you are really after)? If you can explain what you are trying to do in more detail, someone here can probably help you to generate a more suitable query. HTH Dennis Cote
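Dennis mentions the "explain query plan" command; here is a short sketch of how to inspect it from code, using Python's bundled sqlite3 module and a toy version of the schema discussed above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ITM (IDC TEXT, ITEMNAME TEXT);
    CREATE TABLE HVH (Field1 TEXT, IDC TEXT);
    CREATE INDEX index1 ON ITM(IDC);
""")
# Ask SQLite how it intends to execute the join before running it.
rows = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT count(*) FROM ITM, HVH WHERE ITM.IDC = HVH.IDC").fetchall()
for row in rows:
    print(row)   # typically: scan HVH, then search ITM via index1
conn.close()
```

The exact wording of the plan rows varies between SQLite versions, but the shape is what matters: one table is scanned in full, and the other should be probed through an index rather than scanned.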
[sqlite] SQLite performance for 10 Million Records
Hello All, We are using SQLite for one of our projects. The database size is more than 500 MB. It contains one table and about 10 million records. We are facing a problem with a select that has a single join. The join is between a big table and a small table; the small table contains no more than 10,000 records. We are using a disk-based db, but the performance is very slow. Can anybody guide me on which mode I should use to improve performance for a huge number of records? If memory mode is a better choice, then please give me a little guidance on how to use it. Are there any special parameters for using SQLite in memory mode? Thanks. Manzoor ILahi
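[Editorial note: for reference, SQLite's memory mode needs no special parameters — opening the special filename `:memory:` creates a database that exists only for the lifetime of that connection. A minimal Python sketch (table name and data invented for illustration):]

```python
import sqlite3

# ":memory:" is the special filename for an in-memory database;
# the data lives only as long as this connection stays open.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
con.executemany("INSERT INTO t (name) VALUES (?)", [("a",), ("b",)])
print(con.execute("SELECT count(*) FROM t").fetchone()[0])  # prints 2
```

The same SQL works unchanged against a disk file; only the connection string differs.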
Re: [sqlite] SQLite performance for 10 Million Records
Manzoor Ilahi Tamimy wrote: Hello All, We are Using SQLite for one of our project. The Database Size is more than 500 MB. It contain one table and about 10 million Records. [...] Err, for that size, I'd recommend going with something heavier, like Firebird. This is not sqlite's solution domain in the slightest.
Re: [sqlite] SQLite performance for 10 Million Records
Monday, June 19, 2006, 07:37:22, Manzoor Ilahi Tamimy wrote: The Database Size is more than 500 MB. It contain one table and about 10 million Records. I had problems with even more records (roughly 25 million, 1 GB of data) and in the end I stopped trying to do it in pure sqlite, also because datasets with even more data (10 GB) are foreseeable. Anyway, the problem led to another solution. In _my case_ the bulky data are relatively simple, and accessing them did not require anything SQL has to offer. So hdf5 (http://hdf.ncsa.uiuc.edu/HDF5/) for the mass data plus sqlite for the more sophisticated (but much smaller) tables play pretty well together. E.g., the hdf5 library was able to write a complete 1.2 GB file in 25 s - at that point file I/O becomes the bottleneck for sqlite. But when analyzing your problem, keep in mind that hdf5 has other limitations. Inserts and even appends are not easily achieved in hdf5, nor is every kind of read operation. It's still great when you have multidimensional data in the sense scientific communities use them: you can select so-called hyperslabs from these arrays very, very quickly. Micha --
Re: [sqlite] SQLite performance for 10 Million Records
Bill King [EMAIL PROTECTED] writes: Manzoor Ilahi Tamimy wrote: We are Using SQLite for one of our project. The Database Size is more than 500 MB. It contain one table and about 10 million Records. Err, for that size, I'd recommend going with something heavier, like Firebird. This is not sqlite's solution domain in the slightest. I'd have to differ in opinion here. I have an sqlite database that's at 6.9 GB with about 40 million records, and it's working just fine. My query speed has not changed substantially as the database has grown. With sqlite, you *do* need to hand-optimize your queries. Determine which queries are slow, and consider splitting them into multiple queries that generate temporary tables. By doing this, you can create indexes, as appropriate, on the temporary tables, making the overall time to accomplish your goal much less than cramming it all into a single, less optimized query. Oh, and I'm using the old sqlite 2.8 series; I expect I'd get even better speed with the newer 3 series. Cheers, Derrell
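[Editorial note: Derrell's staged-query approach can be sketched as follows, using Python's sqlite3. Table and column names are invented, and the GROUP BY stage is a stand-in for whatever the genuinely slow sub-query is:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE big (idc TEXT, val REAL);
    CREATE TABLE small (idc TEXT, name TEXT);
    INSERT INTO big VALUES ('a', 1.0), ('a', 2.0), ('b', 3.0);
    INSERT INTO small VALUES ('a', 'alpha'), ('b', 'beta');
""")

# Stage 1: materialize the expensive part of the query in a TEMP table.
con.execute("""
    CREATE TEMP TABLE stage AS
    SELECT idc, sum(val) AS total FROM big GROUP BY idc
""")
# Stage 2: index the temporary table, then run the cheap final join.
con.execute("CREATE INDEX stage_idc ON stage (idc)")
rows = con.execute("""
    SELECT s.name, t.total
    FROM small AS s JOIN stage AS t ON s.idc = t.idc
    ORDER BY s.name
""").fetchall()
print(rows)  # [('alpha', 3.0), ('beta', 3.0)]
```

The point of the split is that the hand-built index on the intermediate result is one the planner could never have used inside a single monolithic query.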
Re: [sqlite] SQLite performance for 10 Million Records
Dear All, I am really thankful to Bill King, Micha Bieber, and Derrell for your valuable suggestions. I was really confused about which way I should go now, because I was sure that SQLite would work much better. When I got the suggestion about Firebird, I went back to the comparison page for SQLite and found that Firebird was also slower in many cases compared to SQLite. From Derrell's reply I again got hope that yes, I can achieve this using SQLite. Now I want to add some more information about my work; maybe you will add some suggestions.

// TEST 1 ( Disk DB )
Table Names: itm, HVH
Number of Records: itm -- 100,000 and HVH -- less than 10,000
QUERY: Select count(*) from itm, HVH where itm.IDC = HVH.IDC
Elapsed Time: 54.359 (Sec)

// TEST 2 ( Disk DB )
Table Names: itm, HVH
Number of Records: itm -- 5 Million and HVH -- less than 10,000
QUERY: create index index1 on itm(IDC)
Elapsed Time: 0.356 (Sec)
Select count(*) from itm, HVH where itm.IDC = HVH.IDC
Elapsed Time: 64 (Sec)

// TEST 3 ( Disk DB )
Table Names: itm
Number of Records: itm -- 20 Million
QUERY: Select count(*) from itm
Elapsed Time: 133 (Sec)

Now please have a look at these results. What should I do now to improve the performance? Thanks and regards, Manzoor Ilahi Tamimy

On Mon, 19 Jun 2006 09:47:03 -0400, Derrell.Lipman wrote: I'd have to differ in opinion here. I have an sqlite database that's at 6.9 GB with about 40 million records, and it's working just fine. [...] Cheers, Derrell
Re: [sqlite] SQLite performance for 10 Million Records
count(*) is pretty slow in sqlite because it basically does a select * and then counts the results. That means it looks through your whole big file. You can use tricks like keeping a separate count up to date with triggers. There have been some old threads on optimizing count() that you may want to search for. Brett
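[Editorial note: the trigger trick Brett mentions can be sketched like this, via Python's sqlite3. Table, counter, and trigger names are invented for illustration:]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE itm (idc TEXT);
    -- One-row table holding the current row count of itm.
    CREATE TABLE itm_count (n INTEGER);
    INSERT INTO itm_count VALUES (0);
    CREATE TRIGGER itm_ins AFTER INSERT ON itm
    BEGIN UPDATE itm_count SET n = n + 1; END;
    CREATE TRIGGER itm_del AFTER DELETE ON itm
    BEGIN UPDATE itm_count SET n = n - 1; END;
""")

con.executemany("INSERT INTO itm VALUES (?)", [("x",), ("y",), ("z",)])
con.execute("DELETE FROM itm WHERE idc = 'y'")
# Reading the counter is a one-row lookup instead of a full table scan.
print(con.execute("SELECT n FROM itm_count").fetchone()[0])  # prints 2
```

The trade-off is a small per-insert/per-delete cost in exchange for a constant-time count, which is usually a win on a 20-million-row table.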
Re: [sqlite] SQLite performance for 10 Million Records
Here is the schema for these tables.

CREATE TABLE HVH (
  Field1 VARCHAR(8), IDC VARCHAR(4), Field3 VARCHAR(2), Field4 VARCHAR(4),
  Field5 VARCHAR(7), Field6 VARCHAR(8), Field7 VARCHAR(1), Field8 FLOAT);

CREATE TABLE ITM (
  IDC VARCHAR(4), ITEMNAME VARCHAR(20),
  COLUMN3 VARCHAR(1), COLUMN4 VARCHAR(1), COLUMN5 VARCHAR(1), COLUMN6 VARCHAR(1),
  COLUMN7 VARCHAR(1), COLUMN8 VARCHAR(1), COLUMN9 VARCHAR(1), COLUMN10 VARCHAR(1),
  COLUMN11 VARCHAR(1), COLUMN12 VARCHAR(1), COLUMN13 VARCHAR(1), COLUMN14 VARCHAR(1),
  COLUMN15 VARCHAR(1), COLUMN16 VARCHAR(1));

CREATE INDEX index1 ON ITM (IDC);

// TEST 1 ( Disk DB )
Table Names: itm, HVH
Number of Records: itm -- 100,000 and HVH -- less than 10,000
QUERY: Select count(*) from itm, HVH where itm.IDC = HVH.IDC
Elapsed Time: 54.359 (Sec)

// TEST 2 ( Disk DB )
Table Names: itm, HVH
Number of Records: itm -- 5 Million and HVH -- less than 10,000
QUERY: create index index1 on itm(IDC)
Elapsed Time: 0.356 (Sec)
Select count(*) from itm, HVH where itm.IDC = HVH.IDC
Elapsed Time: 64 (Sec)

// TEST 3 ( Disk DB )
Table Names: itm
Number of Records: itm -- 20 Million
QUERY: Select count(*) from itm
Elapsed Time: 133 (Sec)

Where do I need to make changes? Regards, TAMIMY

On Mon, 19 Jun 2006 12:38:32 -0400, Derrell.Lipman wrote: Manzoor Ilahi Tamimy [EMAIL PROTECTED] writes: // TEST 1 ( Disk DB ) Number of Records: itm -- 100,000 and HVH -- less than 10,000 [...] The count() function is always slow, I think, because no index can be used; it must actually count the number of records. I suspect, though, that your problem is missing indexes. Is there an implicit or explicit index on itm.IDC? Is there an implicit or explicit index on HVH.IDC? With query times like you're seeing, I'd assume not. If you post your schema along with the queries you want to issue, many people here will likely provide suggestions for improvement. Derrell
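[Editorial note: Derrell's question about which indexes actually exist can be answered mechanically with PRAGMA index_list. A small sketch against a cut-down version of the posted schema (column lists abbreviated):]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE HVH (IDC VARCHAR(4), Field8 FLOAT);
    CREATE TABLE ITM (IDC VARCHAR(4), ITEMNAME VARCHAR(20));
    CREATE INDEX index1 ON ITM (IDC);
""")

# PRAGMA index_list names every index on a table; an empty result
# means joins on that table's columns cannot be satisfied by an index.
indexes = {
    table: [row[1] for row in con.execute(f"PRAGMA index_list({table})")]
    for table in ("ITM", "HVH")
}
print(indexes)  # {'ITM': ['index1'], 'HVH': []}
```

Here the posted schema indexes only ITM.IDC; whether an index on HVH.IDC would also help depends on which table the planner chooses for the outer loop of the join.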