Re: [sqlite] beat 120,000 inserts/sec
On Apr 11, 2005 4:06 PM, Christian Smith <[EMAIL PROTECTED]> wrote:
> The test given is clearly CPU bound. All the big numbers are from people
> with big CPUs, with equally big RAM performance as well, probably.

I have done a few database tests recently, and I often found them to be CPU
bound, at least as long as the data fits into memory. If you are looking for
a fast system, I think there is nothing affordable that currently beats an
AMD64 (Athlon64 or Opteron). The built-in memory controller leads to very
low memory latency, which is quite important for databases, and the 64-bit
CPU gives you lots of raw power.

Thomas
Re: [sqlite] beat 120,000 inserts/sec
On Sat, 9 Apr 2005, Al Danial wrote:

>On Apr 9, 2005 12:43 AM, Andy Lutomirski <[EMAIL PROTECTED]> wrote:
>> Al Danial wrote:
>> > The attached C program measures insert performance for populating
>> > a table with an integer and three random floating point values with
>> > user defined transaction size. Usage is:
>> >
>> >   ./sqlite_insert <rows> <transaction_size>
>>
>> All of these are on Gentoo, Athlon 64 3200+, running 64 bit.
>>
>> Writing to /tmp, which is ext3 (acl,user_xattr,usrquota) over RAID0 (two
>> slave drives on different channels):
>>
>> $ ./sqlitetest 100000 50000
>> 100000 inserts to /tmp/a.db in 0.531 s = 188446.34 inserts/s
>> $ ./sqlitetest 2000000 50000
>> 2000000 inserts to /tmp/a.db in 11.546 s = 173223.61 inserts/s
>
>That's impressive. Clearly a well-implemented RAID0 configuration
>and fast disks make a huge difference. This will be the direction I'll
>take.

My machine, 2x Athlon XP 1700+, 512MB RAM, U160 SCSI (3x10K disks):

[EMAIL PROTECTED] tmp]$ ./sqlite_insert 100000 50000
100000 inserts to ./a.db in 2.495 s = 40081.96 inserts/s
[EMAIL PROTECTED] tmp]$ ./sqlite_insert 2000000 50000
2000000 inserts to ./a.db in 47.068 s = 42491.71 inserts/s

While I have a big fat SCSI IO system, vmstat during the run shows only
about 2MB/s ever going to the HD, well within the capabilities of any SCSI
or IDE HD (even a laptop drive!):

[EMAIL PROTECTED] csmith]$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 3  0  93296 114308  30484 231548    0    0     0  2167  333  1194 99  1  0  0
 7  0  93296 112488  30512 233488    0    0     0  2064  320  1092 99  1  0  0
 4  0  93296 110716  30540 235432    0    0     0  2072  314  1060 99  1  0  0
 4  0  93296 108844  30568 237504    0    0   128  2068  319  1069 99  1  0  0
 4  0  93296 107380  30596 239444    0    0     0  2076  383  1252 98  2  0  0
[first and last samples garbled in the archive]

>Thanks to everyone who posted performance numbers and machine
>setup info. Some results were counterintuitive (I'd have guessed
>SCSI drives would come out on top) but many variables are at work
>so I won't try to draw too many conclusions.

The test given is clearly CPU bound. All the big numbers are from people
with big CPUs, with equally big RAM performance as well, probably.

>Also thanks to the suggestions to study the pragmas. I did try
>SYNCHRONOUS=off but that didn't seem to have an effect; I'll
>study the docs to make sure I've got it right. -- Al

For CPU bound work, it won't have an effect. You appear to be maxing out
what the CPU and memory subsystem can handle, so tweaking IO will not help.
Using ":memory:" gave about a 10% speedup, so that looks like the maximum
amount of speedup you could possibly hope for.

Christian

--
 /"\
 \ /  ASCII RIBBON CAMPAIGN
  X   - AGAINST HTML MAIL
 / \  - AGAINST MS ATTACHMENTS
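[Editorial note: Christian's ":memory:" ceiling is easy to check without the thread's C harness. Below is a rough stand-in using Python's built-in sqlite3 module; the table layout mirrors the test program, but the function name and constant column values are illustrative, not from the original code.]

```python
import sqlite3
import time

def insert_rate(dbfile, rows=20000, xact_size=5000):
    # Batch-load rows in explicit transactions, as the C test does,
    # and return (row_count, elapsed_seconds).
    con = sqlite3.connect(dbfile)
    con.execute("create table t(field_1 integer primary key,"
                " field_2 float, field_3 float, field_4 float)")
    t0 = time.perf_counter()
    for lo in range(0, rows, xact_size):
        with con:  # one transaction per batch of xact_size rows
            con.executemany(
                "insert into t values(?,?,?,?)",
                ((i, 0.1, 0.2, 0.3)
                 for i in range(lo, min(lo + xact_size, rows))))
    dt = time.perf_counter() - t0
    n = con.execute("select count(*) from t").fetchone()[0]
    con.close()
    return n, dt

# The in-memory rate is the ceiling that no amount of IO tuning can beat;
# compare it against a rate measured on a real file path.
n, dt = insert_rate(":memory:")
print(f"{n} inserts in {dt:.3f} s = {n / dt:.0f} inserts/s")
```

Running the same function against an on-disk path and against ":memory:" reproduces the comparison Christian describes.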
Re: [sqlite] beat 120,000 inserts/sec
On Apr 9, 2005, at 8:49 AM, Al Danial wrote:
> I did try SYNCHRONOUS=off but that didn't seem to have an effect; I'll
> study the docs to make sure I've got it right.

This isn't surprising. fsync() is largely a no-op on just about any
operating system. It doesn't actually guarantee that the bytes are written
to the platter (though it will likely impact queueing and the like within
the kernel).
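[Editorial note: the write-then-fsync pattern under discussion is what SQLite's synchronous modes add around commits. A minimal sketch in Python follows; the function name and file path are illustrative, and, as the post above notes, whether the bytes truly reach the platter still depends on the drive's own write cache.]

```python
import os
import tempfile

def write_durably(path, payload):
    # Write payload, then fsync() so the kernel flushes its buffers to
    # the device before returning -- the same call a synchronous commit
    # makes after writing the journal.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        n = os.write(fd, payload)
        os.fsync(fd)
        return n
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "journal")
    print(write_durably(path, b"journal page\n"), "bytes synced")
```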
Re: [sqlite] beat 120,000 inserts/sec
Andrew Piskorski wrote:
> On Sat, Apr 09, 2005 at 11:49:17AM -0400, Al Danial wrote:
>> Thanks to everyone who posted performance numbers and machine
>> setup info. Some results were counterintuitive (I'd have guessed
>> SCSI drives would come out on top) but many variables are at work
>
> It is basically impossible that a 10k or 15k rpm SCSI disk is genuinely
> substantially slower for writes than a 7200 rpm IDE disk. Most likely,
> the SCSI disks had write-through cache turned off, and the IDE disks
> had it turned on.

This load is probably mostly sequential writes. The IDE drives I used were
200GB and have huge areal density, so their sequential write speed is
probably a lot higher. SCSI will win if there are seeks involved (or a
drive like a WD Raptor).

--Andy
Re: [sqlite] beat 120,000 inserts/sec
On Sat, Apr 09, 2005 at 11:49:17AM -0400, Al Danial wrote:
> Thanks to everyone who posted performance numbers and machine
> setup info. Some results were counterintuitive (I'd have guessed
> SCSI drives would come out on top) but many variables are at work

It is basically impossible that a 10k or 15k rpm SCSI disk is genuinely
substantially slower for writes than a 7200 rpm IDE disk. Most likely, the
SCSI disks had write-through cache turned off, and the IDE disks had it
turned on.

--
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/
Re: [sqlite] beat 120,000 inserts/sec
On Apr 9, 2005 12:43 AM, Andy Lutomirski <[EMAIL PROTECTED]> wrote:
> Al Danial wrote:
> > The attached C program measures insert performance for populating
> > a table with an integer and three random floating point values with
> > user defined transaction size. Usage is:
> >
> >   ./sqlite_insert <rows> <transaction_size>
>
> All of these are on Gentoo, Athlon 64 3200+, running 64 bit.
>
> Writing to /tmp, which is ext3 (acl,user_xattr,usrquota) over RAID0 (two
> slave drives on different channels):
>
> $ ./sqlitetest 100000 50000
> 100000 inserts to /tmp/a.db in 0.531 s = 188446.34 inserts/s
> $ ./sqlitetest 2000000 50000
> 2000000 inserts to /tmp/a.db in 11.546 s = 173223.61 inserts/s

That's impressive. Clearly a well-implemented RAID0 configuration and fast
disks make a huge difference. This will be the direction I'll take.

Thanks to everyone who posted performance numbers and machine setup info.
Some results were counterintuitive (I'd have guessed SCSI drives would come
out on top) but many variables are at work so I won't try to draw too many
conclusions.

Also thanks to the suggestions to study the pragmas. I did try
SYNCHRONOUS=off but that didn't seem to have an effect; I'll study the docs
to make sure I've got it right. -- Al
Re: [sqlite] beat 120,000 inserts/sec
Jeremy Hinegardner wrote:
> ./sqlite_insert 100000 50000
> 100000 inserts to /tmp/a.db in 0.671 s = 149057.52 inserts/s
> ./sqlite_insert 2000000 20000
> 2000000 inserts to /tmp/a.db in 14.437 s = 138535.38 inserts/s
> ./sqlite_insert 2000000 50000
> 2000000 inserts to /tmp/a.db in 15.322 s = 130530.52 inserts/s
>
> Not substantially better. My configuration:
>
>   AMD Athlon(tm) 64 Processor 3000+
>   1GB RAM
>   2 x Maxtor 200GB SATA150 7200RPM 8MB HD
>
> I'm running OpenBSD 3.6 with a custom kernel compiled to support
> OpenBSD's software raid (RAIDFrame). I have the 2 drives arranged in
> a RAID 1 configuration.

As a confirmed OpenBSD devotee these numbers make me very happy. I use it
for the security, but I can see that there are good performance reasons as
well. I guess we know where the app should run if they need that kind of
performance.

Clay Dowling

--
http://www.lazarusid.com/notes/
Lazarus Notes
Articles and Commentary on Web Development
Re: [sqlite] beat 120,000 inserts/sec
There are also pragmas to control page size and in-memory caching. You will
want to play with those as well. If SQLite is in the middle of a transaction
and you load it up with commands, it will create a journal file in /tmp/ to
store pages that don't fit in the in-memory page cache (or something like
that). As such, the pragmas can hugely affect performance.

You will likely find that tuning the engine for particular roles -- bulk
loading, queries, random updates -- is useful. See the SQLite documentation
for more information.

b.bum
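[Editorial note: the page-size and cache pragmas mentioned above look like this in practice, sketched via Python's sqlite3 module for brevity. The values are illustrative only; note that page_size only takes effect while the database is still empty, and cache_size is measured in pages (a negative value means KiB in modern SQLite).]

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Set the page size before any tables exist, then size the page cache.
con.execute("PRAGMA page_size = 8192")
con.execute("PRAGMA cache_size = 10000")
con.execute("create table t(x integer)")

# Read the settings back to confirm they took effect.
page = con.execute("PRAGMA page_size").fetchone()[0]
cache = con.execute("PRAGMA cache_size").fetchone()[0]
print(page, cache)
```

A larger page cache lets big transactions stay in memory instead of spilling to the journal's temp storage, which is the effect b.bum describes.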
Re: [sqlite] beat 120,000 inserts/sec
On Fri, Apr 08, 2005 at 11:01:02PM -0400, Al Danial wrote:
[...]
> What kind of insert performance do you see on your machine? If it
> is substantially better than 120 kinserts/s what kind of hardware
> do you have? I'm especially interested in how much faster the
> code runs on systems with multiple disks in a RAID 0 configuration.
> Are there other tricks to speeding insert performance? Is it even
> reasonable to ask for more? -- Al

./sqlite_insert 100000 50000
100000 inserts to /tmp/a.db in 0.671 s = 149057.52 inserts/s
./sqlite_insert 2000000 20000
2000000 inserts to /tmp/a.db in 14.437 s = 138535.38 inserts/s
./sqlite_insert 2000000 50000
2000000 inserts to /tmp/a.db in 15.322 s = 130530.52 inserts/s

Not substantially better. My configuration:

  AMD Athlon(tm) 64 Processor 3000+
  1GB RAM
  2 x Maxtor 200GB SATA150 7200RPM 8MB HD

I'm running OpenBSD 3.6 with a custom kernel compiled to support OpenBSD's
software raid (RAIDFrame). I have the 2 drives arranged in a RAID 1
configuration.

enjoy,

-jeremy

--
Jeremy Hinegardner [EMAIL PROTECTED]
Re: [sqlite] beat 120,000 inserts/sec
Al Danial wrote:
> The attached C program measures insert performance for populating
> a table with an integer and three random floating point values with
> user defined transaction size. Usage is:
>
>   ./sqlite_insert <rows> <transaction_size>

All of these are on Gentoo, Athlon 64 3200+, running 64 bit.

Writing to /tmp, which is ext3 (acl,user_xattr,usrquota) over RAID0 (two
slave drives on different channels):

$ ./sqlitetest 100000 50000
100000 inserts to /tmp/a.db in 0.531 s = 188446.34 inserts/s
$ ./sqlitetest 2000000 50000
2000000 inserts to /tmp/a.db in 11.546 s = 173223.61 inserts/s

I'm actually surprised to see this kind of performance. The drives are
brand-new Maxtor 200GB parallel ATA. I imagine I could do better with
usrquota off.

reiserfs over RAID5 (hideous config with different types of drives, 3
master, 2 slave):

$ ./sqlitetest 100000 50000
100000 inserts to /home/foobar/a.db in 0.884 s = 113121.53 inserts/s
$ ./sqlitetest 2000000 50000
2000000 inserts to /home/foobar/a.db in 20.025 s = 99875.94 inserts/s

xfs on the same array:

$ ./sqlitetest 100000 50000
100000 inserts to /data/stage/a.db in 0.777 s = 128684.23 inserts/s
[EMAIL PROTECTED] tmp $ ./sqlitetest 2000000 50000
2000000 inserts to /data/stage/a.db in 14.541 s = 137544.14 inserts/s

Last but not least, reiserfs noatime over RAID1 (two drives, masters on
different channels):

$ ./sqlitetest 100000 50000
100000 inserts to /usr/foobar/a.db in 0.700 s = 142824.48 inserts/s
$ ./sqlitetest 2000000 50000
2000000 inserts to /usr/foobar/a.db in 15.376 s = 130071.89 inserts/s

For kicks, I tried tmpfs. It crashed (segfault in sqlite3_bind_int64 -- go
figure). I'll submit a bug report if I can reproduce it sanely.

So it looks like ext3 over RAID0 is good. FWIW, I can read 99.18 MB/sec
off /dev/md3 (where /tmp lives).

--Andy
Re: [sqlite] beat 120,000 inserts/sec
Al Danial wrote:
> What kind of insert performance do you see on your machine? If it
> is substantially better than 120 kinserts/s what kind of hardware
> do you have? I'm especially interested in how much faster the
> code runs on systems with multiple disks in a RAID 0 configuration.
> Are there other tricks to speeding insert performance? Is it even
> reasonable to ask for more? -- Al

sqlite 3.0.8
Fedora Core 2, kernel 2.6.8-1.521

/proc/cpuinfo:
model name : AMD Athlon(tm) 64 Processor 3200+
cpu MHz    : 2043.192
cache size : 1024 KB

On a Seagate 7200.7 200GB Parallel IDE drive with ext3:

$ ./sqlite_insert 100000 50000
100000 inserts to /tmp/a.db in 0.841 s = 118872.29 inserts/s
$ ./sqlite_insert 100000 20000
100000 inserts to /tmp/a.db in 0.848 s = 117858.23 inserts/s
$ ./sqlite_insert 2000000 50000
2000000 inserts to /tmp/a.db in 18.446 s = 108424.30 inserts/s

Same box with a RAID 0 configuration of 2 U160 SCSI Seagate 15000 RPM 18GB
Cheetahs (2nd generation, purchased in 2001) with ext3:

$ ./sqlite_insert 100000 20000
100000 inserts to ./a.db in 1.177 s = 84977.14 inserts/s
$ ./sqlite_insert 100000 20000
100000 inserts to ./a.db in 1.146 s = 87230.96 inserts/s
$ ./sqlite_insert 2000000 50000
2000000 inserts to ./a.db in 20.287 s = 98584.18 inserts/s

-- Dan Nuffer
Re: [sqlite] beat 120,000 inserts/sec
Al Danial wrote:
> A scientific application I work with has clumsy data retrieval options.
> I dumped the application's output--integer and floating point
> numbers--into an SQLite database and soon after began to enjoy the power
> of SQL to pull out interesting results.
>
> The main complaint for making the transfer to SQLite a permanent part of
> our solution is the time it takes to insert the numbers into a database.
> It takes about a minute to insert 24 million numbers into three tables.
> Most database people (including me) would be thrilled to see this kind
> of insert performance but my colleagues are asking if it can be sped up.

Try setting PRAGMA synchronous=OFF and seeing how much improvement you get.
If it's significant, you'll have to decide for yourself if you can live
with the increased risk of corruption if there's a power failure during a
batch load.
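[Editorial note: the suggested comparison can be timed directly. A sketch using Python's sqlite3 module rather than the thread's C program; the function name, file path, and row counts are arbitrary.]

```python
import os
import sqlite3
import tempfile
import time

def timed_load(sync_mode, rows=20000):
    # Batch-insert `rows` rows under the given synchronous mode and
    # return (row_count, elapsed_seconds).
    with tempfile.TemporaryDirectory() as d:
        con = sqlite3.connect(os.path.join(d, "a.db"))
        con.execute(f"PRAGMA synchronous = {sync_mode}")
        con.execute("create table t(a integer primary key, b float)")
        t0 = time.perf_counter()
        with con:  # a single big transaction, as in the benchmark program
            con.executemany("insert into t values(?, ?)",
                            ((i, i * 0.5) for i in range(rows)))
        dt = time.perf_counter() - t0
        n = con.execute("select count(*) from t").fetchone()[0]
        con.close()
        return n, dt

for mode in ("FULL", "OFF"):
    n, dt = timed_load(mode)
    print(f"synchronous={mode}: {n} rows in {dt:.3f} s")
```

With everything in one big transaction there are only a handful of sync calls to begin with, which is consistent with Al's observation that the pragma made little difference for his bulk load.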
Re: [sqlite] beat 120,000 inserts/sec
Random thought: You might squeeze some more performance out by trying a
couple of different filesystems, i.e. if you're using ext3, try some
different journaling options, or try ext2.

--- Al Danial <[EMAIL PROTECTED]> wrote:
> A scientific application I work with has clumsy data retrieval
> options. I dumped the application's output--integer and floating
> point numbers--into an SQLite database and soon after began to
> enjoy the power of SQL to pull out interesting results.
>
> The main complaint for making the transfer to SQLite a permanent
> part of our solution is the time it takes to insert the numbers
> into a database. It takes about a minute to insert 24 million
> numbers into three tables. Most database people (including me)
> would be thrilled to see this kind of insert performance but
> my colleagues are asking if it can be sped up.
>
> The attached C program measures insert performance for populating
> a table with an integer and three random floating point values with
> user defined transaction size. Usage is:
>
>   ./sqlite_insert <rows> <transaction_size>
>
> It writes to the hardcoded database file /tmp/a.db
> On my Dell Precision 360, Pentium4 3.0 GHz, 1 GB RAM, IDE disk
> drive, ext3, RHEL v3, Linux kernel 2.4.21, I peak out around 121,000
> inserts/second using a transaction size of 20,000:
>
>   ./sqlite_insert 100000 20000
>   100000 inserts to /tmp/a.db in 0.829 s = 120626.53 inserts/s
>
> Performance drops a bit when I increase the number of rows to
> two million (a typical size for my application):
>
>   ./sqlite_insert 2000000 50000
>   2000000 inserts to /tmp/a.db in 17.124 s = 116795.07 inserts/s
>
> What kind of insert performance do you see on your machine? If it
> is substantially better than 120 kinserts/s what kind of hardware
> do you have? I'm especially interested in how much faster the
> code runs on systems with multiple disks in a RAID 0 configuration.
> Are there other tricks to speeding insert performance? Is it even
> reasonable to ask for more?
> -- Al
>
> [quoted sqlite_insert.c attachment trimmed; the full program appears in
> the original post below]
[sqlite] beat 120,000 inserts/sec
A scientific application I work with has clumsy data retrieval options. I
dumped the application's output--integer and floating point numbers--into
an SQLite database and soon after began to enjoy the power of SQL to pull
out interesting results.

The main complaint for making the transfer to SQLite a permanent part of
our solution is the time it takes to insert the numbers into a database. It
takes about a minute to insert 24 million numbers into three tables. Most
database people (including me) would be thrilled to see this kind of insert
performance but my colleagues are asking if it can be sped up.

The attached C program measures insert performance for populating a table
with an integer and three random floating point values with user defined
transaction size. Usage is:

  ./sqlite_insert <rows> <transaction_size>

It writes to the hardcoded database file /tmp/a.db. On my Dell Precision
360, Pentium4 3.0 GHz, 1 GB RAM, IDE disk drive, ext3, RHEL v3, Linux
kernel 2.4.21, I peak out around 121,000 inserts/second using a transaction
size of 20,000:

  ./sqlite_insert 100000 20000
  100000 inserts to /tmp/a.db in 0.829 s = 120626.53 inserts/s

Performance drops a bit when I increase the number of rows to two million
(a typical size for my application):

  ./sqlite_insert 2000000 50000
  2000000 inserts to /tmp/a.db in 17.124 s = 116795.07 inserts/s

What kind of insert performance do you see on your machine? If it is
substantially better than 120 kinserts/s what kind of hardware do you have?
I'm especially interested in how much faster the code runs on systems with
multiple disks in a RAID 0 configuration. Are there other tricks to
speeding insert performance? Is it even reasonable to ask for more?
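[Editorial note: the transaction-size knob in the benchmark is the heart of the trick: committing after every row forces a journal sync per row, while batching amortizes the cost. A rough stand-in for the C program, using Python's built-in sqlite3 module against an in-memory database so only the batching effect is visible; the function name is illustrative, the table layout is from sqlite_insert.c.]

```python
import sqlite3
import time

def rate_for_xact(rows, xact_size):
    # Return inserts/second for a given transaction size.
    con = sqlite3.connect(":memory:")
    con.execute("create table table_name(field_1 integer primary key,"
                " field_2 float, field_3 float, field_4 float)")
    t0 = time.perf_counter()
    for lo in range(0, rows, xact_size):
        with con:  # commit once per xact_size rows
            con.executemany(
                "insert into table_name values(?,?,?,?)",
                ((i, 1.0, 2.0, 3.0)
                 for i in range(lo, min(lo + xact_size, rows))))
    return rows / (time.perf_counter() - t0)

print(f"xact=1:     {rate_for_xact(20000, 1):>10.0f} inserts/s")
print(f"xact=20000: {rate_for_xact(20000, 20000):>10.0f} inserts/s")
```

Even in memory, the per-transaction overhead makes the single-row case markedly slower; on disk the gap grows by orders of magnitude.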
-- Al

/*
[EMAIL PROTECTED]

# sample build:
gcc -o sqlite_insert sqlite_insert.c \
    -L/usr/local/sqlite-3.2.1/lib -lsqlite3 \
    -I/usr/local/sqlite-3.2.1/include

# sample run:
./sqlite_insert 100000 50000
*/

#include <stdio.h>
#include <stdlib.h>     /* RAND_MAX */
#include <sys/time.h>   /* timeval, gettimeofday() */
#include <sqlite3.h>

int sql_begin(sqlite3 *db) { /* {{{1 */
    char *errmsg;
    if (sqlite3_exec(db, "BEGIN TRANSACTION",
                     NULL, NULL, &errmsg) != SQLITE_OK) {
        printf("couldn't begin transaction: %s\n", errmsg);
        return 0;
    } else {
        return 1;
    }
} /* 1}}} */
int sql_commit(sqlite3 *db) { /* {{{1 */
    char *errmsg;
    if (sqlite3_exec(db, "COMMIT TRANSACTION",
                     NULL, NULL, &errmsg) != SQLITE_OK) {
        printf("couldn't commit transaction: %s\n", errmsg);
        return 0;
    } else {
        return 1;
    }
} /* 1}}} */
float elapsed(struct timeval start, struct timeval end) { /* {{{1 */
    return (float) (end.tv_sec  - start.tv_sec ) +
          ((float) (end.tv_usec - start.tv_usec)/1000000);
} /* 1}}} */

int main(int argc, char *argv[]) {
    const char *zLeftover;
#define CMD_SIZE 1000
    char    rm_command[CMD_SIZE],
           *errmsg, *dbfile = "/tmp/a.db";
            /* *dbfile = ":memory:" is faster, but not of interest */
    int     rc, i, N, xact_size, n_this_xact = 0;
    double  x, y, z;
    float   delta_T;
    struct timeval start_time, end_time;
    sqlite3      *db;
    sqlite3_stmt *Stmt;

    if (argc < 3) {
        printf("\nUsage: %s <rows> <transaction_size>\n\n", argv[0]);
        printf("\tInsert <rows> rows into a table of an SQLite database\n");
        printf("\tusing transaction sizes of <transaction_size>.\n");
        printf("\tThe table has four columns of numeric data:\n");
        printf("\t    field_1 integer\n");
        printf("\t    field_2 float\n");
        printf("\t    field_3 float\n");
        printf("\t    field_4 float\n");
        printf("\tThe integer field will have values 1..<rows> while the\n");
        printf("\tdouble precision values are random on [-50.0, 50.0]\n");
        exit(0);
    }
    N         = atoi(argv[1]);
    xact_size = atoi(argv[2]);
    snprintf(rm_command, CMD_SIZE-1, "rm -f %s", dbfile);
    system(rm_command); /* the database file must not exist before
                           calling sqlite3_open() and trying to insert */

    gettimeofday(&start_time, 0);
    rc = sqlite3_open(dbfile, &db);
    sql_begin(db);
    rc = sqlite3_prepare(db, "create table table_name(field_1 integer primary key,"
                             "field_2 float, "
                             "field_3 float, "
                             "field_4 float)",