Re: [sqlite] beat 120,000 inserts/sec

2005-04-11 Thread Thomas Steffen
On Apr 11, 2005 4:06 PM, Christian Smith <[EMAIL PROTECTED]> wrote:

> The test given is clearly CPU bound. All the big numbers are from people
> with big CPUs, with equally big RAM performance as well, probably.

I have done a few database tests recently, and I often found them to be
CPU bound, at least as long as the data fits into memory.

If you are looking for a fast system, I think there is nothing
affordable that currently beats an AMD64 (Athlon64 or Opteron). The
built-in memory controller leads to very low memory latency, which is
quite important for databases, and the 64-bit CPU gives you lots of raw
power.

Thomas


Re: [sqlite] beat 120,000 inserts/sec

2005-04-11 Thread Christian Smith
On Sat, 9 Apr 2005, Al Danial wrote:

>On Apr 9, 2005 12:43 AM, Andy Lutomirski <[EMAIL PROTECTED]> wrote:
>> Al Danial wrote:
>> > The attached C program measures insert performance for populating
>> > a table with an integer and three random floating point values with
>> > user defined transaction size.  Usage is:
>> >
>> > ./sqlite_insert <N> <xact_size>
>>
>> All of these are on Gentoo, Athlon 64 3200+, running 64 bit.
>>
>> Writing to /tmp, which is ext3 (acl,user_xattr,usrquota) over RAID0 (two
>> slave drives on different channels):
>>
>> $ ./sqlitetest 100000 50000
>>   100000 inserts to /tmp/a.db in 0.531 s = 188446.34 inserts/s
>> $ ./sqlitetest 2000000 50000
>>   2000000 inserts to /tmp/a.db in 11.546 s = 173223.61 inserts/s
>
>That's impressive.  Clearly a well-implemented RAID0 configuration
>and fast disks make a huge difference.  This will be the direction I'll
>take.


My machine, 2x Athlon XP 1700+, 512MB RAM, U160 SCSI (3x10K disks):
[EMAIL PROTECTED] tmp]$ ./sqlite_insert 100000 50000
 100000 inserts to ./a.db in 2.495 s = 40081.96 inserts/s
[EMAIL PROTECTED] tmp]$ ./sqlite_insert 2000000 50000
 2000000 inserts to ./a.db in 47.068 s = 42491.71 inserts/s

While I have a big fat SCSI IO system, vmstat during the run shows only
about 2MB/s ever going to the HD, well within the capabilities of
any SCSI or IDE HD (even a laptop drive!)
[EMAIL PROTECTED] csmith]$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 3  0  93296 116156  30424 22960400 914   6231 96  3  1  0
 3  0  93296 114308  30484 23154800 0  2167  333  1194 99  1  0  0
 7  0  93296 112488  30512 23348800 0  2064  320  1092 99  1  0  0
 4  0  93296 110716  30540 23543200 0  2072  314  1060 99  1  0  0
 4  0  93296 108844  30568 23750400   128  2068  319  1069 99  1  0  0
 4  0  93296 107380  30596 23944400 0  2076  383  1252 98  2  0  0
 2  2  93296 103352  30640 24139200 0  2101  321   96  4  0  0



>
>Thanks to everyone who posted performance numbers and machine
>setup info.  Some results were counterintuitive (I'd have guessed
>SCSI drives would come out on top) but many variables are at work
>so I won't try to draw too many conclusions.


The test given is clearly CPU bound. All the big numbers are from people
with big CPUs, with equally big RAM performance as well, probably.


>Also thanks to the suggestions to study the pragmas.  I did try
>SYNCHRONOUS=off but that didn't seem to have an effect; I'll
>study the docs to make sure I've got it right. -- Al


For CPU bound work, it won't. You appear to be maxing out what the CPU and
memory subsystem can handle. Tweaking IO will not help. Using ":memory:"
gave about a 10% speedup, so that looks like the maximum amount of speedup
you could possibly hope for.
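The ":memory:" ceiling is easy to reproduce with a small driver. This is a rough sketch, not the thread's actual benchmark: it is written in Python (whose standard sqlite3 module wraps the same C library) rather than C, and mimics the batched-insert loop and schema of the attached sqlite_insert.c.

```python
import sqlite3
import time

def bench(dbpath, n=100_000, xact=20_000):
    """Insert n rows in transactions of xact rows; return (rows, inserts/sec)."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT below control transaction boundaries, as the
    # C test program does.
    db = sqlite3.connect(dbpath, isolation_level=None)
    db.execute("CREATE TABLE table_name (field_1 INTEGER PRIMARY KEY,"
               " field_2 FLOAT, field_3 FLOAT, field_4 FLOAT)")
    t0 = time.perf_counter()
    for start in range(0, n, xact):
        db.execute("BEGIN TRANSACTION")
        db.executemany(
            "INSERT INTO table_name VALUES (?, ?, ?, ?)",
            ((i, 0.1, 0.2, 0.3) for i in range(start, min(start + xact, n))))
        db.execute("COMMIT TRANSACTION")
    dt = time.perf_counter() - t0
    rows = db.execute("SELECT COUNT(*) FROM table_name").fetchone()[0]
    db.close()
    return rows, n / dt

rows, rate = bench(":memory:")  # no disk I/O at all: an upper bound for any file-backed run
```

Comparing the ":memory:" rate against a file path on the same box gives the same ~10% gap described above when the workload is CPU bound.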


Christian


-- 
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \


Re: [sqlite] beat 120,000 inserts/sec

2005-04-09 Thread bbum
On Apr 9, 2005, at 8:49 AM, Al Danial wrote:
I did try
SYNCHRONOUS=off but that didn't seem to have an effect; I'll
study the docs to make sure I've got it right.
This isn't surprising.
fsync() is largely a no-op on just about any operating system.  It  
doesn't actually guarantee that the bytes are written to the platter  
(though it will likely impact queueing and the like within the kernel).



Re: [sqlite] beat 120,000 inserts/sec

2005-04-09 Thread Andy Lutomirski

Andrew Piskorski wrote:
On Sat, Apr 09, 2005 at 11:49:17AM -0400, Al Danial wrote:

Thanks to everyone who posted performance numbers and machine
setup info.  Some results were counterintuitive (I'd have guessed
SCSI drives would come out on top) but many variables are at work

It is basically impossible that a 10k or 15k rpm SCSI disk is
genuinely substantially slower for writes than a 7200 rpm IDE disk.
Most likely, the SCSI disks had write-through cache turned off, and
the IDE disks had it turned on.
This load is probably mostly sequential writes.  The IDE drives I used 
were 200GB and have huge areal density, so their sequential write speed 
is probably a lot higher.  SCSI'll win if there are seeks involved (or a 
drive like a WD Raptor).

--Andy


Re: [sqlite] beat 120,000 inserts/sec

2005-04-09 Thread Andrew Piskorski
On Sat, Apr 09, 2005 at 11:49:17AM -0400, Al Danial wrote:

> Thanks to everyone who posted performance numbers and machine
> setup info.  Some results were counterintuitive (I'd have guessed
> SCSI drives would come out on top) but many variables are at work

It is basically impossible that a 10k or 15k rpm SCSI disk is
genuinely substantially slower for writes than a 7200 rpm IDE disk.
Most likely, the SCSI disks had write-through cache turned off, and
the IDE disks had it turned on.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/


Re: [sqlite] beat 120,000 inserts/sec

2005-04-09 Thread Al Danial
On Apr 9, 2005 12:43 AM, Andy Lutomirski <[EMAIL PROTECTED]> wrote:
> Al Danial wrote:
> > The attached C program measures insert performance for populating
> > a table with an integer and three random floating point values with
> > user defined transaction size.  Usage is:
> >
> > ./sqlite_insert <N> <xact_size>
> 
> All of these are on Gentoo, Athlon 64 3200+, running 64 bit.
> 
> Writing to /tmp, which is ext3 (acl,user_xattr,usrquota) over RAID0 (two
> slave drives on different channels):
> 
> $ ./sqlitetest 100000 50000
>   100000 inserts to /tmp/a.db in 0.531 s = 188446.34 inserts/s
> $ ./sqlitetest 2000000 50000
>   2000000 inserts to /tmp/a.db in 11.546 s = 173223.61 inserts/s

That's impressive.  Clearly a well-implemented RAID0 configuration 
and fast disks make a huge difference.  This will be the direction I'll
take.

Thanks to everyone who posted performance numbers and machine
setup info.  Some results were counterintuitive (I'd have guessed
SCSI drives would come out on top) but many variables are at work
so I won't try to draw too many conclusions.
Also thanks to the suggestions to study the pragmas.  I did try
SYNCHRONOUS=off but that didn't seem to have an effect; I'll 
study the docs to make sure I've got it right. -- Al
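One quick sanity check when a pragma "doesn't seem to have an effect" is to read it back: issued with no argument, PRAGMA synchronous returns the current setting (0 = OFF, 1 = NORMAL, 2 = FULL), and it must be issued on the same open connection that performs the inserts. A minimal check, sketched here in Python's stdlib sqlite3 wrapper rather than the C API:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Set the pragma on the connection that will do the work ...
db.execute("PRAGMA synchronous=OFF")
# ... then read it back: 0 means OFF actually took effect.
mode = db.execute("PRAGMA synchronous").fetchone()[0]
```

If the read-back still shows 2 (FULL), the pragma was issued on a different connection or before the database was opened.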


Re: [sqlite] beat 120,000 inserts/sec

2005-04-09 Thread Clay Dowling
Jeremy Hinegardner wrote:
 > ./sqlite_insert 100000 50000
 100000 inserts to /tmp/a.db in 0.671 s = 149057.52 inserts/s
./sqlite_insert 2000000 20000
 2000000 inserts to /tmp/a.db in 14.437 s = 138535.38 inserts/s
./sqlite_insert 2000000 50000
 2000000 inserts to /tmp/a.db in 15.322 s = 130530.52 inserts/s
Not substantially better.  My configuration:
AMD Athlon(tm) 64 Processor 3000+
1GB RAM
2 x Maxtor 200GB SATA150 7200RPM 8MB HD
I'm running OpenBSD 3.6 with a custom kernel compiled to support
OpenBSD's software raid (RAIDFrame).  I have the 2 drives arranged in a
RAID 1 configuration. 
As a confirmed OpenBSD devotee these numbers make me very happy.  I use 
it for the security, but I can see that there are good performance 
reasons as well.  I guess we know where the app should run if they need 
that kind of performance.

Clay Dowling
--
http://www.lazarusid.com/notes/
Lazarus Notes
Articles and Commentary on Web Development


Re: [sqlite] beat 120,000 inserts/sec

2005-04-09 Thread bbum
There are also pragmas to control page size and in-memory caching.
You will want to play with those as well.   If SQLite is in the
middle of a transaction and you load it up with commands, it will
create a journal file in /tmp/ to store pages that don't fit in the
in-memory page cache (or something like that).   As such, the pragmas
can hugely affect performance.   You will likely find that tuning the
engine for particular roles -- bulk loading, queries, random updates
-- is useful.
See the SQLite documentation for more information.
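As a concrete sketch of those pragmas (in Python's stdlib sqlite3 wrapper; the pragma names are real, the values are arbitrary examples, and the file name is made up): page_size only takes effect if set before the database file is initialized, and cache_size is counted in pages, not bytes.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "tuned.db")  # hypothetical path
db = sqlite3.connect(path)
# page_size must be set before the first table is created; after that it
# is fixed until a VACUUM rebuilds the file.
db.execute("PRAGMA page_size=8192")
# cache_size is measured in pages: 10000 pages x 8 KiB = ~80 MB of cache.
db.execute("PRAGMA cache_size=10000")
db.execute("CREATE TABLE t (a INTEGER)")
db.commit()
page = db.execute("PRAGMA page_size").fetchone()[0]  # reads back 8192
```

Tuning these per role (big cache for bulk loads, default for light queries) is exactly the kind of experiment the pragmas make cheap.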
b.bum




Re: [sqlite] beat 120,000 inserts/sec

2005-04-08 Thread Jeremy Hinegardner
On Fri, Apr 08, 2005 at 11:01:02PM -0400, Al Danial wrote:
[...]

> What kind of insert performance do you see on your machine?  If it
> is substantially better than 120 kinserts/s what kind of hardware
> do you have?  I'm especially interested in how much faster the
> code runs on systems with multiple disks in a RAID 0 configuration.
> Are there other tricks to speeding insert performance?  Is it even
> reasonable to ask for more? -- Al

./sqlite_insert 100000 50000
 100000 inserts to /tmp/a.db in 0.671 s = 149057.52 inserts/s

./sqlite_insert 2000000 20000
 2000000 inserts to /tmp/a.db in 14.437 s = 138535.38 inserts/s

./sqlite_insert 2000000 50000
 2000000 inserts to /tmp/a.db in 15.322 s = 130530.52 inserts/s

Not substantially better.  My configuration:

AMD Athlon(tm) 64 Processor 3000+
1GB RAM
2 x Maxtor 200GB SATA150 7200RPM 8MB HD

I'm running OpenBSD 3.6 with a custom kernel compiled to support
OpenBSD's software raid (RAIDFrame).  I have the 2 drives arranged in a
RAID 1 configuration. 

enjoy,

-jeremy


-- 

 Jeremy Hinegardner  [EMAIL PROTECTED] 



Re: [sqlite] beat 120,000 inserts/sec

2005-04-08 Thread Andy Lutomirski
Al Danial wrote:
The attached C program measures insert performance for populating
a table with an integer and three random floating point values with
user defined transaction size.  Usage is:
./sqlite_insert <N> <xact_size>
All of these are on Gentoo, Athlon 64 3200+, running 64 bit.
Writing to /tmp, which is ext3 (acl,user_xattr,usrquota) over RAID0 (two 
slave drives on different channels):

$ ./sqlitetest 100000 50000
 100000 inserts to /tmp/a.db in 0.531 s = 188446.34 inserts/s
$ ./sqlitetest 2000000 50000
 2000000 inserts to /tmp/a.db in 11.546 s = 173223.61 inserts/s
I'm actually surprised to see this kind of performance.  The drives are 
brand-new Maxtor 200GB parallel ATA.  I imagine I could do better with 
usrquota off.

reiserfs over RAID5 (hideous config with different types of drives, 3 
master, 2 slave):

$ ./sqlitetest 100000 50000
 100000 inserts to /home/foobar/a.db in 0.884 s = 113121.53 inserts/s
$ ./sqlitetest 2000000 50000
 2000000 inserts to /home/foobar/a.db in 20.025 s = 99875.94 inserts/s
xfs on the same array:
$ ./sqlitetest 100000 50000
 100000 inserts to /data/stage/a.db in 0.777 s = 128684.23 inserts/s
[EMAIL PROTECTED] tmp $ ./sqlitetest 2000000 50000
 2000000 inserts to /data/stage/a.db in 14.541 s = 137544.14 inserts/s
Last but not least, reiserfs noatime over RAID1 (two drives, masters on 
different channels):
$ ./sqlitetest 100000 50000
 100000 inserts to /usr/foobar/a.db in 0.700 s = 142824.48 inserts/s
$ ./sqlitetest 2000000 50000
 2000000 inserts to /usr/foobar/a.db in 15.376 s = 130071.89 inserts/s

For kicks, I tried tmpfs.  It crashed (segfault in sqlite3_bind_int64 -- 
go figure).  I'll submit a bug report if I can reproduce it sanely.

So it looks like ext3 over RAID0 is good.  FWIW, I can read 99.18 MB/sec 
off /dev/md3 (where /tmp lives).

--Andy



Re: [sqlite] beat 120,000 inserts/sec

2005-04-08 Thread Dan Nuffer
Al Danial wrote:
> What kind of insert performance do you see on your machine?  If it
> is substantially better than 120 kinserts/s what kind of hardware
> do you have?  I'm especially interested in how much faster the
> code runs on systems with multiple disks in a RAID 0 configuration.
> Are there other tricks to speeding insert performance?  Is it even
> reasonable to ask for more? -- Al
> 

sqlite 3.0.8
Fedora core 2, kernel 2.6.8-1.521
/proc/cpuinfo:
model name  : AMD Athlon(tm) 64 Processor 3200+
cpu MHz : 2043.192
cache size  : 1024 KB

On a Seagate 7200.7 200GB Parallel IDE drive with ext3:

$ ./sqlite_insert 100000 50000
 100000 inserts to /tmp/a.db in 0.841 s = 118872.29 inserts/s
$ ./sqlite_insert 100000 20000
 100000 inserts to /tmp/a.db in 0.848 s = 117858.23 inserts/s
$ ./sqlite_insert 2000000 50000
 2000000 inserts to /tmp/a.db in 18.446 s = 108424.30 inserts/s

Same box with RAID 0 configuration with 2 U160 SCSI Seagate 15000 RPM
18GB Cheetahs (2nd generation, purchased in 2001) with ext3:

$ ./sqlite_insert 100000 20000
 100000 inserts to ./a.db in 1.177 s = 84977.14 inserts/s
$ ./sqlite_insert 100000 20000
 100000 inserts to ./a.db in 1.146 s = 87230.96 inserts/s
$ ./sqlite_insert 2000000 50000
 2000000 inserts to ./a.db in 20.287 s = 98584.18 inserts/s

-- 
Dan Nuffer




Re: [sqlite] beat 120,000 inserts/sec

2005-04-08 Thread Eric Bohlman
Al Danial wrote:
A scientific application I work with has clumsy data retrieval
options.  I dumped the application's output--integer and floating
point numbers--into an SQLite database and soon after began to
enjoy the power of SQL to pull out interesting results.
The main complaint for making the transfer to SQLite a permanent
part of our solution is the time it takes to insert the numbers
into a database.  It takes about a minute to insert 24 million
numbers into three tables.  Most database people (including me)
would be thrilled to see this kind of insert performance but
my colleagues are asking if it can be sped up.
Try setting PRAGMA synchronous=OFF and seeing how much improvement you 
get.  If it's significant, you'll have to decide for yourself if you can 
live with the increased risk of corruption if there's a power failure 
during a batch load.
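To put a number on the trade-off, one can time the same single-transaction batch load under both settings. This is a sketch in Python's stdlib sqlite3 wrapper, not the thread's C benchmark; with one big transaction there is only one commit to sync, so the gap may be small for this workload, consistent with Al's observation.

```python
import os
import sqlite3
import tempfile
import time

def load(sync_mode, n=20_000):
    """Batch-insert n rows in one transaction under the given synchronous mode."""
    path = os.path.join(tempfile.mkdtemp(), "bench.db")
    db = sqlite3.connect(path, isolation_level=None)  # autocommit; we BEGIN/COMMIT
    db.execute("PRAGMA synchronous=%s" % sync_mode)
    db.execute("CREATE TABLE t (f1 INTEGER PRIMARY KEY, f2 FLOAT)")
    t0 = time.perf_counter()
    db.execute("BEGIN")
    db.executemany("INSERT INTO t VALUES (?, ?)", ((i, 0.5) for i in range(n)))
    db.execute("COMMIT")
    dt = time.perf_counter() - t0
    rows = db.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    db.close()
    return rows, dt

rows_off,  t_off  = load("OFF")
rows_full, t_full = load("FULL")
```

If t_off is not meaningfully smaller than t_full, the load is bound by CPU rather than by fsync, and turning synchronous off only buys risk.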


Re: [sqlite] beat 120,000 inserts/sec

2005-04-08 Thread Dan Kennedy
Random thought: You might squeeze some more performance out by 
trying a couple different filesystems. i.e. if you're using ext3, 
try some different journaling options, or try ext2.

--- Al Danial <[EMAIL PROTECTED]> wrote:

> A scientific application I work with has clumsy data retrieval
> options.  I dumped the application's output--integer and floating
> point numbers--into an SQLite database and soon after began to
> enjoy the power of SQL to pull out interesting results.
> 
> The main complaint for making the transfer to SQLite a permanent
> part of our solution is the time it takes to insert the numbers
> into a database.  It takes about a minute to insert 24 million
> numbers into three tables.  Most database people (including me)
> would be thrilled to see this kind of insert performance but
> my colleagues are asking if it can be sped up.
> 
> The attached C program measures insert performance for populating
> a table with an integer and three random floating point values with
> user defined transaction size.  Usage is:
> 
> ./sqlite_insert <N> <xact_size>
> 
> It writes to the hardcoded database file /tmp/a.db
> On my Dell Precision 360, Pentium4 3.0 GHz, 1 GB RAM, IDE disk
> drive, ext3, RHEL v3, Linux kernel 2.4.21, I peak out around 121,000
> inserts/second using a transaction size of 20,000:
> ./sqlite_insert 100000 20000
> 100000 inserts to /tmp/a.db in 0.829 s = 120626.53 inserts/s
> 
> Performance drops a bit when I increase the number of rows to
> two million (a typical size for my application):
> 
> ./sqlite_insert 2000000 50000
> 2000000 inserts to /tmp/a.db in 17.124 s = 116795.07 inserts/s
> 
> What kind of insert performance do you see on your machine?  If it
> is substantially better than 120 kinserts/s what kind of hardware
> do you have?  I'm especially interested in how much faster the
> code runs on systems with multiple disks in a RAID 0 configuration.
> Are there other tricks to speeding insert performance?  Is it even
> reasonable to ask for more? -- Al
> [...]

[sqlite] beat 120,000 inserts/sec

2005-04-08 Thread Al Danial
A scientific application I work with has clumsy data retrieval
options.  I dumped the application's output--integer and floating
point numbers--into an SQLite database and soon after began to
enjoy the power of SQL to pull out interesting results.

The main complaint for making the transfer to SQLite a permanent
part of our solution is the time it takes to insert the numbers
into a database.  It takes about a minute to insert 24 million
numbers into three tables.  Most database people (including me)
would be thrilled to see this kind of insert performance but
my colleagues are asking if it can be sped up.

The attached C program measures insert performance for populating
a table with an integer and three random floating point values with
user defined transaction size.  Usage is:

./sqlite_insert <N> <xact_size>

It writes to the hardcoded database file /tmp/a.db
On my Dell Precision 360, Pentium4 3.0 GHz, 1 GB RAM, IDE disk
drive, ext3, RHEL v3, Linux kernel 2.4.21, I peak out around 121,000
inserts/second using a transaction size of 20,000:
./sqlite_insert 100000 20000
100000 inserts to /tmp/a.db in 0.829 s = 120626.53 inserts/s

Performance drops a bit when I increase the number of rows to
two million (a typical size for my application):

./sqlite_insert 2000000 50000
2000000 inserts to /tmp/a.db in 17.124 s = 116795.07 inserts/s

What kind of insert performance do you see on your machine?  If it
is substantially better than 120 kinserts/s what kind of hardware
do you have?  I'm especially interested in how much faster the
code runs on systems with multiple disks in a RAID 0 configuration.
Are there other tricks to speeding insert performance?  Is it even
reasonable to ask for more? -- Al
/*
  [EMAIL PROTECTED]

  # sample build:
  gcc -o sqlite_insert sqlite_insert.c \
  -L/usr/local/sqlite-3.2.1/lib -lsqlite3 -I/usr/local/sqlite-3.2.1/include

  # sample run:
  ./sqlite_insert 100000 50000

*/

#include <stdio.h>
#include <stdlib.h>   /* RAND_MAX */
#include <sys/time.h> /* timeval, gettimeofday() */
#include <sqlite3.h>

int sql_begin(sqlite3 *db) { /* {{{1 */
char *errmsg;
if (sqlite3_exec(db, "BEGIN TRANSACTION",
 NULL, NULL, &errmsg) != SQLITE_OK) {
printf("couldn't begin transaction:  %s\n", errmsg);
return 0;
} else {
return 1;
}
} /* 1}}} */
int sql_commit(sqlite3 *db) { /* {{{1 */
char *errmsg;
if (sqlite3_exec(db, "COMMIT TRANSACTION",
 NULL, NULL, &errmsg) != SQLITE_OK) {
printf("couldn't commit transaction:  %s\n", errmsg);
return 0;
} else {
return 1;
}
} /* 1}}} */
float elapsed(struct timeval start, struct timeval end) { /* {{{1 */
return  (float) (end.tv_sec  - start.tv_sec ) +
        ((float) (end.tv_usec - start.tv_usec)/1000000);
} /* 1}}} */

int main(int argc, char *argv[]) {
const char *zLeftover;
#define CMD_SIZE 1000
char  rm_command[CMD_SIZE],
 *errmsg, *dbfile = "/tmp/a.db"; 
   /* *dbfile = ":memory:" is faster, but not of interest */
intrc, i, N, xact_size, n_this_xact = 0;
double x, y, z;
float  delta_T;
struct timeval start_time, end_time;
sqlite3  *db;
sqlite3_stmt *Stmt;

if (argc < 3) {
printf("\nUsage:   %s <N> <xact_size>\n\n", argv[0]);
printf("\tInsert <N> rows into a table of an SQLite database\n");
printf("\tusing transaction sizes of <xact_size>.\n");
printf("\tThe table has four columns of numeric data:\n");
printf("\t  field_1 integer\n");
printf("\t  field_2 float\n");
printf("\t  field_3 float\n");
printf("\t  field_4 float\n");
printf("\tThe integer field will have values 1..<N> while the\n");
printf("\tdouble precision values are random on [-50.0, 50.0]\n");
exit(0);
}
N = atoi(argv[1]);
xact_size = atoi(argv[2]);
snprintf(rm_command, CMD_SIZE-1, "rm -f %s", dbfile);
system(rm_command);  /* the database file must not exist before
calling sqlite3_open() and trying to insert */

gettimeofday(&start_time, 0);

rc = sqlite3_open(dbfile, &db);

sql_begin(db);
rc = sqlite3_prepare(db, "create table table_name(field_1 integer primary key,"
 "field_2 float,  "
 "field_3 float,  "
 "field_4 float)",