Re: [sqlite] Write performance question for 3.7.15
Ah, interesting. However, yes, we need production-ready. Good luck with
sqlite4 tho.

On Tue, Jan 1, 2013 at 11:43 AM, Richard Hipp wrote:

> On Tue, Jan 1, 2013 at 12:33 PM, Dan Frankowski wrote:
>
> > We are comparing to leveldb, which seems to have much better write
> > performance even in a limited-memory situation. Of course it offers much
> > less than sqlite. It is a partially-ordered key/value store, rather than
> > a relational database.
>
> The default LSM storage layer for SQLite4 gives much better performance
> than LevelDB on average. Note that most LevelDB inserts are a little
> faster than LSM; however, every now and then LevelDB encounters a really,
> really slow insert. SQLite4 LSM avoids these spikes and hence is able to
> perform significantly faster in the long run. SQLite4 LSM also gives you
> concurrent access and transactions - capabilities that are missing from
> LevelDB.
>
> SQLite4 gives you the same high-level schema and querying capabilities as
> SQLite3, with enhancements.
>
> OTOH, SQLite4 is not anything close to being production ready at this
> time.
>
> --
> D. Richard Hipp
> d...@sqlite.org
Re: [sqlite] Write performance question for 3.7.15
On Tue, Jan 1, 2013 at 12:33 PM, Dan Frankowski wrote:

> We are comparing to leveldb, which seems to have much better write
> performance even in a limited-memory situation. Of course it offers much
> less than sqlite. It is a partially-ordered key/value store, rather than
> a relational database.

The default LSM storage layer for SQLite4 gives much better performance
than LevelDB on average. Note that most LevelDB inserts are a little
faster than LSM; however, every now and then LevelDB encounters a really,
really slow insert. SQLite4 LSM avoids these spikes and hence is able to
perform significantly faster in the long run. SQLite4 LSM also gives you
concurrent access and transactions - capabilities that are missing from
LevelDB.

SQLite4 gives you the same high-level schema and querying capabilities as
SQLite3, with enhancements.

OTOH, SQLite4 is not anything close to being production ready at this time.

--
D. Richard Hipp
d...@sqlite.org
Re: [sqlite] Write performance question for 3.7.15
I appreciate everyone's thoughts about this.

Knowing that larger batch sizes help is interesting. Unfortunately, we
don't always control the batch size. We're using 1000 as an optimistic
estimate, but we receive things and may just have to commit after a while.

Knowing that more OS file cache or a faster disk helps is also
interesting. Unfortunately, it is non-trivial to switch to SSDs. We will
have a whole fleet of machines, each storing several hundred terabytes.
The sqlite databases are meta-data about that. We might be able to use one
SSD just for the meta-data. We haven't explored that yet. We also can't
use lots of OS disk cache, as it will probably be taken up by writing
things other than this meta-data.

Still, all of your observations are useful. We are comparing to leveldb,
which seems to have much better write performance even in a limited-memory
situation. Of course it offers much less than sqlite. It is a
partially-ordered key/value store, rather than a relational database.

Michael Black writes:

> Referencing the C program I sent earlier, I've found a COMMIT every 1M
> records does best. I had an extra zero on my 100,000, which gives the EKG
> appearance. I averaged 25,000 inserts/sec over 50M records with no big
> knees in the performance (there is a noticeable knee on the commit,
> though, around 12M records). But the average performance curve is pretty
> smooth. Less than that and you're flushing out the index too often, which
> causes an awful lot of disk thrashing, it would seem. During the 1M
> commit the CPU drops to a couple % and the disk I/O is pretty
> constant...albeit slow.
>
> P.S. I'm using 3.7.15.1
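For what it's worth, a minimal sketch in C of that "commit after N rows or
after T seconds, whichever comes first" pattern. The thresholds and helper
names here are hypothetical, not taken from anyone's actual code:

#include <time.h>
#include "sqlite3.h"

#define MAX_BATCH   1000  /* the optimistic batch size mentioned above */
#define MAX_AGE_SEC 5     /* hypothetical upper bound on batch age */

static int    pending = 0;      /* rows inserted since the last COMMIT */
static time_t batch_start = 0;  /* when the current batch began */

/* Call once after every insert; commits when the batch is big or old. */
static void after_insert(sqlite3 *db) {
  if (pending == 0) batch_start = time(NULL);
  pending++;
  if (pending >= MAX_BATCH || time(NULL) - batch_start >= MAX_AGE_SEC) {
    sqlite3_exec(db, "COMMIT", NULL, NULL, NULL);
    sqlite3_exec(db, "BEGIN",  NULL, NULL, NULL);
    pending = 0;
  }
}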
Re: [sqlite] Write performance question for 3.7.15
On 29 Dec 2012, at 9:45pm, Michael Black wrote:

> During the 1M commit the CPU drops to a couple % and the disk I/O is
> pretty constant...albeit slow

For the last few years, since multi-core processors have become common in
computers, SQLite performance has usually been limited by the performance
of storage. Several times I've recommended to some users that, rather than
pouring their money and development effort into unintuitive programming
(e.g. splitting one TABLE into smaller ones), they just upgrade from
spinning disks to SSD.

Simon.
Re: [sqlite] Write performance question for 3.7.15
Referencing the C program I sent earlier, I've found a COMMIT every 1M
records does best. I had an extra zero on my 100,000, which gives the EKG
appearance.

I averaged 25,000 inserts/sec over 50M records with no big knees in the
performance (there is a noticeable knee on the commit, though, around 12M
records). But the average performance curve is pretty smooth. Less than
that and you're flushing out the index too often, which causes an awful
lot of disk thrashing, it would seem.

During the 1M commit the CPU drops to a couple % and the disk I/O is
pretty constant...albeit slow.

P.S. I'm using 3.7.15.1
Re: [sqlite] Write performance question for 3.7.15
I wrote a C program doing your thing (with random data, so each key is
unique). I see some small knees at 20M and 23M -- but nothing like what
you're seeing, as long as I don't do the COMMIT. Seems the COMMIT is
what's causing the sudden slowdown.

When doing the COMMIT I see your dramatic slowdown (an order of magnitude)
at around 5M records...regardless of cache size, so cache size isn't the
problem. I'm guessing the COMMIT is paging out the index, which starts
thrashing the disk. Increasing the COMMIT to every 100,000 seems to help a
lot. The plot looks almost like an EKG then, with regular slowdowns.

And...when not doing the commit, is it normal for memory usage to increase
the way the WAL file does?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include "sqlite3.h"

time_t base_seconds;
suseconds_t base_useconds;

void tic() {
  struct timeval tv;
  gettimeofday(&tv,NULL);
  base_seconds=tv.tv_sec;
  base_useconds=tv.tv_usec;
}

// returns time in seconds since tic() was called
double toc() {
  struct timeval tv;
  gettimeofday(&tv,NULL);
  double mark=(tv.tv_sec-base_seconds)+(tv.tv_usec-base_useconds)/1.0e6;
  return mark;
}

void checkrc(sqlite3 *db,int rc,int checkrc,int flag,char *msg,char *str)
{
  if (rc != checkrc) {
    fprintf(stderr,msg,str);
    fprintf(stderr,"%s\n",sqlite3_errmsg(db));
    if (flag) { // then fatal
      exit(1);
    }
  }
}

int main(int argc, char *argv[])
{
  int rc;
  long i;
  char *sql,*errmsg=NULL;
  char *databaseName="data.db";
  sqlite3 *db;
  sqlite3_stmt *stmt1,*stmt2;

  remove(databaseName);
  rc = sqlite3_open_v2(databaseName,&db,
                       SQLITE_OPEN_READWRITE|SQLITE_OPEN_CREATE,NULL);
  checkrc(db,SQLITE_OK,rc,1,"Error opening database '%s': ",databaseName);

  sql = "create table if not exists t_foo (key binary(16) primary key, "
        "value binary(16))";
  rc=sqlite3_prepare_v2(db,sql,-1,&stmt1,NULL);
  checkrc(db,SQLITE_OK,rc,1,"Error preparing statement '%s': ",sql);
  rc=sqlite3_step(stmt1);
  checkrc(db,SQLITE_DONE,rc,1,"Error executing statement '%s': ",sql);
  rc=sqlite3_finalize(stmt1);
  checkrc(db,SQLITE_OK,rc,1,"Error finalizing statement '%s': ",sql);

  rc=sqlite3_exec(db,"PRAGMA journal_mode=WAL",NULL,NULL,&errmsg);
  checkrc(db,SQLITE_OK,rc,1,"Error on WAL mode statement '%s': ",sql);
  rc=sqlite3_exec(db,"PRAGMA synchronous=OFF",NULL,NULL,&errmsg);
  checkrc(db,SQLITE_OK,rc,1,"Error on synchronous mode statement '%s': ",sql);
  rc=sqlite3_exec(db,"PRAGMA cache_size=100000",NULL,NULL,&errmsg);
  checkrc(db,SQLITE_OK,rc,1,"Error on cache size statement '%s': ",sql);

  sql="BEGIN";
  rc=sqlite3_exec(db,sql,NULL,NULL,&errmsg);
  checkrc(db,SQLITE_OK,rc,1,"Error executing statement '%s': ",sql);

  sql = "insert or replace into t_foo(key,value) values(?,?)";
  rc=sqlite3_prepare_v2(db,sql,-1,&stmt2,NULL);
  checkrc(db,SQLITE_OK,rc,1,"Error preparing statement '%s': ",sql);

  tic();
  for(i=0; i<50000000; ++i) {   // 50M records
    char key[16],value[16];
    long number = random();
    if (i>0 && (i % 100000) == 0) { // report inserts/sec every 100K rows
      printf("%ld,%g\n",i,100000/toc());
      tic();
    }
#if 0 // COMMIT?
    if (i>0 && (i % 1000)==0) { // try 100,000
      sql="COMMIT";
      rc=sqlite3_exec(db,sql,NULL,NULL,&errmsg);
      checkrc(db,SQLITE_OK,rc,1,"Error executing statement '%s': ",errmsg);
      sql="BEGIN";
      rc=sqlite3_exec(db,sql,NULL,NULL,&errmsg);
      checkrc(db,SQLITE_OK,rc,1,"Error executing statement '%s': ",errmsg);
    }
#endif
    memcpy(key,&number,8);   // random half of the key...
    memcpy(&key[8],&i,8);    // ...plus the counter half makes it unique
    memcpy(value,&number,8);
    rc=sqlite3_bind_blob(stmt2,1,key,16,SQLITE_STATIC);
    checkrc(db,SQLITE_OK,rc,1,"Error bind1 statement '%s': ",sql);
    rc=sqlite3_bind_blob(stmt2,2,value,16,SQLITE_STATIC);
    checkrc(db,SQLITE_OK,rc,1,"Error bind2 statement '%s': ",sql);
    rc=sqlite3_step(stmt2);
    checkrc(db,SQLITE_DONE,rc,1,"Error stepping statement '%s': ",sql);
    rc=sqlite3_reset(stmt2);
    checkrc(db,SQLITE_OK,rc,1,"Error resetting statement '%s': ",sql);
  }
  return 0;
}

-----Original Message-----
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Simon Slavin
Sent: Saturday, December 29, 2012 8:19 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Write performance question for 3.7.15

On 29 Dec 2012, at 12:37pm, Stephen Chrzanowski <pontia...@gmail.com> wrote:

> My guess would be the OS slowing things down with write caching. The
> system will hold so much data in memory as a cache to write to the disk,
> and when the cache gets full, the OS slows down and waits on the HDD. Try
> doing a [dd] to a few gig worth of random data and see if you get the
> same kind of slow down.
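In case anyone wants to reproduce the numbers: the program above links
against a system libsqlite3, so something like the following should build
and run it (the file name here is arbitrary):

gcc -O2 -o insert_bench insert_bench.c -lsqlite3
./insert_bench > timings.csv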
Re: [sqlite] Write performance question for 3.7.15
On Fri, Dec 28, 2012 at 03:35:17PM -0600, Dan Frankowski wrote:

> 3. Would horizontal partitioning (i.e. creating multiple tables, each for
> a different key range) help?

This would seriously impair read performance (you'd have to access two
indices instead of one).

Valentin Davydov.
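To illustrate the point: once the table is split by key range, any lookup
that cannot be routed to a single partition has to probe every partition's
index. A hypothetical sketch in C (the partition names are invented for
the example):

/* Two range partitions of the benchmark's t_foo table. */
const char *lookup_sql =
  "SELECT value FROM t_foo_low  WHERE key = ?1 "   /* first key range  */
  "UNION ALL "
  "SELECT value FROM t_foo_high WHERE key = ?1";   /* second key range */
/* SQLite must search both primary-key indices for every such read. */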
Re: [sqlite] Write performance question for 3.7.15
On 29 Dec 2012, at 12:37pm, Stephen Chrzanowski wrote:

> My guess would be the OS slowing things down with write caching. The
> system will hold so much data in memory as a cache to write to the disk,
> and when the cache gets full, the OS slows down and waits on the HDD. Try
> doing a [dd] to a few gig worth of random data and see if you get the
> same kind of slow down.

Makes sense. That would reveal how much memory the operating system is
using for caching. Once you hit 30M rows you exceed the amount of memory
the system is using for caching, and it has to start reading or writing
the disk for every operation, which is far slower. Or it's the amount of
memory that the operating system is allowing the benchmarking process to
use. Or some other OS limitation.

But the underlying information in our responses is that it's not a
decision built into SQLite. There's nothing in SQLite which says "use a
fast strategy for up to 25M rows and then a slower one from then on".

A good way to track it down would be to close the database at the point
where performance starts to tank, and look at how big the file is. That
size should give a clue about which resource the OS is limiting to that
size.

Another way might be to add an extra unindexed column to the test database
and fill it with a fixed text string in each row. If this changes the
number of rows before the cliff edge, then the slowdown depends on total
filesize. If it doesn't, then it depends on the size of the index being
searched for each INSERT.

Simon.
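A sketch of that second experiment, as a change to Michael's benchmark
program (the column name and padding length here are arbitrary):

/* Pad each row with a fixed, unindexed text column so the file grows
 * faster per row while the index stays the same size per row. */
const char *create_sql =
  "create table if not exists t_foo ("
  " key binary(16) primary key,"
  " value binary(16),"
  " pad text)";
const char *insert_sql =
  "insert or replace into t_foo(key,value,pad) "
  "values(?,?,'0123456789abcdef0123456789abcdef')";
/* If the cliff now arrives at fewer rows, the limit tracks total file
 * size; if it arrives at the same row count, it tracks index size. */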
Re: [sqlite] Write performance question for 3.7.15
On Fri, Dec 28, 2012 at 3:34 PM, Dan Frankowski wrote:

> I am running a benchmark of inserting 100 million (100M) items into a
> table. I am seeing performance I don't understand. Graph:
> http://imgur.com/hH1Jr. Can anyone explain:
>
> 1. Why does write speed (writes/second) slow down dramatically around 28M
> items?
> 2. Are there parameters (perhaps related to table size) that would change
> this write performance?
> 3. Would horizontal partitioning (i.e. creating multiple tables, each for
> a different key range) help?
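For readers skimming the thread: on question 2, the knobs actually
exercised in Michael Black's test program earlier in this thread are the
journal mode, the synchronous level, and the page-cache size. Given an
open sqlite3 *db, they are set like this (values mirror his program, not
recommendations):

sqlite3_exec(db, "PRAGMA journal_mode=WAL",  NULL, NULL, NULL);
sqlite3_exec(db, "PRAGMA synchronous=OFF",   NULL, NULL, NULL); /* fast, but unsafe on power loss */
sqlite3_exec(db, "PRAGMA cache_size=100000", NULL, NULL, NULL); /* measured in pages */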