I wrote a C program doing your thing (with random data so each key is
unique)

I see some small knees at 20M and 23M -- but nothing like what you're seeing
as long as I don't do the COMMIT.
Seems the COMMIT is what's causing the sudden slowdown.
When doing the COMMIT I see your dramatic slowdown (an order of magnitude)
at around 5M records...regardless of cache size....so cache size isn't the
problem.
I'm guessing the COMMIT is paging out the index which starts thrashing the
disk.
Increasing the COMMIT to every 100,000 seems to help a lot.  The plot looks
almost like an EKG then with regular slowdowns.


And...when not doing the commit is it normal for memory usage to increase
like the WAL file does?


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include "sqlite3.h"
  
time_t base_seconds;
suseconds_t base_useconds;
  
void tic() {
  struct timeval tv;
  gettimeofday(&tv,NULL);
  base_seconds=tv.tv_sec;
  base_useconds=tv.tv_usec;
} 
  
// returns time in seconds since tic() was called
double toc() {
  struct timeval tv;
  gettimeofday(&tv,NULL);
  double mark=(tv.tv_sec-base_seconds)+(tv.tv_usec-base_useconds)/1.0e6;
  return mark;
}
    
void checkrc(sqlite3 *db,int rc,int checkrc,int flag,char *msg,char *str) {
  if (rc != checkrc) {
    fprintf(stderr,msg,str);
    fprintf(stderr,"%s\n",sqlite3_errmsg(db));
    if (flag) { // then fatal
      exit(1);
    }
  }
}   

int main(int argc, char *argv[]) {
  int rc;
  long i;
  char *sql,*errmsg=NULL;
  char *databaseName="data.db";
  sqlite3 *db;
  sqlite3_stmt *stmt1,*stmt2;
  remove(databaseName);
  rc =
sqlite3_open_v2(databaseName,&db,SQLITE_OPEN_READWRITE|SQLITE_OPEN_CREATE,NU
LL);
  checkrc(db,SQLITE_OK,rc,1,"Error opening database '%s': ",databaseName);
  sql = "create table if not exists t_foo (key binary(16) primary key, value
binary(16))";
  rc=sqlite3_prepare_v2(db,sql,-1,&stmt1,NULL);
  checkrc(db,SQLITE_OK,rc,1,"Error preparing statement '%s': ",sql);
  rc=sqlite3_step(stmt1);
  checkrc(db,SQLITE_DONE,rc,1,"Error executing statement '%s': ",sql);
  rc=sqlite3_step(stmt1);
  checkrc(db,SQLITE_DONE,rc,1,"Error executing statement '%s': ",sql);
  rc=sqlite3_finalize(stmt1);
  checkrc(db,SQLITE_OK,rc,1,"Error finalizing statement '%s': ",sql);
  rc=sqlite3_exec(db, "PRAGMA journal_mode=WAL",NULL,NULL,&errmsg);
  checkrc(db,SQLITE_OK,rc,1,"Error on WAL mode statement '%s': ",sql);
  rc=sqlite3_exec(db, "PRAGMA synchronous=OFF",NULL,NULL,&errmsg);
  checkrc(db,SQLITE_OK,rc,1,"Error on synchronous mode statement '%s':
",sql);
  rc=sqlite3_exec(db, "PRAGMA cache_size=10",NULL,NULL,&errmsg);
  checkrc(db,SQLITE_OK,rc,1,"Error on cache size statement '%s': ",sql);
  sql="BEGIN";
  rc=sqlite3_exec(db,sql,NULL,NULL,&errmsg);
  checkrc(db,SQLITE_OK,rc,1,"Error preparing statement '%s': ",sql);
  sql = "insert or replace into t_foo(key,value) values(?,?)";
  rc=sqlite3_prepare_v2(db,sql,-1,&stmt2,NULL);
  checkrc(db,SQLITE_OK,rc,1,"Error preparing statement '%s': ",sql);
  tic();
  for(i=0; i<50000000; ++i) {
    char key[16],value[16];
    long number = random();
    if (i>0 && (i % 100000) == 0) {
      printf("%ld,%g \n",i,100000/toc());
      tic();
    }
#if 0 // COMMIT?
    if  (i>0&&(i % 1000)==0) { // try 100,000 
      sql="COMMIT";
      rc=sqlite3_exec(db,sql,NULL,NULL,&errmsg);
      checkrc(db,SQLITE_OK,rc,1,"Error executing statement '%s': ",errmsg);
      sql="BEGIN";
      rc=sqlite3_exec(db,sql,NULL,NULL,&errmsg);
      checkrc(db,SQLITE_OK,rc,1,"Error executing statement '%s': ",errmsg);
    }
#endif
    memcpy(key,&number,8);
    memcpy(&key[8],&number,8);
    memcpy(value,&i,8);
    rc=sqlite3_bind_blob(stmt2,1,key,16,SQLITE_STATIC);
    checkrc(db,SQLITE_OK,rc,1,"Error bind1 statement '%s': ",sql);
    rc=sqlite3_bind_blob(stmt2,2,value,16,SQLITE_STATIC);
    checkrc(db,SQLITE_OK,rc,1,"Error bind2 statement '%s': ",sql);
    rc=sqlite3_step(stmt2);
    checkrc(db,SQLITE_DONE,rc,1,"Error finalizing statement '%s': ",sql);
    rc=sqlite3_reset(stmt2);
    checkrc(db,SQLITE_OK,rc,1,"Error resetting statement '%s': ",sql);
  }
  return 0;
}


-----Original Message-----
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Simon Slavin
Sent: Saturday, December 29, 2012 8:19 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Write performance question for 3.7.15


On 29 Dec 2012, at 12:37pm, Stephen Chrzanowski <pontia...@gmail.com> wrote:

> My guess would be the OS slowing things down with write caching.  The
> system will hold so much data in memory as a cache to write to the disk,
> and when the cache gets full, the OS slows down and waits on the HDD.  Try
> doing a [dd] to a few gig worth of random data and see if you get the same
> kind of slow down.

Makes sense.  A revealing of how much memory the operating system is using
for caching.  Once you hit 30M rows you exceed the amount of memory the
system is using for caching, and it has to start reading or writing disk for
every operation which is far slower.  Or it's the amount of memory that the
operating system is allowing the benchmarking process to use.  Or some other
OS limitation.

But the underlying information in our responses is that it's not a decision
built into SQLite.  There's nothing in SQLite which says we use a fast
strategy for up to 25M rows and then a slower one from then on.

A good way to track it down would be to close the database at the point
where performance starts to tank, and look at how big the filesize is.  That
size should give a clue about what resource the OS is limiting to that size.
Another might be to add an extra unindexed column to the test database and
fill it with a fixed text string in each row.  If this changes the number of
rows before the cliff edge then it's dependent on total filesize.  If it
doesn't, then it's dependent on the size of the index being searched for
each INSERT.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to