> Does anyone know why the performance degrades this way and what can be done
> to resolve the problem?

Do you have by any chance shared cache turned on? I believe when
shared cache is used SQLite searches through all open databases to
understand if the one you want to open is already opened and available
for cache reuse.

Also I believe there's some similar search happens through all open
file descriptors in unix VFS. So that if several file descriptors to
the same file are open there was no issues with dropping advisory
locks on one descriptor while second is closing. And data below seem
to prove that this is the problem for you.

> I don't see the times you're seeing using this program...
> time ./db 5000
> 1.602s
>
> time ./db 10000
> 5.357s
>
> time ./db 15000
> 11.280s
>
> time ./db 20000
> 19.570s
>
> time ./db 25000
> 28.166s

Michael,
So in your test first 5000 are opened in 1.6 seconds and last 5000 are
opened in 8.6 seconds. I'd say it's a noticeable increase that can't
be attributed to OS.

> Each sample counts as 0.01 seconds.
>  %   cumulative   self              self     total
>  time   seconds   seconds    calls  us/call  us/call  name
>  52.06      0.76     0.76     5000   152.00   152.00  findInodeInfo
>  43.84      1.40     0.64     5000   128.00   128.00  findReusableFd

That's exactly what I talked about above. Roughly speaking
findReusableFd searches through all open file descriptors to
understand if the process already have some for the file you want to
open. Or more precisely it looks for fd for the same file which is
still open when corresponding sqlite3* connection is already closed
(and yes if you open 5 connections to the same file and then close 4
of them SQLite will still keep 5 file descriptors open to avoid
problems with POSIX advisory locks).


Pavel


On Fri, Sep 2, 2011 at 10:08 AM, Black, Michael (IS)
<michael.bla...@ngc.com> wrote:
> I profiled the sqlite3 test program below and I get this for 25,000 files
>
>
>
> Flat profile:
>
> Each sample counts as 0.01 seconds.
>  %   cumulative   self              self     total
>  time   seconds   seconds    calls  ms/call  ms/call  name
>  61.15     17.61    17.61    25000     0.70     0.70  findInodeInfo
>  36.67     28.17    10.56    25000     0.42     0.42  findReusableFd
>
> For 5,000 files looks like this:
>
> Each sample counts as 0.01 seconds.
>  %   cumulative   self              self     total
>  time   seconds   seconds    calls  us/call  us/call  name
>  52.06      0.76     0.76     5000   152.00   152.00  findInodeInfo
>  43.84      1.40     0.64     5000   128.00   128.00  findReusableFd
>
>
>
> Michael D. Black
>
> Senior Scientist
>
> NG Information Systems
>
> Advanced Analytics Directorate
>
>
>
> ________________________________
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Black, Michael (IS) [michael.bla...@ngc.com]
> Sent: Friday, September 02, 2011 8:00 AM
> To: General Discussion of SQLite Database
> Subject: EXT :Re: [sqlite] sqlite3_open_v2 performance degrades as number of 
> opens increase
>
> I assume you've overridden the system default for 1024 files in ulimit for # 
> of open files?
>
> I don't see the times you're seeing using this program...though my databases 
> are empty which probably makes a difference.
> I do see the gradual increase in time...I think this is almost all due to the 
> OS when you try and open thousands of files in one process.
>
> I ran this once to create 25,000 databases
> ./db 25000
> ls -l file* | wc -l
> 25000
> The tested opening them
> time ./db 5000
> 1.602s
>
> time ./db 10000
> 5.357s
>
> time ./db 15000
> 11.280s
>
> time ./db 20000
> 19.570s
>
> time ./db 25000
> 28.166s
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include "sqlite3.h"
>
> int main(int argc,char *argv[])
> {
>  int i,rc;
>  sqlite3 *db;
>  for(i=0;i<atoi(argv[1]);++i) {
>    char name[4096];
>    sprintf(name,"file%d",i);
>    
> rc=sqlite3_open_v2(name,&db,SQLITE_OPEN_READWRITE|SQLITE_OPEN_CREATE|SQLITE_OPEN_NOMUTEX,NULL);
>    if (rc != SQLITE_OK) {
>        printf("%s",sqlite3_errmsg(db));
>        exit(1);
>    }
>    //sqlite3_close(db); // if you close the database it runs a LOT faster.  
> So I don't think it's sqlite3 causing it.
>  }
>  return 0;
> }
>
> If you run this you'll see similar behavior
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
>
> int main(int argc,char *argv[])
> {
>  int i;
>  int fd;
>  for(i=0;i<atoi(argv[1]);++i) {
>    char name[4096];
>    sprintf(name,"%dfile",i);
>    fd=open(name,O_RDWR|O_CREAT);
>    if (fd==-1) perror("open error:");
>    write(fd,name,strlen(name));
>    //close(fd); // again it runs a lot faster if you close the file 
> descriptor each time -- so it's the OS slowing you down.
>    fsync(fd);
>  }
>  return 0;
> }
>
>
>
>
> Michael D. Black
> Senior Scientist
> NG Information Systems
> Advanced Analytics Directorate
>
>
>
>
> From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
> behalf of Terry Cumaranatunge [cumar...@gmail.com]
> Sent: Friday, September 02, 2011 6:30 AM
> To: sqlite-users@sqlite.org
> Subject: EXT :[sqlite] sqlite3_open_v2 performance degrades as number of 
> opens increase
>
>
> Hello,
>
> We have an application that creates many small databases (over 100K) to be
> able to control the size of the database and provide more deterministic
> performance. At process startup, it opens many of these databases in a loop
> to keep them open for faster transaction response times. The behavior we are
> noticing is that the it takes progressively a longer time for each
> sqlite3_open_v2 to complete as the number of databases kept
> opened increases. These are some of the measurements:
>
> 5000 DBs = 3 secs
> 10000 DBs = 11 secs
> 15000 DBs = 35 secs
> 20000 DBs = 75 secs
>
> Many processes can concurrently open 5000 db's at the same time and it takes
> about the same time as a single process doing the work. So, it doesn't
> appear to be related to the OS related issue with the number of opens.
>
> The open is done as:
> sqlite3_open_v2(dbName, db, SQLITE_OPEN_READWRITE  | SQLITE_OPEN_NOMUTEX,
> NULL)
>
> We are using RHEL 6 with SQLite 3.7.7.1. Each database was created with
> journaling mode set to WAL.
> Does anyone know why the performance degrades this way and what can be done
> to resolve the problem?
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to