Hi there,

I've added support for SQLite 3 in my PyKota print accounting software.

This software supports PostgreSQL, MySQL and SQLite as its database
backend, and so it's easy to do some benchmarking, since in all cases
the very same SQL queries are executed (I know this **may not** be
the optimal solution).

I've got a test script which basically creates hundreds of records 
in Table1, thousands of records in Table2, and fills a third
table, let's name it Table3, with sizeTable1*sizeTable2 records.

Each record in Table3 references a record in Table1 and a record
in Table2.

SQLite is way faster than the two other databases at least when the
number of records is reasonable, but when the database reaches
around 300 Mb which is something like 4000000 records in Table3,
it slows down dramatically (in fact the slowdown is probably from
the beginning, but becomes noticeable around 250 / 300 Mb), 
and disk I/O becomes the bottleneck since the top command
gives me "wa: 90%" (if I understand correctly).

All the test program does in the mainloop is a bunch of INSERT 
queries all on the same table, but all INSERTS (millions of them) 
are done within a single SQL transaction. 

The program which creates the database is written in Python, uses 
pysqlite, uses around 20 Mb of RAM, the machine is a P-IV 3.2 Ghz 
with 512 Mb of RAM, swap area is not used, operating system is 
Kubuntu with the default sqlite package (machine is down at the 
moment, sorry for missing version number), and the disk is a
single not very fast IDE one (with 32 bits, multisector, and dma
activated through hdparm)

Is there a particular reason why the program slows down but is
very fast when starting (it displays percent done so it's easy
to spot) ? I haven't noticed such a slowdown with PostgreSQL or
MySQL.

Is there any optimization hint I could use with SQLite ?

NB : I'm perfectly happy with SQLite, the test program in any case 
will create much larger datasets than in real life so the slow down 
is not really important, but if something could be done I'd try and 
see if results are better. 

Thanks in advance for any pointer !

bye

Jerome Alet

Reply via email to