Hi all

I am trying to put in place a simple replication process to copy a database 
from one machine to another.
The table I have is something like
CREATE TABLE sn2uid(sn VARCHAR(100) NOT NULL, uid INTEGER NOT NULL, PRIMARY KEY 
(sn));
CREATE INDEX uidindex ON sn2uid ( uid );

Where the (sn,uid) pairs are pretty much random.

On my test data, I have around 3 million entries, the size on disk being about 
280 MB.
If I do a 'select * from sn2uid' > db, I get around 100 MB worth of data. I was 
thinking of simply streaming the result of that query over TCP (HTTP really) 
and doing the inserts on the other side...
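
Roughly, what I have in mind is something like this (the host name, port and 
URL are just placeholders, not something I have actually set up):

sqlite3 myorg.db 'SELECT * FROM sn2uid' | curl -T - http://otherhost:8080/sn2uid

with the receiving side turning each sn|uid line back into an INSERT and 
running it against the new db.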

The problem I have is that, while doing this select takes about 10 seconds on 
my machine, I haven't found any way to insert the data quickly on the other 
machine.

After a while, the db file size grows very, very slowly, even when using 
transactions.

My question is:

Is there a way to do a select or a .dump so that when inserting the data on the 
other end, things will be faster? Or maybe there are some pragmas I can use 
that would improve performance?
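
For instance, I was wondering whether something along these lines (just 
guesses on my part, I haven't measured any of it) would help around the 
inserts:

PRAGMA synchronous = OFF;
PRAGMA cache_size = 100000;
BEGIN;
-- all the INSERT statements here
COMMIT;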

To me, it seems that the reason things are slow is that even though I do the 
inserts in a transaction, the btrees are modified independently, and in this 
case in random order. If I were getting the data in the right order in terms 
of the btree, I think things could be significantly faster...
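
For example, what I mean is dumping in primary key order and only building the 
secondary index afterwards, something like this (assuming ORDER BY sn matches 
the order of the primary key btree):

-- on the source side
SELECT sn, uid FROM sn2uid ORDER BY sn;
-- on the destination side, only after all the inserts are done
CREATE INDEX uidindex ON sn2uid ( uid );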

What I tried was simply something like:
sqlite3 myorg.db '.dump sn2uid' > ~/sn2uid.txt
sqlite3 mynew.db < ~/sn2uid.txt

Would grouping the inserts into batches of 10000 or so make things faster than 
one gigantic transaction? I am wondering in particular whether the btree 
insert code is smart enough to build the tree and merge it into the main db 
file faster in that case.

Thanks!

Nicolas
