Hi all,

I am trying to put in place a simple replication process to copy a database from one machine to another. The table I have is something like:

CREATE TABLE sn2uid(sn VARCHAR(100) NOT NULL, uid INTEGER NOT NULL, PRIMARY KEY (sn));
CREATE INDEX uidindex ON sn2uid (uid);
The (sn, uid) pairs are pretty much random. On my test data I have around 3 million entries, the size on disk being about 280 MB. If I do a 'select * from sn2uid' > db, I get around 100 MB worth of data.

I was thinking of simply streaming the result of that query over TCP (HTTP, really) and doing the inserts on the other side. The problem I have is that, while doing this select takes about 10 seconds on my machine, I didn't find any way to do the inserts quickly on the other machine. After a while, the db file size grows very, very slowly, even when using transactions.

My question is: is there a way to do a select or a .dump so that inserting the data on the other end will be faster? Or maybe there are some pragmas I can use that would improve performance?

To me, it seems that the reason things are slow is that even though I do the inserts in a transaction, the btrees are modified independently, and in that case randomly. If I were getting the data in the right order in terms of the btree, I think things could be significantly faster.

What I tried was simply something like:

sqlite3 myorg.db '.dump sn2uid' > ~/sn2uid.txt
sqlite3 mynew.db < ~/sn2uid.txt

Would grouping inserts together in batches of 10000 or so make things faster than one gigantic transaction? I am wondering in particular whether the btree insert code is smart enough to build the tree and merge it into the main db file faster in that case?

Thanks!

Nicolas
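P.S. To make the "inserts in btree order" idea concrete, here is roughly what I have in mind (just a sketch, not something I have measured yet; the pragma settings, the cache size, and the file name are guesses/placeholders on my part):

# Build one import file: pragmas, schema, then the rows sorted by the
# primary key, all inside a single transaction.
{
  echo "PRAGMA synchronous = OFF;"    # guess: skip the per-commit fsync while bulk loading
  echo "PRAGMA cache_size = 100000;"  # guess: bigger page cache for the btree updates
  echo "CREATE TABLE sn2uid(sn VARCHAR(100) NOT NULL, uid INTEGER NOT NULL, PRIMARY KEY (sn));"
  echo "CREATE INDEX uidindex ON sn2uid (uid);"
  echo "BEGIN;"
  sqlite3 myorg.db "SELECT 'INSERT INTO sn2uid VALUES(' || quote(sn) || ',' || uid || ');' FROM sn2uid ORDER BY sn"
  echo "COMMIT;"
} > ~/sn2uid_sorted.sql

sqlite3 mynew.db < ~/sn2uid_sorted.sql

For the batched variant I would just emit a "COMMIT; BEGIN;" pair every 10000 rows or so instead of the single BEGIN/COMMIT around everything.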