Mmmm?. Initial tests are not conclusive, it does look as if using 
rsync ?-inplace does speed things up but nowhere near as much as we 
anticipated.

Testing consisted of a 6GB test database which is a backup copy from a 
few months ago.

We timed copying the database over using cp

# time cp tfl.sqlite.backup t1

real    2m30.528s
user    0m0.052s
sys     0m10.403s

We then edited the database and deleted the contents of a table that 
would have changed over the lifetime of the database. We will freely 
admit we have no idea where this table is in the database file and have 
no intention of finding out. Thats SQLites problem :) The file had 
65,000 lines or so and would have been updated regularly at the start of 
the database and over the last few months would have had small daily 
updates but they would be getting fewer and fewer.

We then did

# time rsync -av --inplace --no-whole-file tfl.sqlite.backup test_db
sending incremental file list
tfl.sqlite.backup



sent 564,081,269 bytes  received 623,255 bytes  4,805,995.95 bytes/sec
total size is 6,067,933,184  speedup is 10.75

real    1m57.689s
user    2m4.947s
sys     0m8.952s

The ?no-while-file is apparently needed as well as ?inplace.

We checked the md5sums at the end

# md5sum tfl.sqlite.backup test_db t1
b5bd91cc9b49ee1f54a8a2d013005586  tfl.sqlite.backup
b5bd91cc9b49ee1f54a8a2d013005586  test_db
b5bd91cc9b49ee1f54a8a2d013005586  t1

So we can see that the file integrity has been preserved which is what 
we expect but its always good to check. Yes we know that md5sum is not 
perfect but for this purpose its fine. However the rsync vs cp time is 
not as big a difference as we expected. Its still taking 80% of the time 
of the cp.

Our first thoughts are that 80% is still better than 100% so thats good, 
our second thoughts are that we need to test this on a more 
representative set of changes to the database. Deleting the contents of 
a table that has been changed over the lifetime of the database may be 
the worst-case scenario as it might well touch many, many pages in the 
database. We are certainly not expert enough to comment on this 
assumption of how rows are distributed in the database and if anybody 
would like to chip in, please do so.

Our intention now is to take a more recent and representative database, 
run a days and a weeks set of database transactions through it which is 
easy enough, though takes time, and see how that compares.

Thanks for the suggestion,

Rob,

On 5 May 2016, at 16:42, J Decker wrote:

> Instead of cp, rsync might help it is able to send delta changes.
>
> On Wed, May 4, 2016 at 10:55 AM, Rob Willett
> <rob.sqlite at robertwillett.com> wrote:
>> Scott,
>>
>> OK, We can see how to do this (I think). Our app is written in Perl 
>> and we?d
>> just need to capture the command we write down. The only issue I can 
>> think
>> of is the prepare statement and making sure we capture the right SQL
>> command. W
>>
>> We?ll dig into it and have a look,
>>
>> Thanks for taking the time to reply.
>>
>> Rob
>>
>>
>> On 4 May 2016, at 18:52, Scott Robison wrote:
>>
>>> On Wed, May 4, 2016 at 11:47 AM, Rob Willett
>>> <rob.sqlite at robertwillett.com>
>>> wrote:
>>>
>>>> Scott,
>>>>
>>>> Thats an interesting idea. Is there an option in SQLite to do this 
>>>> for
>>>> us,
>>>> or do we have to write a small shim in our app?
>>>>
>>>> I like the idea of this as its simple and elegant.
>>>
>>>
>>>
>>> It would require a little extra work on your part. Nothing built 
>>> into the
>>> system that would accomplish this directly. However, I've done 
>>> similar
>>> things and they don't involve a ton of overhead. You could use 
>>> another
>>> SQLite database as the append only log, or a simple text file.
>>>
>>> I'm not aware of a free lunch solution, sadly.
>>>
>>>
>>>>
>>>>
>>>> Rob
>>>>
>>>>
>>>> On 4 May 2016, at 16:51, Scott Robison wrote:
>>>>
>>>> This is going to become a bigger problem for us as the database 
>>>> will
>>>>>>>
>>>>>>> only get bigger so any advice welcomed.
>>>>>>>
>>>>>>
>>>>> Perhaps, rather than backing up the live data, you create an 
>>>>> append only
>>>>> log of each and every query you send to the database. Should you 
>>>>> need to
>>>>> restore, you replay the log of statements. Or at the appointed 
>>>>> backup
>>>>> time,
>>>>> you replay the day's log of statements into another database. No 
>>>>> need to
>>>>> ever take the live database offline at the cost of slightly longer
>>>>> running
>>>>> commands during the day to handle the append operation.
>>>>> _______________________________________________
>>>>> sqlite-users mailing list
>>>>> sqlite-users at mailinglists.sqlite.org
>>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>>
>>>> _______________________________________________
>>>> sqlite-users mailing list
>>>> sqlite-users at mailinglists.sqlite.org
>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>>
>>>
>>>
>>> --
>>> Scott Robison
>>> _______________________________________________
>>> sqlite-users mailing list
>>> sqlite-users at mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to