Yes, realised after I sent the e-mail that I hadn?t said we had 
created the destination file in which to run the command. You cannot 
actually run rsync ?inlace if the destination file doesn?t exist, we 
found that out earlier ;) So it was a reasonable test though only one 
test. We need to do more to see whats going on.

Top marks for spotting our deliberate omission :)

Rob

On 5 May 2016, at 17:42, J Decker wrote:

> On Thu, May 5, 2016 at 9:38 AM, Rob Willett
> <rob.sqlite at robertwillett.com> wrote:
>> Mmmm?. Initial tests are not conclusive, it does look as if using 
>> rsync
>> ?-inplace does speed things up but nowhere near as much as we 
>> anticipated.
>>
>> Testing consisted of a 6GB test database which is a backup copy from 
>> a few
>> months ago.
>>
>> We timed copying the database over using cp
>>
>> # time cp tfl.sqlite.backup t1
>>
>> real    2m30.528s
>> user    0m0.052s
>> sys     0m10.403s
>>
>> We then edited the database and deleted the contents of a table that 
>> would
>> have changed over the lifetime of the database. We will freely admit 
>> we have
>> no idea where this table is in the database file and have no 
>> intention of
>> finding out. Thats SQLites problem :) The file had 65,000 lines or so 
>> and
>> would have been updated regularly at the start of the database and 
>> over the
>> last few months would have had small daily updates but they would be 
>> getting
>> fewer and fewer.
>>
>> We then did
>>
>> # time rsync -av --inplace --no-whole-file tfl.sqlite.backup test_db
>> sending incremental file list
>> tfl.sqlite.backup
>>
>>
>
> was the file already in place with sending changes?  Or is it a full
> copy of the whole thing all the time?   if the later... well cp is
> gonna really be as good as it gets....
>
>>
>> sent 564,081,269 bytes  received 623,255 bytes  4,805,995.95 
>> bytes/sec
>> total size is 6,067,933,184  speedup is 10.75
>>
>> real    1m57.689s
>> user    2m4.947s
>> sys     0m8.952s
>>
>> The ?no-while-file is apparently needed as well as ?inplace.
>>
>> We checked the md5sums at the end
>>
>> # md5sum tfl.sqlite.backup test_db t1
>> b5bd91cc9b49ee1f54a8a2d013005586  tfl.sqlite.backup
>> b5bd91cc9b49ee1f54a8a2d013005586  test_db
>> b5bd91cc9b49ee1f54a8a2d013005586  t1
>>
>> So we can see that the file integrity has been preserved which is 
>> what we
>> expect but its always good to check. Yes we know that md5sum is not 
>> perfect
>> but for this purpose its fine. However the rsync vs cp time is not as 
>> big a
>> difference as we expected. Its still taking 80% of the time of the 
>> cp.
>>
>> Our first thoughts are that 80% is still better than 100% so thats 
>> good, our
>> second thoughts are that we need to test this on a more 
>> representative set
>> of changes to the database. Deleting the contents of a table that has 
>> been
>> changed over the lifetime of the database may be the worst-case 
>> scenario as
>> it might well touch many, many pages in the database. We are 
>> certainly not
>> expert enough to comment on this assumption of how rows are 
>> distributed in
>> the database and if anybody would like to chip in, please do so.
>>
>> Our intention now is to take a more recent and representative 
>> database, run
>> a days and a weeks set of database transactions through it which is 
>> easy
>> enough, though takes time, and see how that compares.
>>
>> Thanks for the suggestion,
>>
>> Rob,
>>
>> On 5 May 2016, at 16:42, J Decker wrote:
>>
>>> Instead of cp, rsync might help it is able to send delta changes.
>>>
>>> On Wed, May 4, 2016 at 10:55 AM, Rob Willett
>>> <rob.sqlite at robertwillett.com> wrote:
>>>>
>>>> Scott,
>>>>
>>>> OK, We can see how to do this (I think). Our app is written in Perl 
>>>> and
>>>> we?d
>>>> just need to capture the command we write down. The only issue I 
>>>> can
>>>> think
>>>> of is the prepare statement and making sure we capture the right 
>>>> SQL
>>>> command. W
>>>>
>>>> We?ll dig into it and have a look,
>>>>
>>>> Thanks for taking the time to reply.
>>>>
>>>> Rob
>>>>
>>>>
>>>> On 4 May 2016, at 18:52, Scott Robison wrote:
>>>>
>>>>> On Wed, May 4, 2016 at 11:47 AM, Rob Willett
>>>>> <rob.sqlite at robertwillett.com>
>>>>> wrote:
>>>>>
>>>>>> Scott,
>>>>>>
>>>>>> Thats an interesting idea. Is there an option in SQLite to do 
>>>>>> this for
>>>>>> us,
>>>>>> or do we have to write a small shim in our app?
>>>>>>
>>>>>> I like the idea of this as its simple and elegant.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> It would require a little extra work on your part. Nothing built 
>>>>> into
>>>>> the
>>>>> system that would accomplish this directly. However, I've done 
>>>>> similar
>>>>> things and they don't involve a ton of overhead. You could use 
>>>>> another
>>>>> SQLite database as the append only log, or a simple text file.
>>>>>
>>>>> I'm not aware of a free lunch solution, sadly.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> Rob
>>>>>>
>>>>>>
>>>>>> On 4 May 2016, at 16:51, Scott Robison wrote:
>>>>>>
>>>>>> This is going to become a bigger problem for us as the database 
>>>>>> will
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> only get bigger so any advice welcomed.
>>>>>>>>>
>>>>>>>>
>>>>>>> Perhaps, rather than backing up the live data, you create an 
>>>>>>> append
>>>>>>> only
>>>>>>> log of each and every query you send to the database. Should you 
>>>>>>> need
>>>>>>> to
>>>>>>> restore, you replay the log of statements. Or at the appointed 
>>>>>>> backup
>>>>>>> time,
>>>>>>> you replay the day's log of statements into another database. No 
>>>>>>> need
>>>>>>> to
>>>>>>> ever take the live database offline at the cost of slightly 
>>>>>>> longer
>>>>>>> running
>>>>>>> commands during the day to handle the append operation.
>>>>>>> _______________________________________________
>>>>>>> sqlite-users mailing list
>>>>>>> sqlite-users at mailinglists.sqlite.org
>>>>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>>>>
>>>>>> _______________________________________________
>>>>>> sqlite-users mailing list
>>>>>> sqlite-users at mailinglists.sqlite.org
>>>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Scott Robison
>>>>> _______________________________________________
>>>>> sqlite-users mailing list
>>>>> sqlite-users at mailinglists.sqlite.org
>>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>>>
>>>> _______________________________________________
>>>> sqlite-users mailing list
>>>> sqlite-users at mailinglists.sqlite.org
>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>> _______________________________________________
>>> sqlite-users mailing list
>>> sqlite-users at mailinglists.sqlite.org
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users at mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to