Re: [fsck] Gzip is... terrible!

Stan Park Mon, 11 Jul 2011 11:02:15 -0700
Well, losing a diff shouldn't corrupt your backup as the semantics of each
diff are independent of the others. Losing a diff means your backup may
appear inconsistent in the sense that you could be missing data. This is why
data duplication is important. Diffs are not backups; they allow you to
construct a backup. Diffs are cheap, so you ought to, and need to, replicate
them. With full backups, your replication is inherent in the backup itself.
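
To make the point concrete, here is a rough Python sketch of what losing one
diff does to a restore. It is a toy model and not how rdiff actually stores
deltas: the names (Snapshot, Diff, apply_diff, restore) are made up for
illustration, and each diff is assumed to be a plain map from path to new
file contents, with None standing for a deleted file or a lost diff.

```python
from typing import Dict, List, Optional

# Toy model (assumption): a snapshot maps path -> file contents, and a diff
# maps path -> new contents, with None meaning "file was deleted".
Snapshot = Dict[str, str]
Diff = Dict[str, Optional[str]]


def apply_diff(state: Snapshot, diff: Diff) -> Snapshot:
    """Return a new snapshot with one diff applied on top of state."""
    result = dict(state)
    for path, content in diff.items():
        if content is None:
            result.pop(path, None)  # deletion
        else:
            result[path] = content  # creation or update
    return result


def restore(base: Snapshot, diffs: List[Optional[Diff]]) -> Snapshot:
    """Replay diffs in order; a lost diff (None) is skipped, not fatal."""
    state = dict(base)
    for diff in diffs:
        if diff is None:
            continue  # the changes in this diff are simply missing
        state = apply_diff(state, diff)
    return state


base = {"a.txt": "v1"}
diffs = [{"b.txt": "v1"}, {"a.txt": "v2"}, {"c.txt": "v1"}]

full = restore(base, diffs)
damaged = restore(base, [diffs[0], None, diffs[2]])  # second diff was lost
```

In this model the damaged restore still completes and still picks up c.txt
from the later diff, but a.txt stays at its old contents because the update
lived only in the lost diff. The restore also has to walk every diff since
the base, which is the recovery-time cost raised further down the thread.
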
On Jul 11, 2011 10:54 AM, "Huan Truong (Android)" <[email protected]>
wrote:
> In other words, if anything goes wrong in one of the rdiff backups along
> the chain, it's gonna be a disaster. I wouldn't trust this kind of backup,
> although it sounds very tempting.
>
> I think it's best to play safe when it comes to backups. When you need
> them, you *really* need them to work.
>
> James Park <[email protected]> wrote:
>
>>Yes, that answers my question, more or less. To clarify, doing a diff-only
>>backup solution without a periodic build of a consistent snapshot means that
>>when you do need to recover the backup copy, you have to read and apply all
>>the diffs, which is bad for recovery response. The file system analogy is
>>like having a journaling file system whose journal has infinite space and
>>never flushes.
>>
>>On Sun, Jul 10, 2011 at 3:51 PM, Patrick Kilgore
>><[email protected]> wrote:
>>
>>>> Does rdiff do any compression
>>>
>>> zlib compression when sending the diffs over the network. Space efficiency
>>> is from keeping diffs and not multiple snapshots. I think I found a stat
>>> somewhere that it's > 90% more space efficient than full snapshots.
>>>
>>>> How is the retrieval latency with that or is a full snapshot periodically
>>>> built when the number of diffs reaches a threshold?
>>>
>>>
>>> Not quite sure what you're asking. Yes, restoring data from an increment
>>> (or the mirror) takes some time, but addressing the original post (dealing
>>> with daily backups of large sets of data), using diffs just always seemed to
>>> be a much better solution to me, and much more efficient than compressing
>>> the whole deal. I suppose the point is moot if that data is being generated
>>> 100% new every day, but really how many times is that true?
>>>
>>> --::--
>>> Patrick Kilgore
>>> 617.910.0332
>>>
>>>
>>>
>>>
>>> On Sun, Jul 10, 2011 at 12:59 AM, Stan Park <[email protected]> wrote:
>>>
>>>> Does rdiff do any compression
>>>
>>>
>>>
>>
>>
>>--
>>Stan Park