Laurence Perkins wrote:
>>
>> Curious question here.  As you may recall, I back up to an external hard
>> drive.  Would it make sense to use that software for an external hard drive?
>> Right now, I'm just doing file updates with rsync and the drive is
>> encrypted.  Thing is, I'm going to have to split into three drives soon.
>> So, compressing may help.  Since it is video files, it may not help much, but
>> I'm not sure about that.  Just curious.
>>
>> Dale
>>
>> :-)  :-) 
>>
>>
> If I understand correctly, you're using rsync+tar and then keeping a set of
> copies of various ages.

Actually, it's uncompressed and just keeps a single copy of the current version.
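
It's basically just a plain rsync mirror, something along these lines (the
paths here are made up for illustration):

  # mirror the source onto the external drive, deleting anything
  # that no longer exists on the source side
  rsync -a --delete /home/dale/ /mnt/backup/dale/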


>
> If you lose a single file that you want to restore and have to go hunting for
> it, with tar you can only list the files in the archive by reading through the
> entire thing, and you can only extract by reading from the beginning until you
> stumble across the matching filename.  So with large archives to hunt
> through, that could take...  a while...
>
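For reference, a single-file hunt in a plain tar archive looks roughly like
this (archive name and path are made up for illustration):

  # listing means reading through the whole archive, front to back
  tar -tvf /mnt/backup/home.tar

  # extracting one file also scans from the start until it finds a match
  tar -xvf /mnt/backup/home.tar home/dale/some/file
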
> dar is compatible with tar (Pretty sure, would have to look again, but I 
> remember that being one of its main selling-points) but adds an index at the 
> end of the file allowing listing of the contents and jumping to particular 
> files without having to read the entire thing.  Won't help with your space 
> shortage, but will make searching and single-file restores much faster.
>
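If I'm reading the dar man page right, the basic usage is something like this
(archive basename and paths are placeholders):

  # create an archive of /home/dale; dar adds a slice number and .dar
  # extension to the basename
  dar -c /mnt/backup/home -R /home/dale

  # list the contents from the catalogue stored in the archive
  dar -l /mnt/backup/home

  # restore a single file into /tmp/restore; the path is relative to
  # the root the archive was made from
  dar -x /mnt/backup/home -R /tmp/restore -g some/dir/file
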
> Duplicity and similar have the indices, and additionally a full+incremental 
> scheme.  So searching is reasonably quick, and restoring likewise doesn't 
> have to grovel over all the data.  It can be slower than tar or dar for 
> restore though because it has to restore first from the full, and then walk 
> through however many incrementals are necessary to get the version you want.  
> This comes with substantial space savings though, as each set of archive 
> files after the full contains only the pieces which actually changed.  
> Coupled with compression, that might solve your space issues for a while 
> longer.
>
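As a rough sketch of that full+incremental scheme (URL and paths made up for
illustration):

  # the first run writes a full backup, later runs write incrementals
  duplicity full /home/dale file:///mnt/backup/duplicity
  duplicity incremental /home/dale file:///mnt/backup/duplicity

  # the indices make listing the current contents quick
  duplicity list-current-files file:///mnt/backup/duplicity

  # a single-file restore walks the full plus any incrementals
  duplicity restore --file-to-restore some/dir/file \
      file:///mnt/backup/duplicity /tmp/restored-file
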
> Borg and similar break the files into variable-size chunks and store each 
> chunk indexed by its content hash.  So each chunk gets stored exactly once 
> regardless of how many times it may occur in the data set.  Backups then 
> become simply lists of file attributes and what chunks they contain.  This 
> results both in storing only changes between backup runs and in deduplication 
> of commonly-occurring data chunks across the entire backup.  The 
> database-like structure also means that all backups can be searched and 
> restored from in roughly equal amounts of time and that backup sets can be 
> deleted in any order.  Many of them (Borg included) also allow mounting 
> backup sets via FUSE.  The disadvantage is that restore requires a compatible 
> version of the backup tool rather than just a generic utility.
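
A minimal borg session along those lines might look like this (repository
path and archive name are made up):

  # one-time repository setup on the external drive
  borg init --encryption=repokey /mnt/backup/borg

  # each run stores only chunks the repository hasn't seen before
  borg create /mnt/backup/borg::home-{now} /home/dale

  # archives can be listed, extracted from, or mounted via FUSE
  borg list /mnt/backup/borg
  borg extract /mnt/backup/borg::home-2024-06-01 home/dale/some/file
  borg mount /mnt/backup/borg /mnt/restore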
>
> LMP


I guess that is the downside of not having just plain uncompressed
files.  Thing is, so far, I've never needed to restore a single file or
even several files, so it's not a big deal for me.  If I accidentally
delete something, though, that could be a problem if it has already left
the trash. 

Since the drive also uses LVM, someone mentioned using snapshots.  I'm still
not really clear on those, even though I've read a bit about them.  Some of
the backup techniques are confusing to me.  I get plain files, even
incremental backups to an extent, but some of the new stuff just muddies the water. 
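
From what I've read so far, an LVM snapshot is just a frozen, point-in-time
view of a volume that can be mounted and copied from while the real volume
stays in use; something like this (volume group and names are made up):

  # create a snapshot with 5G set aside to hold changes made while it exists
  lvcreate --snapshot --size 5G --name backup-snap /dev/vg0/backup

  # mount the frozen view and read or copy from it
  mount /dev/vg0/backup-snap /mnt/snap

  # throw it away when the backup run is done
  umount /mnt/snap
  lvremove /dev/vg0/backup-snap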

I really need to just build a file server, RAID or something.  :/

Dale

:-)  :-)
