Les Mikesell wrote:
> Evren Yurtesen wrote:
> 
>>> 2MB/sec isn't bad when handling a lot of files (try unpacking a tar
>>> with hundreds of thousands of little files to see).  The problem is
>>> that you
>>
>> Yes, it is terrible. I get much better performance if I use the tar 
>> option with the same files. As a matter of fact, I was using a smaller 
>> script for taking backups earlier (which I still use on some servers) 
>> and transferring files over NFS. It works way faster; in particular, 
>> incremental backups take 5-10 minutes compared to 400 minutes with 
>> BackupPC.
> 
> The big difference is that rsync loads the entire directory listing from 
> the remote system into RAM before starting - and the BackupPC 
> implementation does this in Perl, so the data structure is larger than 
> you might expect.  Then it compares against what you have in your 
> existing backup, again in Perl code, and for fulls it uncompresses the 
> files on the fly, so it takes more CPU than you might expect.  My guess 
> about your system is that if it isn't actively swapping it is at least 
> close, and loading the directory takes all of the RAM that should be 
> your disk buffers for efficient access.  Then you won't be keeping 
> inodes in cache, and you end up waiting for a seek to read each of them 
> back.  I don't think you've mentioned your processor speed either - even 
> if the CPU is idle most of the time, it may contribute to the slowness 
> when it does get a chance to run.  You probably shouldn't be running 
> more than one backup concurrently.  You could try backing up a target 
> that contains only a few large files as a timing test to see how much 
> loading a large directory listing affects the run.

This makes a lot of sense. Anyhow, as I mentioned earlier, I will get the
memory upgraded and see whether that solves the problem.
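
In the meantime, to rule out concurrency and the huge directory listing, I
will probably try something like this (a rough sketch; the option names are
from the config documentation, and the host and path are only examples):

    # in the main config.pl: never run more than one backup at a time
    $Conf{MaxBackups} = 1;

    # in a per-host override (its location depends on the BackupPC version):
    # back up only a directory holding a few large files, as a timing test
    $Conf{XferMethod}     = 'rsync';
    $Conf{RsyncShareName} = ['/data/large-files'];   # example path only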

Also, I see the --ignore-times option in the rsync args for full backups.
Why exactly is this necessary?


>> I agree, handling a lot of files might be slow, but this depends on how 
>> you handle the files. Still, I was handling the same files before and 
>> it wasn't taking this long.
> 
> With tar, you never need to read the existing files so you are doing 
> much less work and you don't hold the remote directory in memory at all.
> But if tar works better on your system, why not use it?  The only real 
> drawback is that incremental runs won't catch old files in their new 
> positions under renamed directories.  The full runs will straighten this 
> out.
> 
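
If I do end up switching a host to tar, I believe the change itself is only
a per-host override, roughly like this (a sketch from my reading of the
docs; the share names are just examples, and I assume the shipped defaults
for $Conf{TarClientCmd} and the tar args are left alone):

    # per-host override: use tar over ssh instead of rsync for this host
    $Conf{XferMethod}   = 'tar';
    $Conf{TarShareName} = ['/home', '/etc'];   # example shares
    # the default $Conf{TarClientCmd} already runs tar through ssh; I would
    # only touch it to change the ssh user or the tar path on the client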

Is it still possible to pool the files to save disk space when using tar?

I wonder about something: why can't BackupPC set the modification times of
the backed-up files to match the client when it backs them up, so that on
incrementals (and perhaps on fulls) it could compare only the modification
times, like tar does? BackupPC could still do file-by-file checks when
using rsync anyhow, but the limitations tar imposes would be removed,
wouldn't they? It also wouldn't need to decompress files just to check
whether the copy on the client has been modified. Is this impossible?
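
In other words, roughly this kind of per-file check, using only the
attributes recorded at the previous backup (illustrative Perl, not
BackupPC's actual code; the hash fields are made up):

    use strict;
    use warnings;

    # Decide whether a client file has to be transferred, by comparing the
    # client's stat() data against attributes recorded at the last backup,
    # without ever decompressing the stored copy.
    sub needs_transfer {
        my ($client, $saved) = @_;        # hashrefs with mtime/size keys
        return 1 unless defined $saved;   # new file: always copy it
        return $client->{mtime} != $saved->{mtime}
            || $client->{size}  != $saved->{size};
    }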

Thanks,
Evren

