Re: [BackupPC-users] memory usage

2005-09-11 Thread Craig Barratt
Hamish Guthrie writes:

> I am sorry to harp on about this, but, it gets back to my analysis of 
> rsync under backuppc a few months ago - I still think that we need to do 
> an implementation of File::RsyncP in C as opposed to perl. I have done 
> some memory requirement analysis of a raw rsync backup of a large 
> filesystem against a BackupPC backup of the same filesystem, and the 
> differences are astonishing in terms of memory requirements, not to 
> mention the raw speed of the actual backup. I know that there are some 
> additional requirements for File::RsyncP, but I am sure that running in 
> C will be far more efficient than running in an interpreted 
> perl environment.
> 
> Unfortunately, at the moment, I have a few other commercial projects I 
> am busy with and cannot immediately contribute to such a development, 
> but in my view, this would be the most important performance-enhancing 
> development for BackupPC.
> 
> I could quote specific examples of BackupPC (using File::RsyncP) vs 
> rsync itself, but I will not bore you or myself, but I can assure you 
> that the difference in speed is about 10:1. Memory usage is also 
> DRAMATICALLY less in the raw rsync environment, I would guess at least 
> 50% less RAM requirement.
> 
> I am hoping to finish my current commercial projects in the next few 
> weeks, at which stage, I will have a look at File::RsyncP unless someone 
> else has taken the bait resulting from this e-mail and developed the 
> appropriate C code in the mean-time.

I agree with you and David Relson that the perl parts of File::RsyncP
(and BackupPC) have higher memory usage than native rsync itself.

Data that David sent suggests a factor of 2, which is believable.

But remember that rsync (and BackupPC) suffer from the problem that
the entire file list must be stored in memory.  And the receiver
forks, so two processes (whether rsync or BackupPC_dump) need the
full file list.  Sure, BackupPC does it less efficiently in perl,
but it is a basic architectural issue for both native rsync and
BackupPC.  With a lot of effort, that factor of 2 could be reduced
to maybe 1.1 or 1.2.  But the basic problem of memory usage
proportional to the number of files remains.
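That scaling is easy to see with a back-of-envelope estimate. The ~100 bytes per file-list entry below is an assumed average for illustration, not a measured figure; the real cost depends on path lengths and the rsync/File::RsyncP version:

```shell
# Rough file-list memory estimate for one rsync-style backup.
# BYTES_PER_ENTRY is an assumption, not a measurement.
FILES=2927627            # file count from David's example
BYTES_PER_ENTRY=100
echo "$((FILES * BYTES_PER_ENTRY / 1024 / 1024)) MB per process"
```

Since the receiver forks, the server holds roughly two copies of the list, so double that figure when sizing RAM.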

Roy Keene's approach of developing a native backup client for BackupPC
that doesn't suffer this limitation - and will be native C at both
ends - looks like the most encouraging solution.  It will not have
memory usage proportional to the total number of files.  Unfortunately,
Hurricane Katrina has forced his office to relocate, and development
will stall for at least the next month.

Craig


---
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/backuppc-users
http://backuppc.sourceforge.net/


Re: [BackupPC-users] memory usage

2005-09-06 Thread Hamish Guthrie

Carl,

I am sorry to harp on about this, but, it gets back to my analysis of 
rsync under backuppc a few months ago - I still think that we need to do 
an implementation of File::RsyncP in C as opposed to perl. I have done 
some memory requirement analysis of a raw rsync backup of a large 
filesystem against a BackupPC backup of the same filesystem, and the 
differences are astonishing in terms of memory requirements, not to 
mention the raw speed of the actual backup. I know that there are some 
additional requirements for File::RsyncP, but I am sure that running in 
C will be far more efficient than running in an interpreted 
perl environment.


Unfortunately, at the moment, I have a few other commercial projects I 
am busy with and cannot immediately contribute to such a development, 
but in my view, this would be the most important performance-enhancing 
development for BackupPC.


I could quote specific examples of BackupPC (using File::RsyncP) vs 
rsync itself, but I will not bore you or myself, but I can assure you 
that the difference in speed is about 10:1. Memory usage is also 
DRAMATICALLY less in the raw rsync environment, I would guess at least 
50% less RAM requirement.


I am hoping to finish my current commercial projects in the next few 
weeks, at which stage, I will have a look at File::RsyncP unless someone 
else has taken the bait resulting from this e-mail and developed the 
appropriate C code in the mean-time.


Regards

Hamish

Carl Wilhelm Soderstrom wrote:
> On 09/02 05:09 , David Relson wrote:
> > For the curious, the locate command indicates I have 2927627 files
> > using 46 GB of the single partition HD.
>
> that's a healthy number. the box I have with the most files, only has 3/4
> million (developer's workstation). that will likely change as I'm now
> backing up a mailserver that has 2000 users, gets several GB of mail a day,
> and they're all running Maildir now...
>
> > With that large file count, it's not surprising that BackupPC_dump
> > needs a lot of memory. The big question is whether the memory is being
> > used efficiently.  I'd rather not add more ram to the box, but if
> > that's the only way to do backups, then I'll have to do so.
>
> AFAIK, that's what you have to do. I wouldn't recommend less than 1GB of RAM
> for a backuppc box.






Re: [BackupPC-users] memory usage

2005-09-05 Thread David Relson
On Mon, 05 Sep 2005 15:07:11 -0700
Craig Barratt wrote:

> Carl Wilhelm Soderstrom writes:
> 
> > On 09/03 06:22 , Hamish Guthrie wrote:
> > > I am sorry to harp on about this, but, it gets back to my analysis of 
> > > rsync under backuppc a few months ago - I still think that we need to do 
> > > an implementation of File::RsyncP in C as opposed to perl. 
> > 
> > Your point is well-taken and I'm willing to accept your expertise on the
> > subject. It's not bad to harp on a subject if it's important. :)
> 
> The key parts of File::RsyncP are written in C for exactly this
> reason.  The fileList part of the code (storing the entire file
> list in memory) should have similar memory usage to the rsync
> version.
> 
> Craig

Craig,

How does this relate to the memory usage figures I posted?

  527MB BackupPC_dump
  250MB rsync
  425MB BackupPC_dump

Are these reasonable numbers for 47GB in 2.9M files, or is something
amiss?
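A quick sanity check on those figures, using the posted numbers:

```shell
# Implied per-file memory cost of the largest BackupPC_dump
# process, from the sizes quoted above.
DUMP_MB=527
FILES=2927627
echo "$((DUMP_MB * 1024 * 1024 / FILES)) bytes/file"
```

A couple of hundred bytes per file-list entry, so the overall usage does appear to scale with the number of files rather than with the amount of data.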

Thanks.

David





Re: [BackupPC-users] memory usage

2005-09-05 Thread Craig Barratt
Carl Wilhelm Soderstrom writes:

> On 09/03 06:22 , Hamish Guthrie wrote:
> > I am sorry to harp on about this, but, it gets back to my analysis of 
> > rsync under backuppc a few months ago - I still think that we need to do 
> > an implementation of File::RsyncP in C as opposed to perl. 
> 
> Your point is well-taken and I'm willing to accept your expertise on the
> subject. It's not bad to harp on a subject if it's important. :)

The key parts of File::RsyncP are written in C for exactly this
reason.  The fileList part of the code (storing the entire file
list in memory) should have similar memory usage to the rsync
version.

Craig




Re: [BackupPC-users] memory usage

2005-09-05 Thread Carl Wilhelm Soderstrom
On 09/03 06:22 , Hamish Guthrie wrote:
> I am sorry to harp on about this, but, it gets back to my analysis of 
> rsync under backuppc a few months ago - I still think that we need to do 
> an implementation of File::RsyncP in C as opposed to perl. 

Your point is well-taken and I'm willing to accept your expertise on the
subject. It's not bad to harp on a subject if it's important. :)

-- 
Carl Soderstrom
Systems Administrator
Real-Time Enterprises
www.real-time.com




Re: [BackupPC-users] memory usage

2005-09-02 Thread Carl Wilhelm Soderstrom
On 09/02 05:09 , David Relson wrote:
> For the curious, the locate command indicates I have 2927627 files
> using 46 GB of the single partition HD.

that's a healthy number. the box I have with the most files only has 3/4
million (developer's workstation). that will likely change as I'm now
backing up a mailserver that has 2000 users, gets several GB of mail a day,
and they're all running Maildir now...

> With that large file count, it's not surprising that BackupPC_dump
> needs a lot of memory. The big question is whether the memory is being
> used efficiently.  I'd rather not add more ram to the box, but if
> that's the only way to do backups, then I'll have to do so.

AFAIK, that's what you have to do. I wouldn't recommend less than 1GB of RAM
for a backuppc box.

-- 
Carl Soderstrom
Systems Administrator
Real-Time Enterprises
www.real-time.com




Re: [BackupPC-users] memory usage

2005-09-02 Thread David Relson
On Fri, 02 Sep 2005 09:46:06 -0500
Les Mikesell wrote:

> On Fri, 2005-09-02 at 08:29, Carl Wilhelm Soderstrom wrote:
> 
> > > I'm running a machine with 512MB ram and 1GB swap and BackupPC is
> > > triggering Linux's OOM (the out of memory killer) and that's making it
> > > impossible to backup my main machine.  
> > > 
> > > At the moment VSZ for the 2 _dump processes are at 527m and 425m and
> > > rsync is at 256m (see below).  How can I reduce BackupPC's memory
> > > demands???
> > 
> > try using tar instead of rsync; it has much lower memory requirements. (tho
> > it has many other limitations).
> 
> Rsync has to store an in-memory table of all the filenames before it
> starts checking them so it can help a bit to split the runs.  If you
> have several large filesystems, you can do this by adding the
> --one-file-system option to the command and explicitly adding
> each mount point to the list to back up.

Initially I used tar, but it wasn't able to deal with the number of
files on my hard drive.  Rsync is able to handle _that_ load.  

Most of the memory usage is in BackupPC - 527MB for 1 of the
BackupPC_dump processes and 425MB for the other.  By comparison, rsync
is only using 256MB.  The reason for 2 BackupPC_dump processes is that
the machine in question is being backed up to an external USB hard
drive mounted on the same machine.

For the curious, the locate command indicates I have 2927627 files
using 46 GB of the single partition HD.
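(The locate database can be stale; a live count for the same partition can be taken with find. A sketch, where the path is whatever mount point is being backed up:

```shell
# Live count of regular files in a tree, without crossing
# mount points -- comparable to what rsync will enumerate.
count_files() { find "$1" -xdev -type f | wc -l; }
count_files /
```

)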

With that large file count, it's not surprising that BackupPC_dump
needs a lot of memory. The big question is whether the memory is being
used efficiently.  I'd rather not add more ram to the box, but if
that's the only way to do backups, then I'll have to do so.




Re: [BackupPC-users] memory usage

2005-09-02 Thread Les Mikesell
On Fri, 2005-09-02 at 08:29, Carl Wilhelm Soderstrom wrote:

> > I'm running a machine with 512MB ram and 1GB swap and BackupPC is
> > triggering Linux's OOM (the out of memory killer) and that's making it
> > impossible to backup my main machine.  
> > 
> > At the moment VSZ for the 2 _dump processes are at 527m and 425m and
> > rsync is at 256m (see below).  How can I reduce BackupPC's memory
> > demands???
> 
> try using tar instead of rsync; it has much lower memory requirements. (tho
> it has many other limitations).

Rsync has to store an in-memory table of all the filenames before it
starts checking them so it can help a bit to split the runs.  If you
have several large filesystems, you can do this by adding the
--one-file-system option to the command and explicitly adding
each mount point to the list to back up.
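As a sketch of that per-mount-point split (the mount points and destination are made up for illustration, and the commands are only echoed rather than run):

```shell
# One rsync run per filesystem: each invocation's in-memory
# file list covers a single mount point instead of the whole tree.
for fs in / /home /var; do
    echo rsync -a --one-file-system "$fs" "backupserver:/backups$fs"
done
```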

-- 
   Les Mikesell
[EMAIL PROTECTED]






Re: [BackupPC-users] memory usage

2005-09-02 Thread Carl Wilhelm Soderstrom
On 09/01 07:24 , David Relson wrote:
> I'm running a machine with 512MB ram and 1GB swap and BackupPC is
> triggering Linux's OOM (the out of memory killer) and that's making it
> impossible to backup my main machine.  
> 
> At the moment VSZ for the 2 _dump processes are at 527m and 425m and
> rsync is at 256m (see below).  How can I reduce BackupPC's memory
> demands???

try using tar instead of rsync; it has much lower memory requirements. (tho
it has many other limitations).
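For a host where rsync's memory is the bottleneck and tar's limitations are acceptable, the transfer method is a one-line change in the global or per-host config; a sketch, assuming the stock config layout:

```perl
# In config.pl, or a per-host override file.
# 'tar' avoids rsync's in-memory file list, at the cost of
# the other limitations mentioned above.
$Conf{XferMethod} = 'tar';
```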

-- 
Carl Soderstrom
Systems Administrator
Real-Time Enterprises
www.real-time.com

