Re: [BackupPC-users] Multiple backuppc server

2009-07-09 Thread Les Mikesell
Holger Parplies wrote:
  
 So, do you insist on making the original poster believe that running several
 instances of BackupPC on the same pool is a good idea, or can we maybe find
 some other topic to disagree on?

I'll insist that it should be possible, with the only real conflicts 
being what happens when a pool file is created, removed, or renamed. 
Since the only thing that removes or renames pool files is 
BackupPC_nightly, it's relatively easy to eliminate those possibilities 
by controlling when/where it runs. Creating files should just be a 
matter of making sure the file creation is atomic with O_CREAT|O_EXCL 
(or the old linking workaround for Linux NFS bugs) and handling errors 
by doing whatever you would have done if the file had already been there 
a moment earlier when you looked for it.

-- 
   Les Mikesell
lesmikes...@gmail.com

___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


Re: [BackupPC-users] Multiple backuppc server

2009-07-08 Thread Tino Schwarze
On Tue, Jul 07, 2009 at 01:50:56PM +0100, Andy Brown wrote:
 Hi All,
 We've started to set up a large multiple-server BackupPC environment, and 
 wanted a few thoughts/ideas/advice.
 We've got a large 2 TB NAS at the back of it with gigabit connectivity.
 Filesystem is OCFS2 on top of LVM, so we have multiple front-end servers with 
 read/write access.
 BackupPC on each host is set up with a different hostname and a separate 
 log directory. The actual top/backup location is shared on the main 
 NAS store.
 
 So:
 $Conf{TopDir} = '/backups/backuppc/';
 $Conf{ConfDir} = '/etc/backuppc';
 $Conf{LogDir}  = '/backups/backup02';
 $Conf{InstallDir}  = '/usr/share/backuppc';
 $Conf{CgiDir}  = '/usr/share/backuppc/cgi-bin';

 Each server has its own list of hosts in /etc/backuppc/hosts, as that's
 how I'm splitting the job queues, i.e. a host only exists in one
 server's hosts file (at present either backup01 or backup02).
 
 Can anyone see any pitfalls with this? The only strange thing I've
 noticed is with the trashClean process: it seems to be trying to clean
 things that the other server is creating/working on and failing with
 "Can't read /var/lib/backuppc/trash//home/blah/thing/file: No such
 file or directory". It doesn't seem a major thing so I'm ignoring it
 for now!

You will run into lots of troubles since BackupPC is not designed to
support multiple instances accessing the same storage. There are
processes like BackupPC_nightly which need to have exclusive access to
the pool (e.g. no parallel BackupPC_link running).

 Anyone see any pitfalls/problems with what I'm doing here?

What are you trying to accomplish by using multiple BackupPC instances?

For the time being, they need exclusive pool directories, that is,
exclusive TopDir.

HTH,

Tino.

-- 
What we nourish flourishes. - Was wir nähren erblüht.

www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de



Re: [BackupPC-users] Multiple backuppc server

2009-07-08 Thread Holger Parplies
Hi,

Tino Schwarze wrote on 2009-07-08 10:11:43 +0200 [Re: [BackupPC-users] Multiple 
backuppc server]:
 On Tue, Jul 07, 2009 at 01:50:56PM +0100, Andy Brown wrote:
  We've started to setup a large multiple server backuppc environment [...]
  We've got a large 2 TB NAS at the back of it with gigabit connectivity.
  Filesystem is OCFS2 on top of LVM, so we have multiple front-end servers
  with read/write. [...] The actual top/backup location is shared on the
  main NAS store.
  [...]
  
  Can anyone see any pitfalls with this?
 
 You will run into lots of troubles [...]

actually, I'm not sure you will. I'd expect subtle corruption which you won't
notice until it's too late. Things like single files in backups containing the
wrong contents. There might be more obvious things like garbled status.pl
contents, which might make BackupPC crash or display (and use) incorrect
values. I wouldn't be surprised if each BackupPC instance removed
information for hosts it doesn't know about (from status.pl, not the host
directories). Random things may or may not happen.

You might even be lucky and simply get away with it. Race conditions are
things waiting to happen, although they may turn out not to. I don't know and
it doesn't seem important to me either. Are you doing backups for the odd
chance of them being correct?

 BackupPC is not designed to support multiple instances accessing the same
 storage.


 There are
 processes like BackupPC_nightly which need to have exclusive access to
 the pool (e.g. no parallel BackupPC_link running).

While you might even be able to rule that out by some clever scheduling (and
some luck), there's no sane way to prevent more than one instance of
BackupPC_link from running.

  The only strange thing I've
  noticed is with the trashClean process: it seems to be trying to clean
  things that the other server is creating/working on and failing with
  "Can't read /var/lib/backuppc/trash//home/blah/thing/file: No such
  file or directory". It doesn't seem a major thing so I'm ignoring it
  for now!

This is one example of a race condition. Two trashClean processes are
simultaneously trying to delete the same tree. Each file can only be deleted
once, so each trashClean will fail for an arbitrary number of files (and log
it, as it is unexpected). Obviously, running multiple trashClean processes on
the same file system is a waste of resources, i.e. slows things down
considerably compared to only one instance running. Multiple BackupPC_nightly
instances would be even more wasteful by far.
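
A sketch of a deletion pass that swallows this race instead of logging it (Python for illustration; the actual trashClean is Perl and logs the unexpected error):

```python
import os

def trash_clean(root: str) -> None:
    """Delete a trash tree while tolerating a competing cleaner.

    Each file can only be unlinked once, so when two cleaners walk the
    same tree, the loser gets FileNotFoundError; treating that as
    'already done' turns the race into a harmless no-op."""
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            try:
                os.unlink(os.path.join(dirpath, name))
            except FileNotFoundError:
                pass  # the other cleaner beat us to this file
        try:
            os.rmdir(dirpath)
        except OSError:
            pass  # already gone, or not yet empty on the other side
```

Tolerating the race does not make running two cleaners useful, of course; it only makes the duplicated work quiet.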

 What are you trying to accomplish by using multiple BackupPC instances?

This is an important question. You are probably assuming that your NAS can
handle more I/O than one server could generate. You are very likely wrong.
Concurrent write access to a file system needs to be synchronized. Your
cluster file system might do that, but it does so at a cost.

Your bottleneck is very likely to be disk seeks rather than raw disk I/O
bandwidth. Concurrent independent access to your disk(s) will make that
even more so (which is why concurrent backups under one BackupPC server are
usually limited to a very small number), and I'd expect your cluster FS to
make that bottleneck even narrower than it already is.


So, aside from not working, because BackupPC is not designed to support it,
you will probably be achieving the opposite of what you want. But that's just
my guess. You've got it running. What do your measurements say?

Regards,
Holger



Re: [BackupPC-users] Multiple backuppc server

2009-07-08 Thread Les Mikesell
Holger Parplies wrote:

 We've started to setup a large multiple server backuppc environment [...]
 We've got a large 2 TB NAS at the back of it with gigabit connectivity.
 Filesystem is OCFS2 on top of LVM, so we have multiple front-end servers
 with read/write. [...] The actual top/backup location is shared on the
 main NAS store.
 [...]

 Can anyone see any pitfalls with this?
 You will run into lots of troubles [...]
 
 actually, I'm not sure you will. I'd expect subtle corruption which you won't
 notice until it's too late. Things like single files in backups containing the
 wrong contents. There might be more obvious things like garbled status.pl
 contents, which might make BackupPC crash or display (and use) incorrect
 values. I wouldn't be surprised if each BackupPC instance removed
 information for hosts it doesn't know about (from status.pl, not the host
 directories). Random things may or may not happen.

I thought someone had reported doing this successfully over NFS - using 
a high capacity commercial NAS.

 You might even be lucky and simply get away with it. Race conditions are
 things waiting to happen, although they may turn out not to. I don't know and
 it doesn't seem important to me either. Are you doing backups for the odd
 chance of them being correct?
 
 BackupPC is not designed to support multiple instances accessing the same
 storage.
 
 
 There are
 processes like BackupPC_nightly which need to have exclusive access to
 the pool (e.g. no parallel BackupPC_link running).
 
 While you might even be able to rule that out by some clever scheduling (and
 some luck), there's no sane way to prevent more than one instance of
 BackupPC_link from running.

That shouldn't matter - and in fact probably happens with multiple 
processes on a single server. link() should be an atomic operation so 
creation of a hash collision should be detected even if it is simultaneous.
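
The detection Les refers to relies on link(2) failing with EEXIST when the target name already exists. A minimal sketch (Python, with invented names; handling the collision after detecting it is a separate problem):

```python
import os

def link_into_pool(src: str, pool_path: str) -> bool:
    """Try to publish src under a pool name via link(2).

    link() either creates the new name or fails; it never
    half-succeeds, so two servers linking the same pool name
    simultaneously cannot both win, and the loser falls back to
    ordinary collision handling (compare contents, try the next
    chain suffix, and so on)."""
    try:
        os.link(src, pool_path)
        return True
    except FileExistsError:
        return False
```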

 The only strange thing I've
 noticed is with the trashClean process, it seems to be trying to clean
 things that the other server is creating/working on and failing with
 Can't read /var/lib/backuppc/trash//home/blah/thing/file: No such
 file or directory. It doesn't seem a major thing so I'm ignoring it
 for now!
 
 This is one example of a race condition. Two trashClean processes are
 simultaneously trying to delete the same tree. Each file can only be deleted
 once, so each trashClean will fail for an arbitrary number of files (and log
 it, as it is unexpected). Obviously, running multiple trashClean processes on
 the same file system is a waste of resources, i.e. slows things down
 considerably compared to only one instance running. Multiple BackupPC_nightly
 instances would be even more wasteful by far.

The BackupPC_nightly run is the more dangerous part.  There you have the 
possibility that it might delete a pool link at the same time another 
process just re-used it.  In older versions, this could not run 
concurrently with backups for this reason.  In the current version there 
is some sort of locking around the operations that might collide so this 
might or might not also work on a network filesystem. But in any case 
you would probably want only one nightly run, kept outside the 
backup window.

-- 
   Les Mikesell
lesmikes...@gmail.com




Re: [BackupPC-users] Multiple backuppc server

2009-07-08 Thread Holger Parplies
Hi,

Les Mikesell wrote on 2009-07-08 10:32:50 -0500 [Re: [BackupPC-users] Multiple 
backuppc server]:
 [...]

Les, you are missing the important part, so I'll begin with it and repeat it a
few times throughout the mail:

  Tino Schwarze wrote on 2009-07-08 10:11:43 +0200 [this thread]:
   BackupPC is not designed to support multiple instances accessing the same
   storage.

 Holger Parplies wrote:
 Tino wrote:
 Andy Brown wrote:
  We've started to setup a large multiple server backuppc environment
  [...]
  Can anyone see any pitfalls with this?
  You will run into lots of troubles [...]
  
  actually, I'm not sure you will. I'd expect subtle corruption which you
  won't notice until it's too late. [...]
  You might even be lucky and simply get away with it. Race conditions are
  things waiting to happen, although they may turn out not to.
 
 I thought someone had reported doing this successfully over NFS - using 
 a high capacity commercial NAS.

If you know there are race conditions, how much faith do you put in a report
saying it has been done "successfully" (presuming you remember correctly)?
Granted, it *might* mean that status.pl confusion will not happen. They
*might* even have figured out that the race conditions do not exist or how to
avoid them, but I wouldn't believe it without examining the reasoning behind
that, because

   BackupPC is not designed to support multiple instances accessing the same
   storage.

so it doesn't take any expensive measures to avoid race conditions resulting
from doing so anyway.

  there's no sane way to prevent more than one instance of BackupPC_link
  from running.
 
 That shouldn't matter - and in fact probably happens with multiple 
 processes on a single server.

No, it probably doesn't. I checked that before writing what I wrote. Did you
check before contradicting me?

 link() should be an atomic operation so 
 creation of a hash collision should be detected even if it is simultaneous.

Detecting it is trivial. Please provide a correct implementation of *handling*
it. It's not necessarily a hash collision as BackupPC uses the term, by the
way. There is currently no need to handle this, because

   BackupPC is not designed to support multiple instances accessing the same
   storage.

and it avoids it happening within a single server instance, because that is
*much* easier than handling it.

 [...]
 The BackupPC_nightly run is the more dangerous part.  There you have the 
 possibility that it might delete a pool link at the same time another 
 process just re-used it.

You are correct in that this is something we want to avoid. If I were so
inclined, it would be trivial to contradict you with your own arguments,
though. Something like: "link() should return a failure code if the source
file does not exist, so this should be easily detected." Aside from the
comment that it's not dangerous to have no pool link for a file, it's just
wasteful, because you won't be able to reuse it for other copies.
But I won't do that. You are right. There should not be more than one instance
of BackupPC_nightly running on a pool, and BackupPC_nightly and BackupPC_link
should not run concurrently.

 In the current version there 
 is some sort of locking around the operations that might collide so this 
 might or might not also work on a network filesystem.

This sounds like an urban myth. Did you check how this locking operation
works? What version of BackupPC introduced it? I went *part of the way* through
the diffs. What I found wasn't locking, it was design, and it will, in fact,
extend to several BackupPC servers accessing one pool. But that is only part
of the mechanism. The rest, I believe, is in fact really a form of locking:
the provisions a BackupPC server takes to prevent two jobs that shouldn't
run concurrently from doing so: BackupPC_nightly and BackupPC_link, including
more than one instance of either (i.e. *only one* BackupPC_nightly(*) or
BackupPC_link job may be running at any point in time). This part will
obviously *not* extend to several independent server instances accessing the
pool. In other words,

   BackupPC is not designed to support multiple instances accessing the same
   storage.

 But in any case 
 you would probably only want one nightly run and keep it outside the 
 backup window.

I have nothing to add to that.


So, do you insist on making the original poster believe that running several
instances of BackupPC on the same pool is a good idea, or can we maybe find
some other topic to disagree on?

Regards,
Holger

(*) With $Conf{MaxBackupPCNightlyJobs} you can split one BackupPC_nightly
job into 2, 4, 8 ... processes which will run concurrently and each process
a distinct part of the pool. In the sense of the above definition, they
comprise one logical BackupPC_nightly entity. What you can't have is
more than one BackupPC_nightly processing the *same* part of the pool.
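
As an illustration only (an invented helper, not BackupPC's actual code, and assuming the pool is traversed by a numeric range over 256 hashed subdirectory buckets), a disjoint split of the pool among nightly processes might look like:

```python
def nightly_ranges(num_jobs: int, buckets: int = 256):
    """Split pool hash buckets 0..buckets-1 into num_jobs disjoint,
    contiguous (start, end) ranges, so concurrent nightly processes
    never touch the same part of the pool."""
    if buckets % num_jobs:
        raise ValueError("num_jobs must evenly divide the bucket count")
    per = buckets // num_jobs
    return [(i * per, (i + 1) * per - 1) for i in range(num_jobs)]
```

With 4 jobs this yields (0, 63), (64, 127), (128, 191), (192, 255): concurrent, but never overlapping.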

[BackupPC-users] Multiple backuppc server

2009-07-07 Thread Andy Brown
Hi All,
We've started to set up a large multiple-server BackupPC environment, and wanted 
a few thoughts/ideas/advice.
We've got a large 2 TB NAS at the back of it with gigabit connectivity.
Filesystem is OCFS2 on top of LVM, so we have multiple front-end servers with 
read/write access.
BackupPC on each host is set up with a different hostname and a separate 
log directory. The actual top/backup location is shared on the main 
NAS store.

So:
$Conf{TopDir} = '/backups/backuppc/';
$Conf{ConfDir} = '/etc/backuppc';
$Conf{LogDir}  = '/backups/backup02';
$Conf{InstallDir}  = '/usr/share/backuppc';
$Conf{CgiDir}  = '/usr/share/backuppc/cgi-bin';

Each server has its own list of hosts in /etc/backuppc/hosts, as that's how I'm 
splitting the job queues, i.e. a host only exists in one server's hosts file (at 
present either backup01 or backup02).

Can anyone see any pitfalls with this? The only strange thing I've noticed is 
with the trashClean process: it seems to be trying to clean things that the 
other server is creating/working on and failing with "Can't read 
/var/lib/backuppc/trash//home/blah/thing/file: No such file or directory". 
It doesn't seem a major thing so I'm ignoring it for now!

Anyone see any pitfalls/problems with what I'm doing here?

Cheers!


Andy





Re: [BackupPC-users] Multiple backuppc server

2009-07-07 Thread Filipe Brandenburger
Hi,

On Tue, Jul 7, 2009 at 08:50, Andy Brown <a...@onyx.net> wrote:
 Anyone see any pitfalls/problems with what I'm doing here?

Yes, IMHO you are introducing complexity where it is not needed.

What are you trying to accomplish?

Fault-tolerance? In that case you should probably have an
active/passive deployment where the backup host is only used when the
master fails.

Load-balancing, since compression is CPU-intensive? In that case
you should probably split your storage into two 1 TB LUNs, export one of
them to each host, and run BackupPC over a regular ext3 (or XFS)
filesystem.

OCFS2 or any other cluster filesystem will only introduce overhead and
management complexity, and from your description it seems you are not
sharing files between hosts anyway.

HTH,
Filipe
