Re: [BackupPC-users] Filter by file type
Hi,

Matteo Barbieri wrote on 16.01.2007 at 15:46:47 [[BackupPC-users] Filter by file type]:
> [...]
> I have to back up some Win laptops, but I only want to save some file
> types. So I set $Conf{BackupFilesOnly} to '*.doc', but BackupPC saves
> only Word files that are in the "root" of the module (setting it to
> '*/*.doc' saves only files one subdirectory down).
> Is there a way to save only files with certain extensions recursively?

I'll assume you're using rsync(d) as the transfer method, because that's the case I'd like to answer.

From looking at the BackupPC code and one or two rsync man pages, I'd expect the following to work: leave $Conf{BackupFilesOnly} empty and add the following elements at the *end* of the $Conf{RsyncArgs} array:

    '--include=*/',
    '--include=*.doc',
    '--exclude=*',

(meaning: include all directories anywhere in the tree - not the files in them, just the directory skeleton - include all .doc files, and exclude every other file). These patterns are adapted from rsync's man page, making them likely to actually work.

I can't see any trick to make $Conf{BackupFilesOnly} do what you want, but you can use $Conf{BackupFilesExclude}, even if that sounds paradoxical:

    $Conf{BackupFilesExclude} = [ '+ */', '+ *.doc', '*' ];

See the rsync man page for details on why that should work.

Regards,
Holger

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/backuppc-users
http://backuppc.sourceforge.net/
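The filter chain suggested in this message can be checked against plain rsync before touching BackupPC at all. The following is a sketch using a throwaway tree under /tmp (all paths are arbitrary); the dry-run listing should show the directory skeleton and the .doc files, but not the .txt file:

```shell
# Build a small test tree with .doc files at two depths.
mkdir -p /tmp/filtertest/src/sub
touch /tmp/filtertest/src/a.doc /tmp/filtertest/src/b.txt \
      /tmp/filtertest/src/sub/c.doc

# Dry-run with the same include/exclude chain: directories first,
# then *.doc, then exclude everything else.
rsync -a -n -v --include='*/' --include='*.doc' --exclude='*' \
    /tmp/filtertest/src/ /tmp/filtertest/dst/
```

Rule order matters here: rsync applies the first filter rule that matches, so the catch-all exclude has to come last.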
Re: [BackupPC-users] every hour backup
Phong Nguyen wrote:
> Hi all,
>
> I just would like to know if it is possible to make an incremental
> backup of a host every hour.
> I don't know how to set the value for $Conf{IncrPeriod}, since it just
> takes a value counted in days.
> Thanks a lot
>
> Phong Nguyen
>
> Axone S.A.
> Geneva / Switzerland

The value is in days, but it will happily accept numbers less than 1 - just set it to a number slightly less than 1/24. You'll also need to adjust the blackout period (make sure there isn't one), and you'll need to account for full backups, which may take more than an hour to run. Also make sure you've configured enough simultaneous jobs to allow your backup to always run.

What problem are you actually trying to solve? Keep in mind that you may have open-file contention issues as well.

John
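The "slightly less than 1/24" arithmetic can be made concrete. $Conf{IncrPeriod} is expressed in days, so an hourly incremental needs a value just under one twenty-fourth of a day; the config values in the comments are a sketch, not the only workable settings:

```shell
# One hour expressed in days, as BackupPC's IncrPeriod expects.
awk 'BEGIN { printf "%.4f\n", 1/24 }'
# Prints 0.0417. A per-host override slightly below that, e.g.:
#   $Conf{IncrPeriod}      = 0.04;
#   $Conf{BlackoutPeriods} = [];        # no blackout window
#   $Conf{WakeupSchedule}  = [0..23];   # wake every hour
# keeps the host due again slightly before each hourly wakeup.
```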
Re: [BackupPC-users] Signal =PIPE?
On Sat, 2007-01-27 at 17:26 -0500, Randy Barlow wrote:
> Every now and then (2 - 3 days or so) I get an e-mail like this:
>
> > The following hosts had an error that is probably caused by a
> > misconfiguration. Please fix these hosts:
> > - abc.ece.ncsu.edu (aborted by signal=PIPE)
> >
> > Regards,
> > PC Backup Genie
>
> It's always about the same PC, and I don't get these errors with other
> machines. The things that make this PC unique from my other machines
> are that it is a remote host (so backup over the Internet), and I use
> rsync over ssh rather than rsyncd. It's not something I'm worried
> about, but I was wondering if anyone might have any helpful comments...

Have you tried increasing the client timeout?
--
Travis Fraser <[EMAIL PROTECTED]>
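For reference, the knob being suggested is $Conf{ClientTimeout}, which defaults to 72000 seconds in stock BackupPC. Below is a sketch of a per-host override; the file is written under /tmp so the example is self-contained (on a real install it would go in the host's pc/<host>.pl override file), and the doubled value is only an assumption:

```shell
# Hypothetical per-host override for the WAN host; raise the timeout
# so long rsync-over-ssh runs aren't killed prematurely.
cat > /tmp/abc.ece.ncsu.edu.pl <<'EOF'
$Conf{ClientTimeout} = 144000;   # double the stock 72000-second default
EOF
cat /tmp/abc.ece.ncsu.edu.pl
```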
[BackupPC-users] Backuppc host discovery for linux computers
Hi,

Reading the documentation on host discovery, it's not clear to me how to set up my network so that BackupPC knows which DHCP-assigned IP address to associate with a Linux box. Can someone offer some help or point me in the right direction?

Thanks,
Ted To
[BackupPC-users] Signal =PIPE?
Howdy,

Every now and then (2 - 3 days or so) I get an e-mail like this:

> The following hosts had an error that is probably caused by a
> misconfiguration. Please fix these hosts:
> - abc.ece.ncsu.edu (aborted by signal=PIPE)
>
> Regards,
> PC Backup Genie

It's always about the same PC, and I don't get these errors with other machines. The things that make this PC unique from my other machines are that it is a remote host (so backup over the Internet), and I use rsync over ssh rather than rsyncd. It's not something I'm worried about, but I was wondering if anyone might have any helpful comments...

Randy
Re: [BackupPC-users] Long: How BackupPC handles pooling, and how transfer methods affect bandwidth usage
Timothy J. Massey wrote:
> > > As a start, how about a utility that simply clones one host to
> > > another using only the pc/host directory tree, and assumes that
> > > none of the source files are in the pool, just like it would
> > > during a brand-new rsync backup?
> >
> > That would be better than nothing, but if you have multiple full runs
> > that you want to keep you'll have to transfer a lot of duplicates
> > that could probably be avoided.
>
> Correct. But it's a proof of concept that can be refined. I understand
> that some sort of inode or hash caching is required. But the first
> step can be done with the parts we've already got.

Agreed, but it's a lot easier to design in the out-of-band info you'll need later than to try to figure out where to put it afterwards.

> > But what is the advantage over just letting the remote server make
> > its run directly against the same targets?
>
> I thought a lot of Holger's points were good. But for me, it comes
> down to two points:
>
> Point 1: Distributing Load
> ==========================
> I have hosts that take, across a LAN, 12 hours to back up. The deltas
> are not necessarily very big: there's just *lots* of files. And these
> are reasonably fast hosts: >2GHz Xeon processors, 10k and 15k RPM
> drives, hardware SCSI (and now SAS) RAID controllers, etc.
>
> I want to store the data in multiple places, both on the local LAN and
> in at least 2 remote locations. That would mean 3 backups. It's
> probably not going to take 36 hours to do that, but it's going to take
> a *lot* more than 12...
>
> Other times, it's not the host's fault, but the Internet connection.
> Maybe it's a host that's behind a low-end DSL that only offers 768k up
> (or worse). It's hard enough to get *one* backup done over that, let
> alone two.
>
> So how can I speed this up?

Brute force approach: park a Linux box with a big disk on the local LAN side. Do scripted stock rsync backups to this box to make full uncompressed copies, with each host in its own directory. It's not as elegant as a local BackupPC, but you get quick access to a copy locally, plus you offload any issues you might have in the remote transfer. I actually use this approach in several remote offices, taking advantage of an existing box that also provides VPN and some file shares. One upside is that you can add the -C option to the ssh command that runs rsync to get compression on the transfer (although starting over, I'd use OpenVPN as the VPN and add compression there).

> And once one remote BackupPC server has the data, the rest can get it
> over the very fast Internet connections that they have between them.
> So I only have to get the data across that slow link once, and I can
> still get it to multiple remote locations.

For this case you might also want to do a stock rsync copy of the backups on the remote LAN to an uncompressed copy at the central location, then point 2 or more BackupPC instances that have faster connections at that copy. Paradoxically, stock rsync with the -z option can move data more efficiently than just about anything else, but it requires the raw storage at both ends to be uncompressed. This might be cumbersome if you have a lot of individual hosts to add, but it isn't bad if everyone is already saving the files that need backup onto one or a few servers at the remote sites. As I've mentioned before, I RAID-mirror to an external drive weekly to get an offsite copy.

> On top of this, the BackupPC server has a much easier task to
> replicate a pool than the host does in the first place. Pooling has
> already been taken care of. We *know* which files are new, and which
> ones are not.

I don't think you can count on any particular relationship between local and remote pools.

> There are only two things the replication need worry about:
> 1) transferring the new files and seeing if they already exist in the
> new pool, and 2) integrating these new files into the remote server's
> own pool.

That happens now if you can arrange for the rsync method to see a raw uncompressed copy. I agree that a more elegant method could be written, but so far it hasn't been.

> Point 2: Long-term Management of Data

LVM on top of RAID is probably the right approach for being able to maintain an archive that needs to grow and have failing drives replaced.

> However, with the ability to migrate hosts from one server to another,
> I can have tiers of BackupPC servers. As hosts are retired, I still
> need to keep their data. 7 years was not chosen for the fun of it: SOX
> compliance requires it. However, I can migrate it *out* of my
> first-line backup servers onto secondary servers.

Again there is a brute force fix: keep the old servers with the old data, but add new ones at whatever interval is necessary to keep current data. You'll have to rebuild the pool of any still-existing files, but as a tradeoff you get some redundancy.
Re: [BackupPC-users] Long: How BackupPC handles pooling, and how transfer methods affect bandwidth usage
Les Mikesell <[EMAIL PROTECTED]> wrote on 01/26/2007 09:53:11 PM:

> Timothy J. Massey wrote:
> >
> > As a start, how about a utility that simply clones one host to
> > another using only the pc/host directory tree, and assumes that none
> > of the source files are in the pool, just like it would during a
> > brand-new rsync backup?
>
> That would be better than nothing, but if you have multiple full runs
> that you want to keep you'll have to transfer a lot of duplicates that
> could probably be avoided.

Correct. But it's a proof of concept that can be refined. I understand that some sort of inode or hash caching is required. But the first step can be done with the parts we've already got.

> But what is the advantage over just letting the remote server make its
> run directly against the same targets?

I thought a lot of Holger's points were good. But for me, it comes down to two points:

Point 1: Distributing Load
==========================

I have hosts that take, across a LAN, 12 hours to back up. The deltas are not necessarily very big: there's just *lots* of files. And these are reasonably fast hosts: >2GHz Xeon processors, 10k and 15k RPM drives, hardware SCSI (and now SAS) RAID controllers, etc.

I want to store the data in multiple places, both on the local LAN and in at least 2 remote locations. That would mean 3 backups. It's probably not going to take 36 hours to do that, but it's going to take a *lot* more than 12...

Other times, it's not the host's fault, but the Internet connection. Maybe it's a host that's behind a low-end DSL that only offers 768k up (or worse). It's hard enough to get *one* backup done over that, let alone two.

So how can I speed this up? I could use a faster host. Unfortunately, I've already *got* a pretty powerful host, and it is doing *its* job just fine, so why do I want to spend multiple thousands of dollars on this? Short answer: that is not possible.

I could use a faster Internet connection. Usually, if a faster option were available affordably, they'd already have it. Even a T1, at $400/month, only offers 1.5Mb up. Not a lot. So getting a dramatically faster Internet connection is not possible, either.

The other way to manage this is to distribute the load to multiple systems. By being able to replicate between BackupPC servers, I can still limit the number of backups the host must perform to 1 (with a local BackupPC server). The BackupPC server can then take on the load of performing multiple time-consuming replications with remote BackupPC servers. I'm not kidding when I say that the task can take a week for all I care, as long as it can get one week's worth of backups done during that time.

And once one remote BackupPC server has the data, the rest can get it over the very fast Internet connections that they have between them. So I only have to get the data across that slow link once, and I can still get it to multiple remote locations.

On top of this, the BackupPC server has a much easier task to replicate a pool than the host does in the first place. Pooling has already been taken care of. We *know* which files are new, and which ones are not. There are only two things the replication need worry about: 1) transferring the new files and seeing if they already exist in the new pool, and 2) integrating these new files into the remote server's own pool.

By distributing the load, we can get more backups replicated out to more places more quickly, with exactly zero increase in load on the most important devices in the entire process: the hosts and their Internet connections. Those are the machines that have "real" work to do, servicing real people with real tasks. I cannot load these machines 24x7. The *only* person who cares about the BackupPC machines (until something is lost) is me. They can stay 100% utilized 24x7 for all I care.

Point 2: Long-term Management of Data
=====================================

With BackupPC, you have a single, intertwined pool that stores data for all hosts. Viewed as a static entity (the data and hosts I need today, or even over a couple of weeks), that's fine. However, over time, I envision this getting unwieldy. As hosts come and go, and as hosts' data needs change (usually upward), and as data storage requirements increase, this single, solid, unbreakable, indivisible pool still needs to be managed.

We are right now envisioning needing 2TB of space to back up a single host: our mail server, which has less than 100GB of data. The deltas on our mail server are currently in the neighborhood of 50GB/day. That's because we have 50GB of mail data, and we all receive at least one mail a day. Now, there are things like transaction logs which can reduce this, but they greatly increase the complexity of restoring individual mail files. This is just the worst host, but far from unique. We have other servers that have multi-GB daily d