Robin Lee Powell wrote at about 13:17:43 -0800 on Monday, December 6, 2010:
> On Mon, Dec 06, 2010 at 03:48:04PM -0500, Jeffrey J. Kosowsky wrote:
> > Robin Lee Powell wrote at about 10:37:52 -0800 on Monday, December 6, 2010:
> > >
> > > So I'm writing a script to transfer a client from one host to
> > > another, using tarPCCopy, and I'm getting messages like this:
> > >
> > > Can't find foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7813/f7105620_done.txt in pool, will copy file
> > >
> > > which is fascinating because the first column in ls -l is *3*. -_-
> > >
> > > The tarPCCopy tar file therefore ends up becoming really large
> > > (hundreds of gibibytes) with files that already exist in the pool,
> > > presumably.
> > >
> > > I've tried running md5sum on that file; can't find that in the pool.
> > > I've tried BackupPC_zcat | md5sum; can't find that in the pool.
> >
> > Well the 'md5sum' used in pool naming is only a partial file md5sum.
> > I wrote (and posted) a routine to calculate and optionally test for
> > existence of the md5sum pool name corresponding to any pc tree
> > file. I will attach a copy to the end of this post.
> >
> > > BackupPC_fixLinks, from the wiki, doesn't see the problem at all,
> > > which I'd *very* much like to fix.
> >
> > First check to make sure there really is a problem with the pool...
> > Then, we need to figure out whether there is a problem with tarcopy or
> > with my program BackupPC_fixLinks etc.
>
> $ ls -l /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7813/f7105620_done.txt
> -rw-r----- 3 backuppc backuppc 27 Nov 24 09:59 /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7813/f7105620_done.txt
>
> $ perl /tmp/bpctest.pl /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7813/f7105620_done.txt
> 15c0e4b08058ef3704b8fc24887e2bcc /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7813/f7105620_done.txt
>
> $ ls -l /backups/cpool/1/5/c/15c0e4b08058ef3704b8fc24887e2bcc
> -rw-r----- 3 backuppc backuppc 27 Nov 22 19:33 /backups/cpool/1/5/c/15c0e4b08058ef3704b8fc24887e2bcc
>
> ls -li /backups/cpool/1/5/c/15c0e4b08058ef3704b8fc24887e2bcc /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7813/f7105620_done.txt
>
> *BUT*. Not linked.
>
> $ ls -li /backups/cpool/1/5/c/15c0e4b08058ef3704b8fc24887e2bcc /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7813/f7105620_done.txt
> 255523133 -rw-r----- 3 backuppc backuppc 27 Nov 22 19:33 /backups/cpool/1/5/c/15c0e4b08058ef3704b8fc24887e2bcc
> 2376493624 -rw-r----- 3 backuppc backuppc 27 Nov 24 09:59 /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7813/f7105620_done.txt
>
> Isn't that fascinating, boys and girls?
>
> Let's check another.
>
> $ ls -l /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7809/fposter/fposter_234517.zip
> -rw-r----- 3 backuppc backuppc 8510861 Nov 24 09:14 /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7809/fposter/fposter_234517.zip
>
> $ perl /tmp/bpctest.pl /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7809/fposter/fposter_234517.zip
> 42a13e7f5875b2d8ff79ae54e2cb41a9 /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7809/fposter/fposter_234517.zip
>
> $ ls -l /backups/cpool/4/2/a/42a13e7f5875b2d8ff79ae54e2cb41a9
> -rw-r----- 3 backuppc backuppc 8510861 Nov 22 18:53 /backups/cpool/4/2/a/42a13e7f5875b2d8ff79ae54e2cb41a9
>
> $ ls -li /backups/cpool/4/2/a/42a13e7f5875b2d8ff79ae54e2cb41a9 /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7809/fposter/fposter_234517.zip
> 3696447185 -rw-r----- 3 backuppc backuppc 8510861 Nov 22 18:53 /backups/cpool/4/2/a/42a13e7f5875b2d8ff79ae54e2cb41a9
> 145635130 -rw-r----- 3 backuppc backuppc 8510861 Nov 24 09:14 /backups/pc/foo--tm50-e00145--tm50-s00339---shared/47/f%2f/fshared/ffoo/fpurchase_order_assets/fbatch_7809/fposter/fposter_234517.zip
>
> So, yeah. More than one link, matches something in the pool, but
> not actually linked to it. Isn't that *awesome*? ;'(

Ah yes... Now I understand what you mean.
My script only looks for pc files that are present but unlinked (i.e.,
links=1). The reason for doing it that way was twofold:

1. Most of the pool-corruption cases I was aware of occur during the
   backup run itself, when the files are put into the pc directory but
   the link step to the pool fails. This leaves files with just one
   hard link.

2. More importantly, there is no fast way that I know of to check
   whether a pc entry with more than one link is in the pool. You
   first need to read in the first MB of each file, calculate the
   partial-file md5sum, and see whether it is present in the pool.
   Then, if there is a chain of files with the same partial md5sum,
   you need to compare the files individually.

All the pieces to do that are in my various posted routines and code
snippets; they just aren't wrapped in one big for loop that goes
through the pc directory.

In particular, you could use my routine BackupPC_zfile2MD5 to traverse
the pc tree, checking that each file has a partial md5sum equivalent in
the pool. Any errors can then be acted on by the correction routines
in BackupPC_fixLinks.

Still, blindly traversing the entire pc tree will generally be an order
of magnitude or more slower than traversing the pool, assuming that
you have a lot of current and incremental backups. If you are satisfied
with just going through the latest backup, then it might be more
manageable...

In fact, it was for applications like this that I suggested a while
back adding the partial md5sum to the attrib file, so that the reverse
lookup can be done more cheaply (the need for all of this will be
obviated when Craig finishes the next version :P )

>
> I very much want BackupPC_fixLinks to deal with this, and I'm trying
> to modify it to do that now.

Sounds good. Let me know if you are thinking differently than I am
here... If not, I could probably do the modification pretty fast by
patching my pieces of code together.
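
As a rough illustration of the check in point 2 (a sketch only, not
code taken from BackupPC_fixLinks or BackupPC_zfile2MD5), the loop
could look something like the Python below. All function names are
hypothetical. The digest function is passed in as digest_of, standing
in for whatever computes the partial md5sum of a pc file; the
three-level pool layout follows the cpool/1/5/c/15c0... paths shown
earlier; and the "_0", "_1", ... suffix for chain members is an
assumption about how colliding pool files are named. It compares
device and inode numbers rather than link counts, which is what
catches files like the two examples above.

```python
import os


def pool_path(pool_root, digest):
    """Pool location for a digest, e.g. <pool_root>/1/5/c/15c0e4b0..."""
    return os.path.join(pool_root, digest[0], digest[1], digest[2], digest)


def pool_chain(pool_root, digest):
    """Yield the base pool file, then chain members digest_0, digest_1, ...

    ASSUMPTION: chain members carry a numeric _N suffix.
    """
    base = pool_path(pool_root, digest)
    if os.path.exists(base):
        yield base
    n = 0
    while os.path.exists(f"{base}_{n}"):
        yield f"{base}_{n}"
        n += 1


def is_linked_to_pool(pc_file, pool_root, digest):
    """True if pc_file is the same inode as some member of its pool chain."""
    st = os.stat(pc_file)
    for candidate in pool_chain(pool_root, digest):
        cst = os.stat(candidate)
        if (st.st_dev, st.st_ino) == (cst.st_dev, cst.st_ino):
            return True
    return False


def find_unlinked(pc_root, pool_root, digest_of):
    """Walk pc_root and list files not hard-linked into the pool.

    digest_of(path) is a caller-supplied placeholder for the partial
    md5sum routine. The link count is deliberately ignored.
    """
    unlinked = []
    for dirpath, _dirnames, filenames in os.walk(pc_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not is_linked_to_pool(path, pool_root, digest_of(path)):
                unlinked.append(path)
    return unlinked
```

Calling find_unlinked on a single backup directory rather than the
whole pc tree keeps the cost down, since every file still has to be
read to compute its digest.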