Re: [BackupPC-users] Could Digest::MD5 be broken on ARM-based computers? - SOLUTION TO FIX broken pools

2011-01-25 Thread Jeffrey J. Kosowsky
Jeffrey J. Kosowsky wrote at about 20:07:37 -0500 on Sunday, January 23, 2011:
 > Jeffrey J. Kosowsky wrote at about 19:19:54 -0500 on Sunday, January 23, 
 > 2011:
 >  > I was testing some of my md5sum routines and I kept getting weird
 >  > results on ARM-based computers.
 >  > 
 >  > Specifically, the pool file md5sum numbers were different depending on
 >  > whether I computed them under Fedora 12 on an x86 machine vs under
 >  > Debian Lenny on an ARM-based computer.
 >  > 
 >  > This obviously creates issues if you want to move your backup drive
 >  > between different CPUs.
 >  > 
 >  > I narrowed it down to Digest::MD5 with the following one-liner:
 >  > perl -e 'use Digest::MD5 qw(md5_hex); $file="testfile"; $size=(stat($file))[7]; $body=`cat $file`; print md5_hex($size,$body) . "\n";'
 >  > 
 >  > This should give the same result as:
 >  > perl -e '$file="testfile"; $size=(stat($file))[7]; $body=`cat $file`; print $size, $body;' | md5sum
 >  > 
 >  > For maybe 1% of the files in my pool, the ARM machine gave the wrong
 >  > answer when using Digest::MD5.
 >  > 
 >  > So, something must be wacko in the perl implementation of Digest::MD5
 >  > on ARM machines!
 >  > 
 > 
 > Well, what do you know, Perl 5.10.0 (at least in Debian, but I think
 > upstream too) is broken on ARM processors.
 > 
 > Something about 32-bit alignment.
 > You need to upgrade to 5.10.1 -- and now I wasted a day on this...
 > And now I need to write code to fix my pool - YUCK!
 > 
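For an independent cross-check of the one-liner above, here is a rough Python equivalent (hashlib is a separate MD5 implementation from Perl's Digest::MD5, so agreement with md5sum but disagreement with the Perl one-liner points the finger at Digest::MD5). The size-then-contents concatenation mirrors md5_hex($size, $body), since Digest::MD5 simply concatenates its arguments:

```python
import hashlib
import os
import sys

def perl_style_digest(path):
    """Mirror md5_hex($size, $body): hash the decimal file size
    (as a string) followed by the raw file contents."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        body = f.read()
    return hashlib.md5(str(size).encode("ascii") + body).hexdigest()

if __name__ == "__main__":
    # Print one digest per file named on the command line
    for path in sys.argv[1:]:
        print(perl_style_digest(path), path)
```

Running it over a few suspect pool files and comparing against both the Perl one-liner and the md5sum pipeline should show which implementation disagrees.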

Well, I went through my pool carefully and it seems like the error
affects close to HALF of my pool files. This is a real mess and a PITA.
BUT, I wrote a perl routine that goes through the pool and/or cpool
and corrects all the entries. Specifically, it
1. Goes through the pool and calculates the actual MD5sum path for the
   file (using my zFile2MD5 routine if it is in the cpool, which avoids
   decompressing the entire file).

2. If the calculated partial file MD5sum differs from the current
   filename, then the routine finds the first empty spot in the chain
   of the corrected MD5sum. If there is already a chain there (of at
   least one file), the routine compares files (again using my faster
   zcompare routine if compressed) to see if there already is a
   match. If there is a match, then it is flagged for later correction
   by a program like my BackupPC_fixLinks.pl program. While strictly
   speaking there is no danger in having more than one copy of the
   same file in a chain (and it is necessary when nlinks > MAXLINKS),
   it is not efficient, so it is detected and flagged. Note, though,
   that in general you shouldn't have many such collisions, since if
   the MD5sum was broken once it was probably broken the whole time
   (unless you switched back and forth between broken and non-broken
   Perl versions).

3. The program then renames (i.e. moves) the file and intelligently
   fills in any holes in the old chain in a way that minimizes chain
   renumbering and that preserves the relative ordering of chain
   numbering.
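The slot-finding and hole-filling arithmetic in steps 2 and 3 can be sketched abstractly. This is an illustration only (not the actual jLib code), and it assumes BackupPC's usual chain naming, where colliding pool files share a base name plus _0, _1, ... suffixes; the bare, suffix-less file is represented below by index -1:

```python
def first_empty_slot(present):
    """Given the chain indices that exist (-1 = the bare, suffix-less
    pool file; 0, 1, ... = the _0, _1, ... siblings), return the
    first unused index where a moved file can land."""
    slot = -1
    for idx in sorted(present):
        if idx != slot:
            break
        slot += 1
    return slot

def compaction_renames(present):
    """Return (old_index, new_index) pairs that close the holes left
    after a file is moved out of a chain, renaming as few files as
    possible and preserving their relative order."""
    renames = []
    target = -1
    for idx in sorted(present):
        if idx != target:
            renames.append((idx, target))
        target += 1
    return renames
```

For example, a chain holding the bare file plus _0 and _2 (a hole at _1) gives first_empty_slot([-1, 0, 2]) == 1, and compaction_renames([-1, 0, 2]) says to rename _2 to _1, leaving everything else alone.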

Note the routine can in general be used to check and fix the
integrity of the pool/cpool, so it may be more generally useful.

The program uses routines from my jLib.pm module and requires the
latest version that I have not yet posted (but will email if anybody
needs it).

Here though is the perl code for the routine itself:
---

#!/usr/bin/perl
#= -*-perl-*-
#
# BackupPC_fixPoolMdsums: Rename/move pool files if mdsum path name invalid
#
# DESCRIPTION
#   See 'usage' for more detailed description of what it does
#   
# AUTHOR
#   Jeff Kosowsky
#
# COPYRIGHT
#   Copyright (C) 2011  Jeff Kosowsky
#
#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 2 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#
#
#
# Version 0.1, released January 2011
#
#

use strict;
use warnings;

use lib "/usr/share/BackupPC/lib";
use BackupPC::Lib;
use BackupPC::jLib 0.4.0;  # Requires version >= 0.4.0
use File::Glob ':glob';
use Getopt::Long qw(:config no_ignore_case bundling);

my $bpc = BackupPC::Lib->new or die("BackupPC::Lib->new failed\n");
%Conf   = $bpc->Conf(); #Global variable 

Re: [BackupPC-users] Filesystem separation

2011-01-25 Thread John Goerzen
Rob Owens writes:

> One reason I always specify the --one-file-system argument for rsync is
> that it prevents me from accidentally backing up an NFS share.  Since I use
> BackupPC for all the computers on my LAN, the data in the NFS share gets
> backed up when I back up the server that is hosting/exporting the share.
> 
> Same thing goes for the occasional fuse share.  In particular, I've
> started using encfs and I certainly wouldn't want a copy of my encrypted
> data to get backed up unencrypted, just because BackupPC happened to be
> running when I had an encrypted volume mounted.

That is a reasonable point, and a good idea.  I'm used to doing that with other
backup software as well.  But I'm still not understanding why the manual says a
*restore* is easier.

-- John


--
Special Offer-- Download ArcSight Logger for FREE (a $49 USD value)!
Finally, a world-class log management solution at an even better price-free!
Download using promo code Free_Logger_4_Dev2Dev. Offer expires 
February 28th, so secure your free ArcSight Logger TODAY! 
http://p.sf.net/sfu/arcsight-sfd2d
___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


[BackupPC-users] rsync with --no-inc-recursive

2011-01-25 Thread John Goerzen
Hi,

I notice that the default BackupPC config uses --recursive with rsync.
According to rsync's manpage, this can make hard-linked files transfer
inefficiently:

"If incremental recursion is active (see --recursive), rsync may
transfer a missing hard-linked file before it finds that another link
for that contents exists elsewhere in the hierarchy. This does not
affect the accuracy of the transfer, just its efficiency. One way
to avoid this is to disable incremental recursion using the
--no-inc-recursive option."

Is it safe to replace --recursive with --no-inc-recursive in the 
BackupPC config?
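If anyone wants to experiment: per the manpage quoted above, --no-inc-recursive is added alongside --recursive rather than replacing it (it only turns off the incremental mode of recursion). A hypothetical, untested config.pl fragment follows; whether it has any effect depends on whether your BackupPC setup hands the argument list to a real rsync >= 3.0:

```perl
# Hypothetical, untested sketch: keep the existing $Conf{RsyncArgs}
# defaults (including --recursive) and append the flag.
push @{$Conf{RsyncArgs}}, '--no-inc-recursive';
```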

Also, the introduction of the manual says restores happen "(with smb
or tar)", which makes me wonder whether a direct restore to the client
is even possible with rsync?

Thanks,

-- John



Re: [BackupPC-users] Filesystem separation

2011-01-25 Thread Rob Owens
On Tue, Jan 25, 2011 at 06:17:12PM -0600, John Goerzen wrote:
> Hi,
> 
> In reading the manual for parameters such as the tar, rsync, etc. share, 
> I see:
> 
> "Alternatively, rather than backup all the file systems as a single 
> share ("/"), it is easier to restore a single file system if you backup 
> each file system separately."
> 
> Can anyone tell me why this is easier?  Can't one select the subset of 
> the backup to restore out of a whole filesystem backup anyhow?
> 
One reason I always specify the --one-file-system argument for rsync is
that it prevents me from accidentally backing up an NFS share.  Since I use
BackupPC for all the computers on my LAN, the data in the NFS share gets
backed up when I back up the server that is hosting/exporting the share.

Same thing goes for the occasional fuse share.  In particular, I've
started using encfs and I certainly wouldn't want a copy of my encrypted
data to get backed up unencrypted, just because BackupPC happened to be
running when I had an encrypted volume mounted.
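The --one-file-system habit described above can also live in the BackupPC config itself rather than being remembered per-host. A hypothetical, untested config.pl sketch (assuming the stock $Conf{RsyncArgs} defaults are already in place):

```perl
# Hypothetical, untested sketch: confine each rsync share to one
# filesystem so NFS/fuse mounts beneath it are skipped.
push @{$Conf{RsyncArgs}}, '--one-file-system';
```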

-Rob



[BackupPC-users] Filesystem separation

2011-01-25 Thread John Goerzen
Hi,

In reading the manual for parameters such as the tar, rsync, etc. share, 
I see:

"Alternatively, rather than backup all the file systems as a single 
share ("/"), it is easier to restore a single file system if you backup 
each file system separately."

Can anyone tell me why this is easier?  Can't one select the subset of 
the backup to restore out of a whole filesystem backup anyhow?

-- John

