Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-11-03 Thread Jeffrey J. Kosowsky
Jeffrey J. Kosowsky wrote at about 05:56:30 -0500 on Monday, November 3, 2008:
  Craig Barratt wrote at about 18:06:43 -0700 on Friday, October 31, 2008:
Jeff writes:

 Is there a (reasonably easy) way of identifying which ones have the
 rsync checksum seed and which ones don't???

I'm reluctant to even say, because you are heading in an unproductive
direction.  But here goes: a compressed file without checksums starts
with 0x78 and a compressed file with checksums starts with 0xd6 or 0xd7.
See lib/BackupPC/FileZIO.pm.

The file sizes in the example you cite suggest the first has checksums
and the second does not.

  Thanks, I am learning a lot (and trust me, it is productive, because it
  has forced me to go through the code at a line-by-line level).
  
  I am noticing that some of the potentially improperly backed up files
  have either a 0x00 or 0x04 as their code.
  How would that happen? (Or are these somehow error codes that got
  stuck back in the first byte?)
  
  Thanks!
  
Never mind, it was just a mistake in my code... It all makes sense now...

-
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK  win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100url=/
___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-11-03 Thread Jeffrey J. Kosowsky
Craig Barratt wrote at about 18:06:43 -0700 on Friday, October 31, 2008:
  Jeff writes:
  
   Is there a (reasonably easy) way of identifying which ones have the
   rsync checksum seed and which ones don't???
  
  I'm reluctant to even say, because you are heading in an unproductive
  direction.  But here goes: a compressed file without checksums starts
  with 0x78 and a compressed file with checksums starts with 0xd6 or 0xd7.
  See lib/BackupPC/FileZIO.pm.
  
  The file sizes in the example you cite suggest the first has checksums
  and the second does not.
  
Thanks, I am learning a lot (and trust me, it is productive, because it
has forced me to go through the code at a line-by-line level).

I am noticing that some of the potentially improperly backed up files
have either a 0x00 or 0x04 as their code.
How would that happen? (Or are these somehow error codes that got
stuck back in the first byte?)

Thanks!



Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-11-01 Thread Jeffrey J. Kosowsky
Holger Parplies wrote at about 00:52:47 +0100 on Saturday, November 1, 2008:
  Hi,
  
  Jeffrey J. Kosowsky wrote on 2008-10-31 15:26:58 -0400 [Re: [BackupPC-users] 
  2 cpool files with same checksum, different (compressed   content) but same 
  zcatt'ed content?]:
   Les Mikesell wrote at about 10:27:20 -0500 on Friday, October 31, 2008:
 Jeffrey J. Kosowsky wrote:
  
  Is there a (reasonably easy) way of identifying which ones have the
  rsync checksum seed and which ones don't???
 
 I think you are kind of missing the point that they could both be 
 corrupted in other ways...
  
  I don't think these files as such are corrupted. They are decompressible (to
  identical contents). I wouldn't trust the backup(s) as a whole - they may be
  missing files for instance - but that's a different matter.

Exactly.
  
   I know but I would still like to understand this better since I am
   writing some utilities to clean this up (with the caveat being that it
   may still be corrupted)
  
  I admit that I am missing the point why you are going to the trouble, but to
  answer your question: it doesn't matter. BackupPC will add the checksums at
  the appropriate time (next backup, I'd say) if they are missing
  (and enabled).
Well, I'm not sure why I'm going to the trouble, except that I am
learning a lot and I am writing my own scripts (which I will share
when done) to check and correct corrupted pools (which may be
necessary at some later point when I REALLY need it).

  That said, I'd expect the version with checksums to be larger, because the
  compressed content should be identical. Add to that the checksums in one case
  and nothing in the other.
  

Is there anywhere where the specific format of the rsync checksums is
documented in terms of how they are computed and where they are added
to the pool files?

Thanks



Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 Thread Tino Schwarze
On Thu, Oct 30, 2008 at 11:16:05PM -0400, Jeffrey J. Kosowsky wrote:

 I must be missing something on this whole compression, pooling, and
 checksum matter.
 
 I found 2 files in my cpool that have the same checksum (one is _0)
 but 'cmp' to different values. However, when I zcat them, they have
 the same value. I thought that (lossless) compression was a 1-1
 mapping?

Please post the output of the following:
ls -l yourfile*
md5sum yourfile*
BackupPC_zcat yourfile | md5sum
BackupPC_zcat yourfile_0 | md5sum

 But here we seem to have two files that are identical (and thus have
 the same checksum) but compress to 2 *different* results?

Again: The file name is NOT the checksum of the whole file's contents!
It's just an MD5 sum which incorporates the first 256k of the file and
the file's original length (as I learned this week from this list).
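
As a rough illustration of why such a name can be shared by different
files, here is a minimal Python sketch of a partial digest that hashes
only the file length and the first 256 KiB, as described above. The
function name partial_digest, the constant PREFIX_BYTES, and the way the
length is encoded are assumptions made for this example; BackupPC's own
routine is in its Perl sources and may differ in detail.

```python
import hashlib

# Assumption for this sketch: only the first 256 KiB of content is hashed.
PREFIX_BYTES = 256 * 1024


def partial_digest(data: bytes) -> str:
    """Return an MD5 hex digest over the length and the leading bytes only."""
    md5 = hashlib.md5()
    # Mix in the original (uncompressed) length first ...
    md5.update(str(len(data)).encode("ascii"))
    # ... then only the leading part of the content.
    md5.update(data[:PREFIX_BYTES])
    return md5.hexdigest()


if __name__ == "__main__":
    a = b"A" * PREFIX_BYTES + b"tail-one"
    b = b"A" * PREFIX_BYTES + b"tail-two"
    # Same length, same first 256 KiB, different tails: same partial digest.
    print(partial_digest(a) == partial_digest(b))
```

Under this simplified scheme, two files that agree in length and in
their first 256 KiB get the same name even though their full contents
differ, which is the kind of collision meant here.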

 This would seem to be going against the grain of pooling where two
 identical files share the same pool entry.
 
 What am I missing?

That hash collisions are expected.

Tino.

-- 
What we nourish flourishes. - Was wir nähren erblüht.

www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de



Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 Thread Holger Parplies
Hi,

Jeffrey J. Kosowsky wrote on 2008-10-30 23:16:05 -0400 [[BackupPC-users] 2 
cpool files with same checksum, different (compressed   content) but same 
zcatt'ed content?]:
 [...]
 I found 2 files in my cpool that have the same checksum (one is _0)
 but 'cmp' to different values. However, when I zcat them, they have
 the same value. I thought that (lossless) compression was a 1-1
 mapping?

Have you got checksum seeds enabled? This could be an artefact of your NFS
problems leading to one copy without checksum seeds and one with them, maybe?

Regards,
Holger



Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 Thread Jeffrey J. Kosowsky
Holger Parplies wrote at about 12:25:08 +0100 on Friday, October 31, 2008:
  Hi,
  
  Jeffrey J. Kosowsky wrote on 2008-10-30 23:16:05 -0400 [[BackupPC-users] 2 
  cpool files with same checksum, different (compressed   content) but same 
  zcatt'ed content?]:
   [...]
   I found 2 files in my cpool that have the same checksum (one is _0)
   but 'cmp' to different values. However, when I zcat them, they have
   the same value. I thought that (lossless) compression was a 1-1
   mapping?
  
  Have you got checksum seeds enabled? This could be an artefact of your NFS
  problems leading to one copy without checksum seeds and one with them, maybe?
  
  Regards,
  Holger

Is there a (reasonably easy) way of identifying which ones have the
rsync checksum seed and which ones don't???



Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 Thread Jeffrey J. Kosowsky
Tino Schwarze wrote at about 12:20:50 +0100 on Friday, October 31, 2008:
  On Thu, Oct 30, 2008 at 11:16:05PM -0400, Jeffrey J. Kosowsky wrote:
  
   I must be missing something on this whole compression, pooling, and
   checksum matter.
   
   I found 2 files in my cpool that have the same checksum (one is _0)
   but 'cmp' to different values. However, when I zcat them, they have
   the same value. I thought that (lossless) compression was a 1-1
   mapping?
  
  Please post the output of the following:
  ls -l yourfile*
  md5sum yourfile*
  BackupPC_zcat yourfile | md5sum
  BackupPC_zcat yourfile_0 | md5sum

-rw-r- 5 backuppc backuppc 7870 Oct 27 16:28 /var/lib/BackupPC/cpool/5/f/8/5f87fe62e8254679c582097314f97fe3
-rw-r--r-- 2 backuppc backuppc 7681 Oct 28 09:17 /var/lib/BackupPC/cpool/5/f/8/5f87fe62e8254679c582097314f97fe3_1

08be9e936c80024809fde108f6df9bb1  /var/lib/BackupPC/cpool/5/f/8/5f87fe62e8254679c582097314f97fe3
ce46e80af4a086e29ae17b0f800362e1  /var/lib/BackupPC/cpool/5/f/8/5f87fe62e8254679c582097314f97fe3_1

4266be808f85826aedf3c64c1e240203
4266be808f85826aedf3c64c1e240203
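
The same check can be sketched for a whole list of pool names. The
Python below is only an illustration: the helper names group_chains and
redundant_members are made up, the name pattern is inferred from the
file names in this thread, and digest_of stands for whatever you use to
hash the decompressed content (for example, the output of
BackupPC_zcat piped to md5sum).

```python
import re
from collections import defaultdict

# Assumption: pool names look like "<32 hex digits>" or "<32 hex digits>_<n>".
POOL_NAME = re.compile(r"^([0-9a-f]{32})(?:_(\d+))?$")


def group_chains(names):
    """Group pool file names by base hash; keep only bases with 2+ members."""
    chains = defaultdict(list)
    for name in names:
        match = POOL_NAME.match(name)
        if match:
            chains[match.group(1)].append(name)
    return {base: sorted(members)
            for base, members in chains.items() if len(members) > 1}


def redundant_members(members, digest_of):
    """Return groups of chain members whose decompressed digests are equal."""
    by_digest = defaultdict(list)
    for name in members:
        by_digest[digest_of(name)].append(name)
    return [sorted(group) for group in by_digest.values() if len(group) > 1]


if __name__ == "__main__":
    base = "5f87fe62e8254679c582097314f97fe3"
    # Digests of the decompressed content, taken from the output above.
    digests = {base: "4266be808f85826aedf3c64c1e240203",
               base + "_1": "4266be808f85826aedf3c64c1e240203"}
    for chain in group_chains(digests).values():
        print(redundant_members(chain, digests.get))
```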

  
   But here we seem to have two files that are identical (and thus have
   the same checksum) but compress to 2 *different* results?
  
  Again: The file name is NOT the checksum of the whole file's contents!
  It's just an MD5 sum which incorporates the first 256k of the file and
  the file's original length (as I learned this week from this list).

Yes, but this is the OPPOSITE of a hash collision. A hash collision is
when 2 *different* (uncompressed) files have the *same* checksum.

Here 2 *identical* (uncompressed) files have *different* checksums.

As Craig and Holger explained, this is probably also attributable to
corruption, where some backups had the rsync caching seeds included and
others did not.

  
   This would seem to be going against the grain of pooling where two
   identical files share the same pool entry.
   
   What am I missing?
  
  That hash collisions are expected.
Yes, but that is not the case here.
  
  Tino.
  
  -- 
  What we nourish flourishes. - Was wir nähren erblüht.
  
  www.lichtkreis-chemnitz.de
  www.craniosacralzentrum.de
  



Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 Thread Les Mikesell
Jeffrey J. Kosowsky wrote:
 
 Is there a (reasonably easy) way of identifying which ones have the
 rsync checksum seed and which ones don't???

I think you are kind of missing the point that they could both be 
corrupted in other ways...

-- 
   Les Mikesell
[EMAIL PROTECTED]




Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 Thread Jeffrey J. Kosowsky
Les Mikesell wrote at about 10:27:20 -0500 on Friday, October 31, 2008:
  Jeffrey J. Kosowsky wrote:
   
   Is there a (reasonably easy) way of identifying which ones have the
   rsync checksum seed and which ones don't???
  
  I think you are kind of missing the point that they could both be 
  corrupted in other ways...
  
I know but I would still like to understand this better since I am
writing some utilities to clean this up (with the caveat being that it
may still be corrupted)

  -- 
 Les Mikesell
  [EMAIL PROTECTED]
  
  



Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 Thread Holger Parplies
Hi,

Jeffrey J. Kosowsky wrote on 2008-10-31 15:26:58 -0400 [Re: [BackupPC-users] 2 
cpool files with same checksum, different (compressed   content) but same 
zcatt'ed content?]:
 Les Mikesell wrote at about 10:27:20 -0500 on Friday, October 31, 2008:
   Jeffrey J. Kosowsky wrote:

Is there a (reasonably easy) way of identifying which ones have the
rsync checksum seed and which ones don't???
   
   I think you are kind of missing the point that they could both be 
   corrupted in other ways...

I don't think these files as such are corrupted. They are decompressible (to
identical contents). I wouldn't trust the backup(s) as a whole - they may be
missing files for instance - but that's a different matter.

 I know but I would still like to understand this better since I am
 writing some utilities to clean this up (with the caveat being that it
 may still be corrupted)

I admit that I am missing the point why you are going to the trouble, but to
answer your question: it doesn't matter. BackupPC will add the checksums at
the appropriate time (next backup, I'd say) if they are missing (and enabled).
That said, I'd expect the version with checksums to be larger, because the
compressed content should be identical. Add to that the checksums in one case
and nothing in the other.
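
To illustrate how two pool files can differ byte for byte and still
decompress to the same content, here is a small Python sketch using the
standard zlib module. It is not BackupPC's file format: the trailer
bytes below merely stand in for extra data (such as cached checksums)
stored after the compressed stream, and the helper name
payload_and_trailer is made up for this example.

```python
import zlib


def payload_and_trailer(blob: bytes):
    """Decompress one zlib stream; return (content, bytes after the stream)."""
    d = zlib.decompressobj()
    content = d.decompress(blob) + d.flush()
    # Anything following the end of the zlib stream is left in unused_data.
    return content, d.unused_data


if __name__ == "__main__":
    data = b"example pool content\n" * 400

    # Different compression levels give different bytes for the same input,
    # so compression is not a one-to-one mapping.
    fast = zlib.compress(data, 1)
    best = zlib.compress(data, 9)
    print(fast != best)
    print(payload_and_trailer(fast)[0] == payload_and_trailer(best)[0])

    # Extra bytes after the stream make the file larger but leave the
    # decompressed content unchanged.
    with_trailer = best + b"HYPOTHETICAL-CHECKSUM-TRAILER"
    print(len(with_trailer) > len(best))
    print(payload_and_trailer(with_trailer)[0] == data)
```

So a plain cmp or md5sum on the compressed files can disagree even when
the decompressed contents are identical.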

Regards,
Holger



Re: [BackupPC-users] 2 cpool files with same checksum, different (compressed content) but same zcatt'ed content?????

2008-10-31 Thread Craig Barratt
Jeff writes:

 Is there a (reasonably easy) way of identifying which ones have the
 rsync checksum seed and which ones don't???

I'm reluctant to even say, because you are heading in an unproductive
direction.  But here goes: a compressed file without checksums starts
with 0x78 and a compressed file with checksums starts with 0xd6 or 0xd7.
See lib/BackupPC/FileZIO.pm.

The file sizes in the example you cite suggest the first has checksums
and the second does not.
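
A minimal Python sketch of that first-byte test, for anyone who wants to
script it. The function names and the returned labels are made up for
this example; only the byte values 0x78, 0xd6 and 0xd7 come from the
explanation above, and lib/BackupPC/FileZIO.pm remains the reference.

```python
def classify_first_byte(first_byte: int) -> str:
    """Map the first byte of a compressed pool file to a label."""
    if first_byte == 0x78:
        # Ordinary compressed stream, no checksums.
        return "no-checksums"
    if first_byte in (0xD6, 0xD7):
        # Compressed file with checksums.
        return "checksums"
    # Anything else is not covered by the rule above.
    return "unknown"


def classify_pool_file(path: str) -> str:
    """Read one byte in binary mode and classify the file."""
    with open(path, "rb") as fh:
        head = fh.read(1)
    if not head:
        return "empty"
    return classify_first_byte(head[0])


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, classify_pool_file(name))
```

Opening the file in binary mode matters here; reading it as text, or
reading the wrong offset, can easily produce misleading values for the
first byte.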

Les writes:

 I think you are kind of missing the point that they could both be
 corrupted in other ways...

Exactly!  Can we stop this thread please :)?

Craig
