Re: [Bacula-users] checksumming problem

2007-07-01 Thread Jon Wilson
I don't think I have turned off the output:

[EMAIL PROTECTED] lib]$ grep OUTPUT_BASE64 md5.*
md5.c:#define OUTPUT_BASE64 1
md5.c:#ifdef OUTPUT_BASE64

The problem is still the one I described earlier:

 So I have the following, which don't match:

 1) System MD5sum from file on disk
 2) Bacula md5sum, from file on disk
 3) Base64 encoding of (2)
 4) Base64 hash stored in the catalog

 1 and 2 don't match, which is confusing me.

 3 and 4 don't match either. I'm rapidly concluding that (4) is just 
 garbage, due to whatever nonsense our in-house scripts introduced into 
 the system many months ago.

-- 
Jon Wilson [EMAIL PROTECTED]
Systems Administration Manager

PO Box H58, Australia Square, Sydney NSW 1215
Level 2, 9 Castlereagh Street, Sydney Office Tel:+61 9231 5888
Direct Tel:+61 2 9236 9118  Fax: +61 2 9231 5988
www.sirca.org.au

DISCLAIMER: The contents of this email, inclusive of attachments, may be 
legally privileged and confidential. Any unauthorised use of the 
contents is expressly prohibited. If you have received this email 
message in error or are not the intended recipient, you should destroy 
the message along with any attachment(s). Unintended recipients of this 
email are prohibited from retaining, disclosing, distributing or using 
any information herein. This email is also subject to copyright. No part 
of it should be reproduced, adapted or transmitted without the written 
consent of the copyright owner.


___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] checksumming problem

2007-06-28 Thread Jon Wilson

Thanks Kern.

Kern Sibbald wrote:
 It seems like the checksumming and base64 encoding in the version of
 Bacula we have is broken. We can't just run md5sum on the command line
 and compare against what's in the Catalog. The bug report for this is 
 here (fixed in newer releases):
 
 The code was not broken.  The hash code is identical to the RFC defining it.  
 However, when Bacula was originally written, there was no RFC for base64 
 encoding or I wasn't aware of it, so the base64 printable representation was 
 different from what md5sum produces.

Noted. Perhaps unusual is a better word than broken :-)
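
To make "unusual" concrete, here is a minimal sketch that applies the standard RFC 4648 encoding (via Python's base64 module) to the /usr/bin/md5sum digest quoted in this thread and compares the result, character by character, with the File.MD5 value from the catalog. The helper name standard_base64 is hypothetical; the two strings are copied from the thread.

```python
import base64


def standard_base64(hex_digest: str) -> str:
    """Encode a hex digest with the standard RFC 4648 base64 alphabet."""
    return base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")


# Values quoted in this thread.
system_digest = "b622fe0270478b3e326cd60ce196d10d"  # /usr/bin/md5sum output
catalog_value = "ti/+AnBHiz4yb9YM45/RDB"            # File.MD5 in the catalog

# The catalog value carries no '=' padding, so strip it before comparing.
encoded = standard_base64(system_digest).rstrip("=")

# Positions at which the two 22-character strings disagree.
differing = [i for i, (a, b) in enumerate(zip(encoded, catalog_value)) if a != b]

print(encoded)    # tiL+AnBHiz4ybNYM4ZbRDQ
print(differing)  # [2, 13, 17, 18, 21]
```

The two strings agree in 17 of 22 positions, so the catalog value looks like a close relative of the standard encoding rather than random data.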

 Some questions:
 
 1) Are the checksums stored in Bacula any use to me at all?
 
 Yes, of course, but they are not so easy to use without a bit of work.

OK. We can do some work.

 2) Can I set up a Verify task, and compare the stored checksums
 against the disk copies? Or is this going to be too hard, given that the
 InitCatalog job was never run, and we have moved files around?
 
 The closest thing to what you want is a Verify VolumeToCatalog.  It will 
 verify that the hash codes stored on the Volume are the same as those in the 
 catalog. This means that Bacula must read the whole tape to find the hash 
 codes, but they are pre-computed, so Bacula does not actually read the data 
 on the Volume and re-compute a hash code (considering the data such as Win32 
 data, ... that would be a nearly impossible task anyway).

Re-reading all the tapes is not an option, I'm afraid.

 There is a C++ program (actually written in C) in 
 bacula-source/src/lib/md5.c that you can tweak and compile into an md5 
 program that does the same thing as md5sum, but using the Bacula base64 
 encoding scheme ...
 
 If I remember right, if you have already configured and built Bacula, you can 
 simply:
 
   cd bacula-source/src/lib
   make md5sum
 
 and it will create a binary named md5 (I think) that you can execute.  If 
 not, just tweak the Makefile.

Right, got the 1.36.3 source. Compiled the md5sum binary.
It does something, but gives a hex answer. The database has a base64 
entry. Hmmm.

[EMAIL PROTECTED] lib]# ./md5sum /path/to/datfix/2101.1.dat
c4e8a6c211039eae6e94da1620b35581

Now compare this to the catalog in MySQL:

mysql> select Filename.FilenameId,Path.Path,Filename.Name,File.MD5 from 
Path,File,Filename where Filename.Name='2101.1.dat' and 
Filename.FilenameId=File.FilenameId and File.PathId=Path.PathId and Path 
like '%/datfix/%';
+------------+-----------------+------------+------------------------+
| FilenameId | Path            | Name       | MD5                    |
+------------+-----------------+------------+------------------------+
|     137634 | /original/path/ | 2101.1.dat | ti/+AnBHiz4yb9YM45/RDB |
|     137634 | /original/path/ | 2101.1.dat | ti/+AnBHiz4yb9YM45/RDB |
+------------+-----------------+------------+------------------------+
2 rows in set (0.00 sec)

(There are two entries, because we do on and off-site backup runs)

How do I get from c4e8a6c211039eae6e94da1620b35581 to 
ti/+AnBHiz4yb9YM45/RDB or vice-versa? I think I am stuck.

For reference, the standard Linux md5sum gives this:

[EMAIL PROTECTED] lib]# /usr/bin/md5sum /path/to/datfix/2000/2101.1.dat
b622fe0270478b3e326cd60ce196d10d

I have checked all this with some other files, and the behaviour is the 
same.
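
One way to get from a hex digest to the catalog form is to re-implement the old encoder. The sketch below is a minimal one under a stated assumption: that the legacy bin_to_base64() treats each input byte as a signed char (so bytes >= 0x80 are sign-extended before being OR-ed into the accumulator) and emits the leftover bits without shifting or '=' padding. That behaviour is inferred from the hash pairs quoted in this thread, not taken from a specification, and the function name bacula_legacy_base64 is hypothetical.

```python
B64_DIGITS = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def bacula_legacy_base64(data: bytes) -> str:
    """Encode bytes the way the old catalog values appear to be encoded."""
    reg = 0   # bit accumulator, masked to 32 bits like a C uint32_t
    rem = 0   # number of bits in reg that have not been emitted yet
    i = 0
    out = []
    while i < len(data):
        if rem < 6:
            byte = data[i]
            i += 1
            if byte >= 0x80:
                # Assumed quirk: the byte is read as a signed char, so the
                # sign bit is extended across the upper 24 bits.
                byte |= 0xFFFFFF00
            reg = ((reg << 8) | byte) & 0xFFFFFFFF
            rem += 8
        # Emit the top 6 pending bits as one digit.
        out.append(B64_DIGITS[(reg >> (rem - 6)) & 0x3F])
        rem -= 6
        reg &= (1 << rem) - 1
    if rem:
        # Leftover bits are emitted as-is: no left shift, no '=' padding.
        out.append(B64_DIGITS[reg & ((1 << rem) - 1)])
    return "".join(out)


if __name__ == "__main__":
    # Digest reported by /usr/bin/md5sum earlier in this message.
    print(bacula_legacy_base64(bytes.fromhex("b622fe0270478b3e326cd60ce196d10d")))
    # prints ti/+AnBHiz4yb9YM45/RDB
```

Under that assumption, the /usr/bin/md5sum digest above encodes to exactly the string stored in the catalog. That would suggest the catalog entry corresponds to the system md5sum result rather than to the output of the locally built binary. Because the sign extension overwrites some bits, the encoding cannot be reversed reliably, so the comparison has to go from hex to base64 and not the other way round.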

Thanks,

Jon



Re: [Bacula-users] checksumming problem

2007-06-28 Thread Jon Wilson
Kern Sibbald wrote:
 [EMAIL PROTECTED] lib]# ./md5sum /path/to/datfix/2101.1.dat
 c4e8a6c211039eae6e94da1620b35581

 Now compare this to the catalog in MySQL:

 mysql> select Filename.FilenameId,Path.Path,Filename.Name,File.MD5 from 
 Path,File,Filename where Filename.Name='2101.1.dat' and 
 Filename.FilenameId=File.FilenameId and File.PathId=Path.PathId and Path 
 like '%/datfix/%';
 +------------+-----------------+------------+------------------------+
 | FilenameId | Path            | Name       | MD5                    |
 +------------+-----------------+------------+------------------------+
 |     137634 | /original/path/ | 2101.1.dat | ti/+AnBHiz4yb9YM45/RDB |
 |     137634 | /original/path/ | 2101.1.dat | ti/+AnBHiz4yb9YM45/RDB |
 +------------+-----------------+------------+------------------------+
 2 rows in set (0.00 sec)

 (There are two entries, because we do on and off-site backup runs)

 How do I get from c4e8a6c211039eae6e94da1620b35581 to 
 ti/+AnBHiz4yb9YM45/RDB or vice-versa? I think I am stuck.

 For reference, the standard Linux md5sum gives this:

 [EMAIL PROTECTED] lib]# /usr/bin/md5sum /path/to/datfix/2000/2101.1.dat
 b622fe0270478b3e326cd60ce196d10d

 I have checked all this with some other files, and the behaviour is the 
 same.
 
 I guess 1.36.3, which is pretty old, is probably missing some code that is in 
 md5.c in the current SVN.  I'll send a second email with the current md5.c 
 attached -- off list.  It *should* compile and run fine in your 1.36.3 code, 
 just save the original for comparison if you run into problems.

Thanks for that Kern.

I had to hack it slightly, since there are now extra arguments to 
bin_to_base64().

Now I get results like this:

[EMAIL PROTECTED] lib]# ./md5sum /path/to/datfix/2000/2101.1.dat
c4e8a6c211039eae6e94da1620b35581  x++mwhEDn65ul9oWI7NVgB /path/to/datfix/2000/2101.1.dat

So I have the following, which don't match:

1) System MD5sum from file on disk
2) Bacula md5sum, from file on disk
3) Base64 encoding of (2)
4) Base64 hash stored in the catalog

1 and 2 don't match, which is confusing me.

3 and 4 don't match either. I'm rapidly concluding that (4) is just 
garbage, due to whatever nonsense our in-house scripts introduced into 
the system many months ago.

Jon



[Bacula-users] checksumming problem

2007-06-24 Thread Jon Wilson

Hi,

I have an interesting problem, regarding checksumming a large number
of big files prior to a storage migration.

We have a fairly old install of Bacula, v1.36, running on Redhat
Enterprise Linux v3, with a MySQL 3.23 database.

We are shortly going to upgrade our storage system. I want to check for
bit-rot before commencing the migration. Bacula seems to have some
checksums stored, so let's use them!

To complicate things, many files have been moved around in order to
optimise use of different storage media. We have some scripts which
ensure Bacula backs up the correct files, and doesn't back up big
files twice. But this means that the paths in the catalog don't match the
paths on disk.

It seems like the checksumming and base64 encoding in the version of
Bacula we have is broken. We can't just run md5sum on the command line
and compare against what's in the Catalog. The bug report for this is 
here (fixed in newer releases):

http://bugs.bacula.org/bug_view_advanced_page.php?bug_id=565

My first thought was to set up a Verify task in our current 
configuration, and get Bacula to natively check the files against the 
disk. It shouldn't matter that the encoding is broken, since we will be 
using the same (broken) code for verification. But when I do this, it 
complains that there has been no InitCatalog task done. I'm not sure 
whether this is related to our file moves, or because we initially 
misconfigured things. I guess we should have been checking this 
functionality from the start :-(

Some questions:

1) Are the checksums stored in Bacula any use to me at all?

2) Can I set up a Verify task, and compare the stored checksums
against the disk copies? Or is this going to be too hard, given that the
InitCatalog job was never run, and we have moved files around?

3) Can I write a command line checksum program, which will output
checksums, and encode them in a way I can compare with the catalog? How
can I do this in Perl, or C? I've had a look at the internals of the
Bacula libraries, and my meager C skills are not sufficient to work
things out. (This is my preferred option)
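
For option 3, the hashing half is straightforward in most languages. The following is a minimal sketch in Python rather than Perl or C, assuming only the standard hashlib module. It reads each file in chunks so that large files are not loaded into memory, and prints the hex MD5 digest next to the file's base name, since the directory part of the path may no longer match the catalog. The function name md5_hex and the 1 MiB chunk size are arbitrary choices. Turning the hex digest into the form stored in the catalog is a separate encoding step that this sketch does not attempt.

```python
import hashlib
import os
import sys


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def main(paths):
    for path in paths:
        # Print the base name only: files have moved, so the directory
        # recorded in the catalog may differ from the one on disk.
        print(f"{md5_hex(path)}  {os.path.basename(path)}")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a hypothetical name such as md5names.py, it would be run as "python md5names.py file1 file2 ...", and its digests should agree with those from the system md5sum.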

Thanks,

-- 
Jon Wilson [EMAIL PROTECTED]