Re: [Bacula-users] checksumming problem
I don't think I have turned off the output:

    [EMAIL PROTECTED] lib]$ grep OUTPUT_BASE64 md5.*
    md5.c:#define OUTPUT_BASE64 1
    md5.c:#ifdef OUTPUT_BASE64

The problem is now that this:

So I have the following, which don't match:

1) System MD5sum from file on disk
2) Bacula md5sum, from file on disk
3) Base64 encoding of (2)
4) Base64 hash stored in the catalog

1 and 2 don't match, which is confusing me. 3 and 4 don't match either.
I'm rapidly concluding that (4) is just garbage, due to whatever nonsense
our in-house scripts introduced into the system many months ago.

--
Jon Wilson [EMAIL PROTECTED]
Systems Administration Manager
PO Box H58, Australia Square, Sydney NSW 1215
Level 2, 9 Castlereagh Street, Sydney
Office Tel:+61 9231 5888 Direct Tel:+61 2 9236 9118 Fax: +61 2 9231 5988
www.sirca.org.au

DISCLAIMER: The contents of this email, inclusive of attachments, may be
legally privileged and confidential. Any unauthorised use of the contents
is expressly prohibited. If you have received this email message in error
or are not the intended recipient, you should destroy the message along
with any attachment(s). Unintended recipients of this email are
prohibited from retaining, disclosing, distributing or using any
information herein. This email is also subject to copyright. No part of
it should be reproduced, adapted or transmitted without the written
consent of the copyright owner.

This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/

Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] checksumming problem
Thanks Kern.

Kern Sibbald wrote:
> > It seems like the checksumming and base64 encoding in the version of
> > Bacula we have is broken. We can't just run md5sum on the command
> > line and compare against what's in the Catalog. The bug report for
> > this is here (fixed in newer releases):
>
> The code was not broken. The hash code is identical to the RFC defining
> it. However, when Bacula was originally written, there was no RFC for
> 64 bit encoding or I wasn't aware of it, so the 64 bit printable
> representation was different from what md5sum produces.

Noted. Perhaps unusual is a better word than broken :-)

> > Some questions:
> >
> > 1) Are the checksums stored in Bacula any use to me at all?
>
> Yes, of course, but they are not so easy to use without a bit of work.

OK. We can do some work.

> > 2) Can I set up a Verify task, and compare the stored checksums
> > against the disk copies? Or is this going to be too hard, given that
> > the InitCatalog job was never run, and we have moved files around?
>
> The closest thing to what you want is a Verify VolumeToCatalog. It will
> verify that the hash codes stored on the Volume are the same as those
> in the catalog. This means that Bacula must read the whole tape to find
> the hash codes, but they are pre-computed, so Bacula does not actually
> read the data on the Volume and re-compute a hash code (considering the
> data such as Win32 data, ... that would be a nearly impossible task
> anyway).

Re-reading all the tapes is not an option, I'm afraid.

> There is a C++ program (actually written in C) in
> bacula-source/src/lib/md5.c that you can tweak and compile into a
> program that will permit you to create an md5 program that does the
> same thing as md5sum, but using the Bacula 64 bit encoding scheme ...
>
> If I remember right, if you have already configured and built Bacula,
> you can simply:
>
>     cd bacula-source/src/lib
>     make md5sum
>
> and it will create a binary named md5 (I think) that you can execute.
> If not, just tweak the Makefile.

Right, got the 1.36.3 source. Compiled the md5sum binary.

It does something, but gives a hex answer. The database has a base64
entry. Hmmm.

    [EMAIL PROTECTED] lib]# ./md5sum /path/to/datfix/2101.1.dat
    c4e8a6c211039eae6e94da1620b35581

Now compare this to the catalog in MySQL:

    mysql> select Filename.FilenameId,Path.Path,Filename.Name,File.MD5
        from Path,File,Filename
        where Filename.Name='2101.1.dat'
          and Filename.FilenameId=File.FilenameId
          and File.PathId=Path.PathId
          and Path like '%/datfix/%';
    +------------+-----------------+------------+------------------------+
    | FilenameId | Path            | Name       | MD5                    |
    +------------+-----------------+------------+------------------------+
    |     137634 | /original/path/ | 2101.1.dat | ti/+AnBHiz4yb9YM45/RDB |
    |     137634 | /original/path/ | 2101.1.dat | ti/+AnBHiz4yb9YM45/RDB |
    +------------+-----------------+------------+------------------------+
    2 rows in set (0.00 sec)

(There are two entries, because we do on-site and off-site backup runs.)

How do I get from c4e8a6c211039eae6e94da1620b35581 to
ti/+AnBHiz4yb9YM45/RDB or vice versa? I think I am stuck.

For reference, the standard Linux md5sum gives this:

    [EMAIL PROTECTED] lib]# /usr/bin/md5sum /path/to/datfix/2000/2101.1.dat
    b622fe0270478b3e326cd60ce196d10d

I have checked all this with some other files, and the behaviour is the
same.

Thanks,
Jon
Re: [Bacula-users] checksumming problem
Kern Sibbald wrote:
> >     [EMAIL PROTECTED] lib]# ./md5sum /path/to/datfix/2101.1.dat
> >     c4e8a6c211039eae6e94da1620b35581
> >
> > Now compare this to the catalog in MySQL:
> >
> >     mysql> select Filename.FilenameId,Path.Path,Filename.Name,File.MD5
> >         from Path,File,Filename
> >         where Filename.Name='2101.1.dat'
> >           and Filename.FilenameId=File.FilenameId
> >           and File.PathId=Path.PathId
> >           and Path like '%/datfix/%';
> >     +------------+-----------------+------------+------------------------+
> >     | FilenameId | Path            | Name       | MD5                    |
> >     +------------+-----------------+------------+------------------------+
> >     |     137634 | /original/path/ | 2101.1.dat | ti/+AnBHiz4yb9YM45/RDB |
> >     |     137634 | /original/path/ | 2101.1.dat | ti/+AnBHiz4yb9YM45/RDB |
> >     +------------+-----------------+------------+------------------------+
> >     2 rows in set (0.00 sec)
> >
> > (There are two entries, because we do on and off-site backup runs)
> >
> > How do I get from c4e8a6c211039eae6e94da1620b35581 to
> > ti/+AnBHiz4yb9YM45/RDB or vice-versa? I think I am stuck.
> >
> > For reference, the standard linux md5sum gives this:
> >
> >     [EMAIL PROTECTED] lib]# /usr/bin/md5sum /path/to/datfix/2000/2101.1.dat
> >     b622fe0270478b3e326cd60ce196d10d
> >
> > I have checked all this with some other files, and the behaviour is
> > the same.
>
> I guess 1.36.3, which is pretty old is probably missing some code that
> is in md5.c in the current SVN. I'll send a second email with the
> current md5.c attached -- off list. It *should* compile and run fine in
> your 1.36.3 code, just save the original for comparison if you run into
> problems.

Thanks for that Kern. I had to hack it slightly, since there are now
extra arguments to bin_to_base64(). Now I get results like this:

    [EMAIL PROTECTED] lib]# ./md5sum /path/to/datfix/2000/2101.1.dat
    c4e8a6c211039eae6e94da1620b35581 x++mwhEDn65ul9oWI7NVgB /path/to/datfix/2000/2101.1.dat

So I have the following, which don't match:

1) System MD5sum from file on disk
2) Bacula md5sum, from file on disk
3) Base64 encoding of (2)
4) Base64 hash stored in the catalog

1 and 2 don't match, which is confusing me. 3 and 4 don't match either.
I'm rapidly concluding that (4) is just garbage, due to whatever nonsense
our in-house scripts introduced into the system many months ago.
Jon
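
For anyone cross-checking the four values in this message offline, here is a minimal Python sketch. It is a reconstruction under stated assumptions, not code taken from Bacula: it assumes the older, non-RFC encoder loads each byte as a signed char (so any byte >= 0x80 sets all higher bits of the 32-bit shift register) and emits the leftover bits without shifting or '=' padding. The name bacula_legacy_base64 is made up for illustration.

```python
# Assumed alphabet: the standard base64 digit set.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def bacula_legacy_base64(data: bytes) -> str:
    """Encode bytes the way the old, non-RFC encoder is assumed to.

    Assumption (not confirmed by the thread): bytes are sign-extended
    before being OR-ed into a 32-bit register, and no padding is added.
    """
    reg = 0   # 32-bit shift register
    rem = 0   # number of bits in reg not yet emitted
    out = []
    i = 0
    while i < len(data):
        if rem < 6:
            reg = (reg << 8) & 0xFFFFFFFF
            byte = data[i]
            i += 1
            if byte >= 0x80:
                byte |= 0xFFFFFF00  # emulate signed-char sign extension
            reg |= byte
            rem += 8
        out.append(ALPHABET[(reg >> (rem - 6)) & 0x3F])
        rem -= 6
    if rem:
        # Leftover bits are used as-is (not left-aligned as in RFC base64).
        out.append(ALPHABET[reg & ((1 << rem) - 1)])
    return "".join(out)


if __name__ == "__main__":
    for hex_digest in (
        "c4e8a6c211039eae6e94da1620b35581",  # (2) Bacula md5sum output
        "b622fe0270478b3e326cd60ce196d10d",  # (1) system md5sum output
    ):
        print(hex_digest, bacula_legacy_base64(bytes.fromhex(hex_digest)))
```

The assumption is at least consistent with the data quoted here: it turns c4e8a6c211039eae6e94da1620b35581 into x++mwhEDn65ul9oWI7NVgB, the pair printed by the patched ./md5sum. Under the same assumption, b622fe0270478b3e326cd60ce196d10d encodes to ti/+AnBHiz4yb9YM45/RDB, which is the catalog value. If that holds, (4) corresponds to (1) and (3) to (2), so the only real disagreement would be between the two hex digests.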
[Bacula-users] checksumming problem
Hi,

I have an interesting problem, regarding checksumming a large number of
big files prior to a storage migration.

We have a fairly old install of Bacula, v1.36, running on Red Hat
Enterprise Linux v3, with a MySQL 3.23 database. We are shortly going to
upgrade our storage system. I want to check for bit-rot before
commencing the migration. Bacula seems to have some checksums stored, so
let's use them!

To complicate things, many files have been moved around, in order to
optimise use of different storage media. We have some scripts which
ensure Bacula backs up the correct files, and doesn't double back-up big
files. But this means that the paths in the catalog don't match the
paths on disk.

It seems like the checksumming and base64 encoding in the version of
Bacula we have is broken. We can't just run md5sum on the command line
and compare against what's in the Catalog. The bug report for this is
here (fixed in newer releases):

http://bugs.bacula.org/bug_view_advanced_page.php?bug_id=565

My first thought was to set up a Verify task in our current
configuration, and get Bacula to natively check the files against the
disk. It shouldn't matter that the encoding is broken, since we will be
using the same (broken) code for verification. But when I do this, it
complains that there has been no InitCatalog task done. I'm not sure
whether this is related to our file moves, or because we initially
misconfigured things. I guess we should have been checking this
functionality from the start :-(

Some questions:

1) Are the checksums stored in Bacula any use to me at all?

2) Can I set up a Verify task, and compare the stored checksums against
the disk copies? Or is this going to be too hard, given that the
InitCatalog job was never run, and we have moved files around?

3) Can I write a command line checksum program, which will output
checksums, and encode them in a way I can compare with the catalog? How
can I do this in Perl, or C? I've had a look at the internals of the
Bacula libraries, and my meager C skills are not sufficient to work
things out. (This is my preferred option)

Thanks,
Jon Wilson
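
As a starting point for question 3, here is a minimal sketch in Python (rather than Perl or C) that computes the same hex digest as the system md5sum, reading the file in chunks so that big files never have to fit in memory. The function name md5_hex and the 1 MiB chunk size are arbitrary choices for illustration. It does not attempt the catalog's base64 form; the raw digest bytes would still need to be run through whatever encoding the catalog uses.

```python
import hashlib


def md5_hex(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in fixed-size chunks, so memory use stays flat
    regardless of file size.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_hex("/path/to/datfix/2000/2101.1.dat") should give the same string as /usr/bin/md5sum prints for that file. Because the catalog paths no longer match the paths on disk, any comparison would have to be keyed on something else, for example the file name as in the MySQL query quoted elsewhere in this thread.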