André Kelpe created HADOOP-9928:
-----------------------------------

             Summary: provide md5, sha1 and .asc files, that are usable
                 Key: HADOOP-9928
                 URL: https://issues.apache.org/jira/browse/HADOOP-9928
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 1.2.1, 2.1.0-beta
            Reporter: André Kelpe
            Priority: Critical


I am trying to verify the checksums of tarballs I downloaded, and it seems that 
the way they are produced is anything but useful. 

Almost all other open source projects I know of provide .md5, .sha1 and .asc 
files that can easily be used with tools like md5sum, sha1sum or gpg. 

The Hadoop downloads provide a .mds file instead, and there seems to be no 
documentation on how to use it.

Here are some challenges with that format:

0. all sorts of checksums are in the same file
1. The MD5 sum is all upper case (all of them are, to be precise)
2. The MD5 sum contains whitespace

For the three above I came up with this interesting construct:

md5sum --check  <(grep "MD5 = " some-file.mds | sed -e "s/MD5 = //g;s/ //g" | 
awk -F: '{print tolower($2), "", $1}')

That would work if it were not for the next problem:

3. The file format wraps lines after 80 characters (see here for instance: 
http://apache.openmirror.de/hadoop/core/beta/hadoop-2.1.0-beta-src.tar.gz.mds)
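
One possible way around the wrapping is to join continuation lines before 
handing the result to md5sum or sha1sum. The following is only a rough sketch: 
the function name mds_to_sums is made up, and the layout it parses 
("<filename>: <ALGO> = <hex groups>", with wrapped lines holding nothing but 
more hex groups) is a guess based on the one-liner above, not on any 
documentation.

```shell
# Hypothetical helper: mds_to_sums ALGO FILE.mds
# Prints "<lowercase hex>  <filename>" lines for the chosen algorithm.
# Assumed layout (undocumented, so treat it as a guess):
#   <filename>: <ALGO> = <hex groups separated by spaces>
# with wrapped continuation lines that hold only more hex groups.
mds_to_sums() {
  awk -v algo="$1" '
    # Print the record collected so far, if it was for the wanted algorithm.
    function emit() {
      if (name != "") {
        gsub(/[ \t\r]/, "", hex)
        print tolower(hex) "  " name
      }
      name = ""
      hex = ""
    }
    {
      pos = index($0, " = ")
      if (pos > 0) {
        # A line with " = " starts a new record.
        emit()
        left = substr($0, 1, pos - 1)
        if (match(left, /:[^:]*$/)) {
          label = substr(left, RSTART + 1)
          gsub(/[ \t]/, "", label)
          if (label == algo) {
            name = substr(left, 1, RSTART - 1)
            hex = substr($0, pos + 3)
          }
        }
      } else if (name != "") {
        # Continuation of a wrapped checksum.
        hex = hex $0
      }
    }
    END { emit() }
  ' "$2"
}
```

If the guess about the layout holds, the output can be piped straight into 
the usual tools, for example: mds_to_sums MD5 some-file.mds | md5sum --check -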

I really do not see how this format is useful to anyone.

4. On top of all that, there are no gpg signatures. How can I verify that the 
mirror I downloaded the tarball from was not compromised?

It would be very helpful if you could provide checksums and signatures the 
same way other projects do, or at least explain how to use the mds files 
with standard unix tools.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
