[ 
https://issues.apache.org/jira/browse/HADOOP-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554432#comment-13554432
 ] 

Todd Lipcon commented on HADOOP-9209:
-------------------------------------

Yeah... the issue is that the distinct properties are odd. Here's a first 
crack at how I understand it:

- If the checksum "algorithm names" are different, then we can say nothing 
about whether the files are identical. (Does the "algorithm name" fully 
encompass things like the block size?)
- If the checksum "algorithm names" are the same, and the checksums are the 
same, then the files are probably identical (barring a hash collision).
- If the checksum "algorithm names" are the same, but the checksums differ, 
then the files are definitely not identical.
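
The three cases above can be sketched as a small decision function. This is 
only an illustration, not Hadoop code: the Checksum tuple, the Verdict enum, 
and compare() are hypothetical names, and the sketch assumes the algorithm 
name string fully captures parameters such as the block size, which is 
exactly the open question here.

```python
from enum import Enum
from typing import NamedTuple


class Verdict(Enum):
    UNKNOWN = "unknown"              # algorithm names differ: no conclusion
    PROBABLY_SAME = "probably same"  # equal checksums, barring a hash collision
    DIFFERENT = "different"          # same algorithm, different checksums


class Checksum(NamedTuple):
    # Hypothetical stand-in for a file checksum: algorithm name plus raw bytes.
    algorithm: str
    value: bytes


def compare(a: Checksum, b: Checksum) -> Verdict:
    # Assumption: the algorithm name encodes every parameter that affects
    # the checksum value (for example, the block size).
    if a.algorithm != b.algorithm:
        return Verdict.UNKNOWN
    if a.value == b.value:
        return Verdict.PROBABLY_SAME
    return Verdict.DIFFERENT
```

If the algorithm name turns out not to include the block size, the first 
check would also have to compare whatever extra parameters distinguish the 
two checksums before the other two verdicts could be trusted.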

Does that mesh with your understanding? Or does the block size not properly 
propagate into the algorithm name string? (And if that's the case, in what 
cases can we actually make definitive judgments?)
                
> Add shell command to dump file checksums
> ----------------------------------------
>
>                 Key: HADOOP-9209
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9209
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, tools
>    Affects Versions: 3.0.0, 2.0.3-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-9209.txt, hadoop-9209.txt
>
>
> Occasionally while working with tools like distcp, or debugging certain 
> issues, it's useful to be able to quickly see the checksum of a file. We 
> currently have the APIs to efficiently calculate a checksum, but we don't 
> expose it to users. This JIRA is to add a "fs -checksum" command which dumps 
> the checksum information for the specified file(s).

