The fs -checksum output carries more info than a plain MD5, such as the bytes per CRC and CRCs per block. See e.g.: https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java
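If you want the same information programmatically, here is a minimal sketch against the FileSystem API (the class name ShowChecksum is just a placeholder, /abc.txt is taken from the example below, and error handling is omitted):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.StringUtils;

public class ShowChecksum {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS in the loaded configuration points at the cluster.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Same data as `hadoop fs -checksum /abc.txt`: the algorithm name embeds
    // crcPerBlock and bytesPerCRC, the bytes are the serialized checksum
    // (an MD5 over per-block MD5s of per-chunk CRCs, not the file's MD5).
    FileChecksum checksum = fs.getFileChecksum(new Path("/abc.txt"));
    System.out.println(checksum.getAlgorithmName());
    System.out.println(StringUtils.byteToHexString(checksum.getBytes()));
  }
}

If I read the linked class right, the serialized bytes are bytesPerCRC (4-byte int), then crcPerBlock (8-byte long), then the 16-byte MD5-of-MD5s, so the trailing 32 hex characters of the shell output below are not the file's plain MD5 and will not match md5sum.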
In order to avoid dealing with different formatting or byte order, you could use md5sum for the remote file as well, if the file is reasonably small:

hadoop fs -cat /abc.txt | md5sum

(A longer sketch of the same comparison follows below the quoted message.)

On Fri, Aug 7, 2015 at 3:35 AM Shashi Vishwakarma <shashi.vish...@gmail.com> wrote:

> Hi
>
> I have a small confusion regarding checksum verification. Let's say I have
> a file abc.txt and I transferred this file to HDFS. How do I ensure data
> integrity?
>
> I followed the steps below to check that the file was transferred correctly.
>
> *On Local File System:*
>
> md5sum abc.txt
>
> 276fb620d097728ba1983928935d6121 TestFile
>
> *On Hadoop Cluster:*
>
> hadoop fs -checksum /abc.txt
>
> /abc.txt MD5-of-0MD5-of-512CRC32C
> 000002000000000000000000911156a9cf0d906c56db7c8141320df0
>
> Both outputs look different to me. Let me know if I am doing anything
> wrong.
>
> How do I verify that my file was transferred properly into HDFS?
>
> Thanks
> Shashi
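For completeness, a minimal Java sketch of the same md5sum comparison. The class name Md5Compare and the paths are placeholders for this example; it assumes fs.defaultFS points at your cluster and that the local copy of abc.txt sits in the working directory:

import java.io.InputStream;
import java.security.MessageDigest;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.StringUtils;

public class Md5Compare {

  // Stream a file through MD5 and return the digest as a hex string.
  static String md5Of(FileSystem fs, Path path) throws Exception {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] buf = new byte[64 * 1024];
    try (InputStream in = fs.open(path)) {
      int n;
      while ((n = in.read(buf)) != -1) {
        md5.update(buf, 0, n);
      }
    }
    return StringUtils.byteToHexString(md5.digest());
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Local copy vs. the copy that was put into HDFS (placeholder paths).
    String local = md5Of(FileSystem.getLocal(conf), new Path("abc.txt"));
    String remote = md5Of(FileSystem.get(conf), new Path("/abc.txt"));
    System.out.println(local.equals(remote) ? "MD5 matches" : "MD5 MISMATCH");
  }
}

Like the shell pipe above, this reads the whole file back over the network, so it is only practical for reasonably small files.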