How checksums are generated for blocks in the data node

2014-03-28 Thread reena upadhyay
I was going through this link http://stackoverflow.com/questions/9406477/data-integrity-in-hdfs-which-data-nodes-verifies-the-checksum . It's written that in recent versions of Hadoop, only the last data node verifies the checksum, since the write happens in a pipeline fashion. Now I have a question:

Re: How checksums are generated for blocks in the data node

2014-03-28 Thread Wellington Chevreuil
Hi Reena, the pipeline is per block. If half of your file is on data node A only, that means the pipeline for that block had only one node (node A, probably because the replication factor is set to 1), so data node A holds the checksums for its block. The same applies to data node B.
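To make the per-block checksum idea concrete, here is a minimal sketch of how a data node could generate and verify per-chunk checksums for a block. It assumes HDFS's default chunk size of 512 bytes (dfs.bytes-per-checksum); Python's zlib.crc32 (plain CRC-32) stands in for the CRC-32C that recent HDFS versions actually use, so this is illustrative only, not the real implementation.

```python
import zlib

# HDFS default: one checksum per 512-byte chunk (dfs.bytes-per-checksum)
BYTES_PER_CHECKSUM = 512


def block_checksums(block: bytes) -> list[int]:
    """Compute one checksum per fixed-size chunk of a block.

    HDFS uses CRC-32C; zlib.crc32 (CRC-32) is used here for illustration.
    """
    return [
        zlib.crc32(block[i : i + BYTES_PER_CHECKSUM])
        for i in range(0, len(block), BYTES_PER_CHECKSUM)
    ]


def verify_block(block: bytes, checksums: list[int]) -> bool:
    """Recompute checksums and compare against the stored ones, as the
    last data node in the write pipeline (or any reader) would."""
    return block_checksums(block) == checksums


# A block spanning three chunks (512 + 512 + 276 bytes)
data = b"x" * 1300
sums = block_checksums(data)
assert verify_block(data, sums)                        # intact block passes
assert not verify_block(data[:-1] + b"y", sums)        # corruption is caught
```

Because each data node in the pipeline stores its own copy of the block, each one also stores (and can independently verify) the checksum file for that block; verification during the write is simply done once, at the tail of the pipeline.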