I wonder what happens if an HDFS client writes to 3 replicas but only one of them succeeds and the other 2 fail. The client considers the operation a failure, but what happens to the successful block? If another client does a later read and hits the surviving replica, could it read out that data (which is probably not what was intended)? To present a completely consistent view, it seems there would have to be something like a two-phase commit protocol.
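To make the scenario concrete, here is a toy model of what I mean (this is not real HDFS code or its API, just the failure case I'm describing, with made-up Replica/client classes):

```python
# Toy model of a partial write: 3 replicas, 2 of them down.
# Invented classes for illustration only -- not the HDFS client API.

class Replica:
    def __init__(self, healthy):
        self.healthy = healthy
        self.block = None

    def write(self, data):
        if not self.healthy:
            raise IOError("datanode down")
        self.block = data

def client_write(replicas, data):
    """Try to write to every replica; report success only if all succeed."""
    ok = 0
    for r in replicas:
        try:
            r.write(data)
            ok += 1
        except IOError:
            pass
    return ok == len(replicas)

replicas = [Replica(True), Replica(False), Replica(False)]
success = client_write(replicas, b"new-data")
print(success)            # False: the client sees the write as failed

# ...but a later reader that happens to hit the surviving replica
# could still see the data that the "failed" write left behind:
print(replicas[0].block)  # b'new-data'
```

This is exactly the inconsistency I'm asking about: the failed write still left data on one replica.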
Also, what is the granularity of HDFS replication? For example, if I keep writing a long file to HDFS, does the client finish flushing a 128 MB block and then replicate it to all 3 replicas, or is every byte immediately copied to the 3 replicas?

Thanks,
Yang
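P.S. A toy sketch of the two alternatives I'm imagining (invented helper names and list-backed "replicas"; nothing here is the actual HDFS client):

```python
import io

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the HDFS block size I mentioned

def write_block_then_replicate(stream, replicas):
    """Alternative A: accumulate a full block client-side, then copy it out."""
    block = stream.read(BLOCK_SIZE)
    for r in replicas:
        r.append(block)

def write_streaming(stream, replicas, chunk=64 * 1024):
    """Alternative B: forward each small chunk to all replicas as it arrives."""
    while data := stream.read(chunk):
        for r in replicas:
            r.append(data)

replicas = [[], [], []]
write_streaming(io.BytesIO(b"x" * 200_000), replicas)
print(len(b"".join(replicas[0])))   # 200000 -- every replica got all bytes
```

Which of these two shapes is closer to what HDFS actually does?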