Hi, in the Strata 2013 training lectures, Jonathan Hsieh from Cloudera said something about HBase syncs that I'm trying to understand better.
He said that an HBase sync guarantees only that a write reaches the local disk of the region server responsible for that region, with in-memory copies going to 2 other machines in the HBase cluster. But I thought that when the write goes to the WAL on the first region server, the HDFS append would push it to 3 machines in total in the HDFS cluster. For the append to the WAL to succeed, doesn't the DataNode on that machine have to pipeline the write to 2 other DataNodes? I'm not sure what Jonathan meant when he said that 2 in-memory copies go to other HBase machines. And even when the memstore on the first region server fills up, doesn't the flush to the HFile get written to 3 HDFS nodes in total?
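
To make my mental model concrete, here is a toy sketch in plain Python. It is not HDFS or HBase code, and `ToyDataNode` and `pipeline_append` are names I made up for illustration. It assumes an append is acknowledged only after every DataNode in the pipeline has received the edit, and it tracks "received in memory" separately from "persisted to disk", because which of those a sync guarantees on each node is exactly the part I'm unsure about.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataNode:
    """Hypothetical stand-in for a DataNode; not a real HDFS class."""
    name: str
    memory: list = field(default_factory=list)  # received, not yet on disk
    disk: list = field(default_factory=list)    # persisted to local disk

    def receive(self, edit):
        # The node holds the edit in memory only.
        self.memory.append(edit)

    def persist(self):
        # Move everything held in memory onto disk.
        self.disk.extend(self.memory)
        self.memory.clear()


def pipeline_append(edit, pipeline):
    """Pass the edit to each node in order; succeed only if all received it."""
    acks = []
    for node in pipeline:
        node.receive(edit)
        acks.append(node.name)
    return len(acks) == len(pipeline)


# Assumed replication factor of 3: the local DataNode plus 2 others.
nodes = [ToyDataNode("dn1-local"), ToyDataNode("dn2"), ToyDataNode("dn3")]
acked = pipeline_append("put row1", nodes)

# Which nodes hold the edit at all, and which hold it on disk?
holders = [n.name for n in nodes if "put row1" in n.memory or "put row1" in n.disk]
on_disk = [n.name for n in nodes if "put row1" in n.disk]
print(acked, holders, on_disk)
```

In this toy model the acknowledged append leaves a copy on all 3 nodes, but none of them is on disk until `persist()` is called on that node. What I'd like to know is which nodes, if any, the real sync forces to disk before it returns.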