Re: Why datanode does a flush to disk after receiving a packet

2010-11-11 Thread Todd Lipcon
On Thu, Nov 11, 2010 at 7:31 AM, Thanh Do wrote: > Thanks Todd, > > In HDFS-6313, I see three APIs (sync, hflush, hsync), > and I assume hflush corresponds to: > > *"API2: flushes out to all replicas of the block. > The data is in the buffers of the DNs but not on the DN's OS buffers. > New reade

Re: Why datanode does a flush to disk after receiving a packet

2010-11-11 Thread Thanh Do
Thanks Todd, In HDFS-6313, I see three APIs (sync, hflush, hsync), and I assume hflush corresponds to: *"API2: flushes out to all replicas of the block. The data is in the buffers of the DNs but not on the DN's OS buffers. New readers will see the data after the call has returned.*" I am still c

Re: Why datanode does a flush to disk after receiving a packet

2010-11-10 Thread Todd Lipcon
Nope, flush just flushes the Java-side buffer to the Linux buffer cache -- not all the way to the media. Hsync is the API that will eventually go all the way to disk, but it has not yet been implemented. -Todd On Wednesday, November 10, 2010, Thanh Do wrote: > Or another way to rephrase my quest
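
The distinction Todd describes can be illustrated with plain JDK streams. This is only a minimal sketch, not the BlockReceiver code: the class name FlushVsSync, the method write, and the file name are hypothetical, and it assumes a local file rather than an HDFS block. flush() hands the bytes from the Java buffer to the OS, while FileChannel.force(true) additionally asks the OS to push them to the storage device, which is roughly what fsync() does.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration class; not part of Hadoop.
class FlushVsSync {

    /**
     * Writes payload to file and returns the resulting file size.
     * With syncToDisk == false the data is only handed to the OS (buffer cache).
     * With syncToDisk == true the OS is also asked to write it to the device.
     */
    static long write(Path file, byte[] payload, boolean syncToDisk) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(file.toFile());
             BufferedOutputStream out = new BufferedOutputStream(fos)) {
            out.write(payload);
            // Moves bytes from the Java-side buffer to the OS; no durability guarantee.
            out.flush();
            if (syncToDisk) {
                // Requests that file content and metadata reach the storage device.
                fos.getChannel().force(true);
            }
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("packet", ".bin");
        byte[] payload = "packet data".getBytes(StandardCharsets.UTF_8);
        System.out.println("flush only:    " + write(tmp, payload, false) + " bytes");
        System.out.println("flush + force: " + write(tmp, payload, true) + " bytes");
        Files.delete(tmp);
    }
}
```

Both calls leave a file that other processes on the same machine can read in full; the difference only shows up after a power loss or kernel crash, when data that was flushed but never forced may be missing.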

Re: Why datanode does a flush to disk after receiving a packet

2010-11-10 Thread Thanh Do
Or another way to rephrase my question: do data.flush and checksumOut.flush guarantee that the data is synchronized with the underlying disk, just like fsync()? Thanks Thanh On Wed, Nov 10, 2010 at 10:26 PM, Thanh Do wrote: > Hi all, > > After reading the appenddesign3.pdf in HDFS-256, > and looking at the

Why datanode does a flush to disk after receiving a packet

2010-11-10 Thread Thanh Do
Hi all, After reading the appenddesign3.pdf in HDFS-256, and looking at the BlockReceiver.java code in 0.21.0, I am confused by the following. The document says that: *For each packet, a DataNode in the pipeline has to do 3 things. 1. Stream data a. Receive data from the upstream DataNode o