On Jan 30, 2012, at 12:02 PM, Jesse Yates wrote:

> The large blocks issue is going away soon/already with append support in 
> HDFS. You are still going to be hurt if you have other things IOing on the 
> node as you still need to spin disk, but it won't be as terrible as it could 
> be.
> 
> The big problem is in the fact that writing replicas in HDFS is done in a 
> pipeline, rather than in parallel. There is a ticket to change this 
> (HDFS-1783), but no movement on it since last summer.

Ugh, why would they change this? Pipelining maximizes bandwidth usage, since 
the client only sends each block once. It'd be cool if the log stream could be 
configured to return after the write has reached one, two, or more nodes though.
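
To put rough numbers on the bandwidth point, here is a minimal sketch under a simplified model (no ack traffic, no protocol overhead). The function names and the latency figures are hypothetical and are not part of any HDFS API. It compares how many bytes the client has to push for a pipelined versus a parallel write, and how long a pipelined write would take to reach the first N nodes if the stream could return early.

```python
def client_bytes_sent(payload_bytes, replicas, pipelined):
    """Bytes the client itself must send to store `replicas` copies.

    Pipelined: the client sends one copy to the first node, and each node
    forwards it to the next. Parallel: the client sends every copy itself.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return payload_bytes if pipelined else payload_bytes * replicas


def time_until_replicated(hop_latencies_ms, min_replicas):
    """Time for a pipelined write to reach the first `min_replicas` nodes.

    Assumption: each hop adds its latency in sequence, and the ack return
    path is ignored. This is a hypothetical "return after N nodes" knob.
    """
    if not 1 <= min_replicas <= len(hop_latencies_ms):
        raise ValueError("min_replicas must be between 1 and the hop count")
    return sum(hop_latencies_ms[:min_replicas])


if __name__ == "__main__":
    block = 64 * 1024 * 1024  # example payload size in bytes (assumed)
    print(client_bytes_sent(block, 3, pipelined=True))
    print(client_bytes_sent(block, 3, pipelined=False))
    # Made-up per-hop latencies, in milliseconds
    print(time_until_replicated([2, 3, 4], min_replicas=2))
```

Under this model the parallel write costs the client three times the uplink bandwidth for three replicas, while returning after fewer nodes only shortens the wait and leaves the bytes sent unchanged.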

> Just my two cents, but sticking with the current logging style makes the 
> most sense, though maybe making it a really distinct interface so we can swap 
> out for an HDFS implementation when it's ready and people prefer.
> 
> - Jesse Yates
> 
> Sent from my iPhone.
