IO pipeline optimizations

2011-07-19 Thread Shrinivas Joshi
This blog post on the YDN website, http://developer.yahoo.com/blogs/hadoop/posts/2009/08/the_anatomy_of_hadoop_io_pipel/ , has a detailed discussion of the different steps involved in Hadoop IO operations and the opportunities for optimization. Could someone please comment on the current state of these potential

Re: IO pipeline optimizations

2011-07-19 Thread Todd Lipcon
Hi Shrinivas, There has been some recent work on optimizing checksums. See HDFS-2080, for example. This will help both the write and read code, though we've focused more on reads. There have also been a lot of improvements around random read access - for example, HDFS-941, which