Re: IO pipeline optimizations

Todd Lipcon Tue, 19 Jul 2011 13:31:53 -0700

Hi Shrinivas,

There has been some work going on recently around optimizing checksums. See
HDFS-2080 for example. This will help both the write and read code, though
we've focused more on read.

There have also been a lot of improvements around random read access - for
example HDFS-941, which improves random read performance by more than 2x.

I'm planning on writing a blog post in the next couple of weeks about some
of this work.

-Todd

On Tue, Jul 19, 2011 at 1:26 PM, Shrinivas Joshi <jshrini...@gmail.com> wrote:

> This blog post on YDN website
>
> http://developer.yahoo.com/blogs/hadoop/posts/2009/08/the_anatomy_of_hadoop_io_pipel/
>
> has detailed discussion on different steps involved in Hadoop IO operations
> and opportunities for optimizations. Could someone please comment on current
> state of these potential optimizations? Are some of these expected to be
> addressed in "next gen MR" release?
>
> Thanks,
> -Shrinivas
>

-- 
Todd Lipcon
Software Engineer, Cloudera
