Hi Eric,

> From: Eric Baldeschwieler <eri...@hortonworks.com>
> This would seem like a perfect use case for YARN. Is that what you are
> thinking? You could implement this as a new framework rather than
> trying to incrementally change map-reduce.

Currently, we have done this with Hadoop 0.20.2, which took major code surgery. This is the released version. We (Chris Douglas and myself) have gotten some of the components to work with YARN (that is still a work in progress...). Having help from the community here would be great :).

Long answer to your question... yes, getting this onto YARN simplifies lots of things.

Sriram

On May 8, 2012, at 10:32 AM, Sriram Rao <srirams...@gmail.com> wrote:

> Hi,
>
> I'd like to announce the release of a new open source project, Sailfish.
>
> http://code.google.com/p/sailfish/
>
> Sailfish tries to improve Hadoop performance, particularly for large jobs
> which process TBs of data and run for hours. In building Sailfish, we
> modify how map-output is handled and transported from map->reduce.
>
> The project pages provide more information about the project.
>
> We are looking for collaborators who can help get some of the ideas into
> Apache Hadoop. A possible step forward could be to make the "shuffle" phase of
> Hadoop pluggable.
>
> If you are interested in working with us, please get in touch with me.
>
> Sriram