Adarsh,

Yahoo! no longer has its own distribution of Hadoop. It has been merged into the 0.20.2XX line, so 0.20.203 is what Yahoo! is running internally right now, and we are moving towards 0.20.204, which should be out soon.

I am not an expert on Cloudera, so I cannot really map its releases to the Apache releases. Their distro is based on Apache Hadoop with a few bug fixes and maybe a few features, like append, added on top of it, but you need to talk to Cloudera about the exact details.

For the most part they are all very similar. You need to think most about support; there are several companies that can sell you support if you want or need it. You also need to think about features vs. stability. The 0.20.203 release has been tested on a lot of machines by many different groups, but it may be missing some features that are needed in some situations.

--Bobby

On 7/14/11 11:49 PM, "Adarsh Sharma" <adarsh.sha...@orkash.com> wrote:

Hadoop releases are issued time by time. But one more thing related to hadoop usage, There are so many providers that provides the distribution of Hadoop ;

1. Apache Hadoop
2. Cloudera
3. Yahoo etc.

Which distribution is best among them on production usage. I think Cloudera's is best among them.

Best Regards,
Adarsh

Owen O'Malley wrote:
> On Jul 14, 2011, at 4:33 PM, Teruhiko Kurosaka wrote:
>
>> I'm a newbie and I am confused by the Hadoop releases.
>> I thought 0.21.0 is the latest & greatest release that I
>> should be using but I noticed 0.20.203 has been released
>> lately, and 0.21.X is marked "unstable, unsupported".
>>
>> Should I be using 0.20.203?
>>
>
> Yes, I apologize for confusing release numbering, but the best release to use
> is 0.20.203.0. It includes security, job limits, and many other improvements
> over 0.20.2 and 0.21.0. Unfortunately, it doesn't have the new sync support
> so it isn't suitable for using with HBase. Most large clusters use a separate
> version of HDFS for HBase.
>
> -- Owen