Re: Should I upgrade from 0.18.3 to the latest 0.20.1?

Edmund Kohlwey Tue, 10 Nov 2009 17:48:17 -0800

The new API in 0.20.x is likely not what you'll see in the final Hadoop 1.0 release, which I've heard some people forecast within the next 18 months or so (we'll see). There will likely be a 0.21.x series, and then the final release.

That having been said, it's much more similar to what you'll see in the final release. Depending on how complex your jobs are, you may see minor or no changes in the final release, or you may see dramatic ones. I think (someone correct me if I'm wrong) the basic map and reduce abstract classes are just about set in stone, but if you're using other stuff like file formats, custom splits, etc., then you may see a lot of differences. I've also noticed a lot of changes in how the job and task trackers work, even in the current trunk. There's also some interesting work being done by Yahoo on pipelining MR jobs, which will not be in any 0.20.x release.

The other thing about 0.20.x is that a lot of the old API (like joins, etc.) has not been updated, so your application may end up as a patchwork of the two APIs.

Are there any portions of the new API which are particularly attractive to you? That might help people suggest whether or not you should switch to satisfy that need. If you don't have any needs particular to the 0.20.x API then there's probably little reason to switch.

If you do upgrade to 0.20.1, make sure to get the Cloudera or Yahoo distributions. The current "stable" (0.20.1) release on the Apache page is very buggy.


On 11/10/09 3:30 PM, Mark Kerzner wrote:

Hi,

I've been working on my project for about a year, and I decided to upgrade
from 0.18.3 (which was stable and already old even back then). I have
started, but I see that many classes have changed, many are deprecated, and
I need to re-write some code. Is it worth it? What are the advantages of
doing this? Other areas of concern are:

    - Will Amazon EMR work with the latest Hadoop?
    - What about the Cloudera distribution or the Yahoo distribution?

Thank you,
Mark
