I'm happy to report that I converted all the recommender-related jobs to Hadoop 0.20.x and sorted out the issue I had before; I simply reworked the jobs so that no single job needs two mapper inputs.
I don't think I broke anything, but I can't be 100% sure since the tests aren't exhaustive. I'd encourage everyone to chuck in their upgrades as soon as is realistic too. Then I can, for instance, take on updating anything left.

On Tue, May 25, 2010 at 3:17 PM, Sean Owen <[email protected]> wrote:
> Just to state what seems to be in progress -- looks like we are agreed
> we should move to the new Hadoop APIs. Some code is already using it;
> most of the part that isn't is the recommender which was due to some
> strange bugs deep in Hadoop in prior versions. It's time to try it
> again. I'm going to work on porting everything forward now.
>
> The other argument against this was that Amazon EMR runs 0.18.3. I
> think Jeff already established that what we're doing has already
> broken compatibility with 0.18.x. We can point those users to release
> 0.3 and say they can try to back-port that code to 0.18.x
> compatibility. But 0.4 onwards is for 0.20.x+ and you can run your own
> cluster using AWS and hopefully EMR updates soon.
>
> Sean
