I'm in the process of migrating over our Hadoop setup from MRv1 to MRv2 and have a question about interoperability.

We run our Hadoop clusters in the cloud (AWS) in a transient fashion: we start up clusters when needed, push all output from HDFS to S3, and shut the clusters down when done. We have configurations for starting up different-sized clusters simultaneously to handle different work streams, etc. All works fine.

I have a client machine (the "job controller") at the center of the process. It runs the scripts that launch and shut down the Hadoop clusters, and it uses the installed Hadoop client to submit jobs to those clusters. Again, all works fine.


I want to start migrating our setup from MRv1 to MRv2. But a) I don't necessarily need/want to migrate all of our cluster configurations at once, and b) I need to do some testing on the MRv2 cluster configurations/scripts before I go live with them. So I'd like to be able to launch some clusters as MRv2 and some as MRv1. But given my setup with the central job controller machine, I'm scratching my head about how to accomplish this.

In MRv2, jobs are submitted to a ResourceManager daemon rather than to MRv1's JobTracker (and the ResourceManager listens on a different port). If I leave the Hadoop client on the job controller machine at MRv1, I'm thinking it won't be able to submit jobs to an MRv2 cluster. Similarly, if I upgrade the client to MRv2, I'd think it wouldn't be able to submit jobs to MRv1 clusters.


So my question is: is there any (easy) way for a single machine to be able to submit jobs to both types of clusters? (E.g., run both the MRv1 and MRv2 client packages?)
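To make the "run both client packages" idea concrete, here is a rough, untested sketch of what I'm picturing on the job controller. Everything specific in it is an assumption on my part: the install paths under /opt are hypothetical, the "conf" subdirectory layout is a guess, and the helper names client_env and submit_command are made up for illustration. Whether the two client packages can actually coexist like this is exactly what I'm asking.

```python
import os

# Hypothetical side-by-side install locations (assumption: nothing in our
# current setup uses these paths; adjust to wherever the packages land).
CLIENT_HOMES = {
    "mrv1": "/opt/hadoop-mrv1",
    "mrv2": "/opt/hadoop-mrv2",
}


def client_env(cluster_type, base_env=None):
    """Return a copy of the environment that points at one client install."""
    home = CLIENT_HOMES[cluster_type]
    env = dict(os.environ if base_env is None else base_env)
    env["HADOOP_HOME"] = home
    # Assumption: each install keeps its site config in a "conf" subdirectory.
    env["HADOOP_CONF_DIR"] = os.path.join(home, "conf")
    env["PATH"] = os.path.join(home, "bin") + os.pathsep + env.get("PATH", "")
    return env


def submit_command(cluster_type, jar, main_class, args=()):
    """Build a `hadoop jar` command line using that install's own launcher."""
    hadoop_bin = os.path.join(CLIENT_HOMES[cluster_type], "bin", "hadoop")
    return [hadoop_bin, "jar", jar, main_class, *args]
```

The launch scripts would then pick the cluster type per job and hand both pieces to something like subprocess.run(submit_command("mrv2", jar, cls), env=client_env("mrv2")), so the two client versions never share a PATH or a config directory.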


One obvious workaround would be to start up a second job controller machine and use that for all the MRv2 testing before I cut the main one over to MRv2. But that's not an ideal solution for a number of reasons: cost, the time involved in setting up a duplicate environment, the difficulty of splitting production work between two machines while we transition, etc.


Any suggestions here greatly appreciated!

Thanks,

DR
