I'm in the process of migrating our Hadoop setup from MRv1 to MRv2
and have a question about interoperability.
We run our Hadoop clusters in the cloud (AWS) in a transient fashion.
I.e., start up clusters when needed, push all output from HDFS to S3,
and shut the clusters down when done. We have configurations for
starting up different-sized clusters simultaneously to handle
different work streams, etc. All works fine.
I have a client machine (the "job controller") at the center of the
process, which runs the scripts to launch & shut down the Hadoop
clusters and uses the installed Hadoop client to submit jobs to
the clusters. Again, all works fine.
I want to start migrating over our setup from MRv1 to MRv2. But a) I
don't necessarily need/want to migrate over all of our cluster
configurations all at once, and b) I need to do some testing on the MRv2
cluster configurations/scripts before I go live with it. So, I'd like
to be able to launch some clusters as MRv2, and some as MRv1. But given
my setup with the central job controller machine, I'm scratching my
head about how to accomplish this.
In MRv2, jobs are submitted to a ResourceManager daemon rather than to
MRv1's JobTracker (and the ResourceManager listens on a different
port). If I leave the
version of the Hadoop client on the job controller machine as MRv1, I'm
thinking it won't be able to submit jobs to an MRv2 cluster. Similarly,
if I upgrade the client to MRv2, then I'd think it wouldn't be able to
submit jobs to MRv1 clusters.
So my question is: is there any (easy) way for a single machine to be
able to submit jobs to both types of clusters? (E.g., run both the MRv1
and MRv2 client packages?)
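
To make the idea concrete, here is a rough sketch of the kind of wrapper I
have in mind for the job controller, assuming the two client packages could
be installed side by side (which is exactly what I'm unsure about). The
install paths, config directory layout, and the build_submit_command name
are all hypothetical; the only real pieces are the hadoop launcher's
--config option and its jar subcommand.

```python
import posixpath

# Hypothetical side-by-side client installs and per-cluster config roots.
# None of these paths come from a real package layout; they are assumptions.
CLIENTS = {
    "mrv1": {
        "hadoop_bin": "/opt/hadoop-mrv1/bin/hadoop",
        "conf_root": "/etc/hadoop-clusters/mrv1",
    },
    "mrv2": {
        "hadoop_bin": "/opt/hadoop-mrv2/bin/hadoop",
        "conf_root": "/etc/hadoop-clusters/mrv2",
    },
}


def build_submit_command(cluster_type, cluster_name, jar, args=()):
    """Return the argv list for submitting a job to the given cluster.

    cluster_type selects which client install to use ("mrv1" or "mrv2");
    cluster_name selects the config directory holding that cluster's
    *-site.xml files (JobTracker or ResourceManager address, etc.).
    """
    client = CLIENTS[cluster_type]  # raises KeyError for an unknown type
    conf_dir = posixpath.join(client["conf_root"], cluster_name)
    # "hadoop --config <dir> jar <jar> [args...]" points the chosen client
    # at the chosen cluster's configuration.
    return [client["hadoop_bin"], "--config", conf_dir, "jar", jar, *args]
```

The returned list could then be handed to subprocess.run by the existing
launch scripts, so the only per-cluster decision would be the cluster type.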
One obvious workaround here would be to start up a 2nd job controller
machine, and use that for all the MRv2 testing, before I cut over the
main one to MRv2. But that's not an ideal solution for a number of
reasons. (Cost, time involved in setting up a duplicate environment,
difficulties involved in splitting production work between 2 machines
while we transition, etc.)
Any suggestions here greatly appreciated!
Thanks,
DR