I'm contemplating an upcoming ZooKeeper rollout and was wondering what
the ZooKeeper brain trust here thinks about a deployment question:
Is it generally considered bad practice to deploy ZooKeeper on our
existing HDFS/MR nodes? Or is it better to run ZooKeeper instances on
their own dedicated nodes?
On the one hand, we're not going to be making heavy-duty use of
ZooKeeper, so it might be sufficient for the ZooKeeper nodes to share box
resources with HDFS & MR. On the other hand, I don't want
ZooKeeper to become unavailable if the nodes are running a
resource-intensive job that's hogging CPU or network.
What's generally considered best practice for ZooKeeper?
Thanks,
DR