Hi Bertrand,
I guess you configured two racks in total: one IDC is one rack, and the other
IDC is another rack.
So if you don't want re-replication to start while one IDC is down, you have to
change the replica placement policy:
if the minimum number of block replicas is already on one rack, then don't do anything. (here
Hi,
Last week we had a discussion at work regarding setting up our new Hadoop
cluster(s).
One of the things that has changed is that the Hadoop stack is becoming more
important, so we want it to be more highly available.
One of the points we talked about was setting up the cluster in such a way
that the
Hi Niels,
It depends on the number of replicas and the Hadoop rack configuration
(level).
It's possible to have replicas in both datacenters.
What rack configuration do you plan to use? You can implement your
own and set it using the topology.node.switch.mapping.impl
property.
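
To make the rack mapping idea concrete, here is a minimal sketch of the mapping logic only. It is not a drop-in for topology.node.switch.mapping.impl, which expects a Java class. As far as I know, the default mapping implementation can instead call an external script (configured through topology.script.file.name) that receives hostnames or IPs as arguments and prints one rack path per argument, and this sketch follows that shape. The IP prefixes and the rack names /dc1/rack1 and /dc2/rack1 are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of a rack topology script: maps each host to a datacenter rack path."""
import sys

# Hypothetical mapping from IP prefix to rack path (one rack per datacenter).
RACK_BY_PREFIX = {
    "10.1.": "/dc1/rack1",
    "10.2.": "/dc2/rack1",
}

# Fallback for hosts that match no known prefix.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host):
    """Return the rack path for a single host name or IP address."""
    for prefix, rack in RACK_BY_PREFIX.items():
        if host.startswith(prefix):
            return rack
    return DEFAULT_RACK


def main(argv):
    # Print one rack path per argument, separated by spaces.
    print(" ".join(resolve_rack(host) for host in argv))


if __name__ == "__main__":
    main(sys.argv[1:])
```

With a mapping like this, each datacenter shows up as a distinct rack, so a rack-aware placement policy can spread replicas across both.
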
According to your own analysis, you wouldn't be more available, even though
that was your aim.
Did you consider having two separate clusters? One per datacenter, with an
automatic copy of the data?
I understand that load balancing of work and data would not be easy but it
seems to me a simple strategy