Hey there,
I've set up rack awareness on my hadoop cluster with replication 3. I have 2
racks and each contains 50% of the nodes.
I can see that the blocks are spread across the 2 racks; the problem is that all
nodes in one rack are storing 2 replicas and the nodes of the other rack
just one.
Jobs run on the whole cluster. After rebalancing, everything is properly
allocated. Then I start running jobs using all the slots of the 2 racks, and
the problem starts happening again.
Maybe I'm missing something. When using rack awareness, do you have to
tell the jobs to run in slots from both racks?
When you rebalance, the block is already fully written, so writer locality does
not have to be taken into account (there is no writer anymore); hence the
balancer can spread replicas evenly across the racks. That's why job asymmetry
was the easy guess. What's your Hadoop version, by the way? I remember a bug
around rack awareness.
I'm on cdh3u4 (0.20.2); going to try to read a bit about this bug.
--
View this message in context:
http://lucene.472066.n3.nabble.com/rack-awareness-unexpected-behaviour-tp4086029p4086049.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Hi Rohit,
Did you succeed in running an R script from an Oozie action?
If so, can you share your action configuration?
I am trying to figure out how to run an R script from Oozie.
I'm not aware of a bug in 0.20.2 that would not honor rack
awareness, but have you done the two checks below as well?
1. Ensuring the JT has the same rack awareness script and configuration,
so it can use it for scheduling, and
2. Checking whether the map and reduce tasks are being evenly spread.
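For the replica-spread part of these checks, one way is to look at the rack annotations that `hadoop fsck <path> -files -blocks -racks` prints for each block and tally replicas per rack. A rough sketch in Python, with hypothetical sample lines (the exact fsck output format can vary between versions, so the parsing here is an assumption to verify against your cluster's output):

```python
import re
from collections import Counter

# Hypothetical lines in the style of `hadoop fsck /path -files -blocks -racks`;
# the bracketed list shows the rack/datanode location of each replica.
fsck_lines = [
    "0. blk_001 len=67108864 repl=3 [/rack1/10.0.0.1:50010, /rack1/10.0.0.2:50010, /rack2/10.0.1.1:50010]",
    "1. blk_002 len=67108864 repl=3 [/rack1/10.0.0.3:50010, /rack2/10.0.1.2:50010, /rack1/10.0.0.1:50010]",
]

def replicas_per_rack(lines):
    """Count how many block replicas each rack holds."""
    counts = Counter()
    for line in lines:
        # Match "/<rack>/<ip>:<port>" and keep only the rack component.
        for rack in re.findall(r"(/[^/\s,\]]+)/\d+\.\d+\.\d+\.\d+:\d+", line):
            counts[rack] += 1
    return counts

print(replicas_per_rack(fsck_lines))  # here /rack1 holds 4 replicas, /rack2 holds 2
```

If one rack consistently accumulates roughly twice the replicas of the other, that points at writer locality rather than a broken topology script.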
Rack awareness is an artificial concept: you can define where a node is,
regardless of its real position in the rack.
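That artificial mapping is usually just a small executable pointed to by `topology.script.file.name` (the 0.20.x property name; later versions use `net.topology.script.file.name`): Hadoop passes it datanode IPs/hostnames as arguments and reads one rack path per argument from stdout. A minimal sketch, with a made-up host-to-rack table:

```python
#!/usr/bin/env python
# Minimal sketch of a Hadoop topology script. Hadoop invokes it with one or
# more datanode IPs/hostnames as arguments and expects one rack path per
# argument on stdout. The host-to-rack table below is hypothetical.
import sys

RACK_MAP = {
    "10.0.0.1": "/rack1",
    "10.0.0.2": "/rack1",
    "10.0.1.1": "/rack2",
    "10.0.1.2": "/rack2",
}
DEFAULT_RACK = "/default-rack"

def resolve(hosts):
    """Map each host/IP to a rack path; unknown hosts get the default rack."""
    return [RACK_MAP.get(h, DEFAULT_RACK) for h in hosts]

if __name__ == "__main__":
    print(" ".join(resolve(sys.argv[1:])))
```

The same script (or an identical copy) should be readable by both the NameNode and the JobTracker, per check 1 above.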
Going from memory, and it's probably been changed in later versions of the
code...
Isn't the replication: a copy on node 1, a copy on the same rack, and the third
copy on a different rack?
For 3 replicas, the placement sequence is: 1st replica on the writer's local
node, 2nd on a node in a rack remote from the 1st replica, 3rd on another node
in the same rack as the 2nd. There are special cases, such as the disk being
full on the 1st node or no node being available in the 2nd replica's rack, and
Hadoop already takes care of these well.
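With only 2 racks, this sequence makes the asymmetry in the original question almost inevitable: every block keeps 1 replica in the writer's rack and places 2 in the other rack, so if most writers run in one rack, the opposite rack accumulates the double replicas. A toy sketch of the rack-level choice (my own simplification, not the actual BlockPlacementPolicy code, which picks concrete nodes and handles the special cases above):

```python
import random

def choose_target_racks(writer_rack, racks, rng):
    """Rack-level sketch of default 3-replica placement: 1st replica on the
    writer's node, 2nd on a node in a different rack, 3rd on another node
    in the same rack as the 2nd replica."""
    first = writer_rack
    second = rng.choice([r for r in racks if r != first])
    third = second  # same rack as the 2nd replica, different node
    return [first, second, third]

# With 2 racks there is only one possible outcome per writer rack:
rng = random.Random(0)
print(choose_target_racks("/rack1", ["/rack1", "/rack2"], rng))
# -> ['/rack1', '/rack2', '/rack2']
```

So the observed 2-vs-1 split per rack is the policy working as designed; only the rebalancer (or a more even spread of writers) evens it out afterwards.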