Hi Jun Ping Du!
Do you use federation?
Our cluster works in *_federation_* mode. Today we launched tests again.
The input file was 600 MB and the block size was 6 MB, so there were 100 map
tasks.
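The task count above is simple arithmetic. A minimal sketch, assuming one input split per HDFS block (the usual case when the split size equals the block size); the function name `num_map_tasks` is hypothetical and only for illustration:

```python
def num_map_tasks(file_size_mb: int, block_size_mb: int) -> int:
    """Number of map tasks, assuming one split per block (an assumption)."""
    if block_size_mb <= 0:
        raise ValueError("block size must be positive")
    # Ceiling division: a partially filled last block still needs its own task.
    return -(-file_size_mb // block_size_mb)


if __name__ == "__main__":
    # Values from this test run: 600 MB input, 6 MB blocks.
    print(num_map_tasks(600, 6))
```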
And we saw the same result: all map tasks were executed within one
datacenter.
The HDFS and YARN web interfaces showed different information: the HDFS
page reported 6 nodes in the cluster, while the YARN page listed only
3 nodes.
Screenshots and configuration files are attached.
What settings do we need to change to execute tasks on all nodes?
On 07/15/2013 05:52 PM, Jun Ping Du wrote:
Hi Костарев,
I tried to reproduce your case on my 5-node setup (with 2 nodes
in dc1/rack1, 1 node in dc1/rack2 and 2 nodes in dc2/rack2) but didn't
see anything unusual. In my test, even with 3 replicas, I saw a job with
150 map tasks distributed across all nodes, regardless of datacenter.
Can you try again with a job with more map tasks? Scheduling is pretty
random if your job only has 10 map tasks. In your case,
it seems b1, b3 and b2 each take 3-4 maps in one heartbeat, which is a
pretty normal case. Let me know the distribution and version you are
using if it still does not work with more tasks.
btw, you can find the history log for each job on the application page.
Isn't it?
Thanks,
Junping
--
Consultant, 1st category
Костарев А.Ф.
a1.node.nevod.ru
b1.node.nevod.ru
gate.psu.ru
a1.node.nevod.ru
a2.node.nevod.ru
b1.node.nevod.ru
b2.node.nevod.ru
b3.node.nevod.ru
gate.psu.ru
a1.node.nevod.ru /datacenter1/rack1
195.222.150.66 /datacenter1/rack1
a2.node.nevod.ru /datacenter1/rack1
195.222.150.68 /datacenter1/rack1
b1.node.nevod.ru /datacenter2/rack1
195.222.150.34 /datacenter2/rack2
b2.node.nevod.ru /datacenter2/rack2
195.222.150.35 /datacenter2/rack1
b3.node.nevod.ru /datacenter2/rack1
195.222.150.36 /datacenter2/rack1
gate.psu.ru /datacenter3/rack1
212.192.67.170 /datacenter3/rack1
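
For reference, a mapping like the one above is typically served to Hadoop by a topology script (the one named in `net.topology.script.file.name`), which receives hostnames or IPs as arguments and prints one rack path per argument. The following is only a minimal sketch under that assumption, not the script used on this cluster; the entries are copied as they appear in the table, and the names `TOPOLOGY` and `resolve` are hypothetical:

```python
#!/usr/bin/env python3
import sys

# Entries copied verbatim from the mapping table above.
TOPOLOGY = {
    "a1.node.nevod.ru": "/datacenter1/rack1",
    "195.222.150.66": "/datacenter1/rack1",
    "a2.node.nevod.ru": "/datacenter1/rack1",
    "195.222.150.68": "/datacenter1/rack1",
    "b1.node.nevod.ru": "/datacenter2/rack1",
    "195.222.150.34": "/datacenter2/rack2",
    "b2.node.nevod.ru": "/datacenter2/rack2",
    "195.222.150.35": "/datacenter2/rack1",
    "b3.node.nevod.ru": "/datacenter2/rack1",
    "195.222.150.36": "/datacenter2/rack1",
    "gate.psu.ru": "/datacenter3/rack1",
    "212.192.67.170": "/datacenter3/rack1",
}

# Fallback for names missing from the table (Hadoop's conventional default).
DEFAULT_RACK = "/default-rack"


def resolve(names):
    """Return one rack path per hostname or IP, in the order given."""
    return [TOPOLOGY.get(name, DEFAULT_RACK) for name in names]


if __name__ == "__main__":
    # Hadoop may pass several names in a single invocation.
    print(" ".join(resolve(sys.argv[1:])))
```

Running it by hand with a hostname and an IP as arguments is a quick way to see which datacenter and rack each node would be assigned to.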