Thanks, Mahesh.
So far I have not been able to run the whole job within a limited time period,
so I am looking at optimizations and better resource utilization. Maybe I can
try tweaking the input split size, along the lines of the sketch below, to see
if it helps.
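For the record, a minimal sketch of that tweak, assuming the job uses
FileInputFormat (the class name and the 256 MB figure are only illustrative,
not recommendations):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class SplitSizeTweak {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "split-size-demo");
            // Raise the minimum split size so each mapper gets more data
            // and containers live longer. Value is in bytes; 256 MB is
            // only an example figure.
            FileInputFormat.setMinInputSplitSize(job, 256L * 1024 * 1024);
            // The same knob as a configuration property:
            //   mapreduce.input.fileinputformat.split.minsize=268435456
        }
    }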
Thanks for your help; it explains the behaviour.
--
Madhav Sharan
On Tue, Aug 9, 2016 at 1:28 P
Hi Madhav,
The behaviour sounds normal to me.
If the block size is 128 MB, there could be ~24 mappers (i.e., containers
used).
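To spell out the arithmetic (the exact input size was not posted, so the
~3 GB figure below is only back-of-the-envelope working from the mapper
count):

    public class MapperCountEstimate {
        public static void main(String[] args) {
            long splitSize = 128L * 1024 * 1024;       // 128 MB block/split size
            long inputSize = 3L * 1024 * 1024 * 1024;  // ~3 GB input, an assumption
            // One mapper (container) per input split, rounded up:
            long mappers = (inputSize + splitSize - 1) / splitSize;
            System.out.println(mappers);               // prints 24
        }
    }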
You cannot use the entire cluster, as the blocks may sit only on the nodes
currently being used.
You should not try to use the entire cluster's resources, for the following
reason: The
Hi Sunil, thanks a lot for replying.
For one job run, yes, some nodes take no load at all. But if I rerun, no, it
is not always the same nodes.
One map task takes ~3 seconds to run, and so far I have not been able to run
my whole job on a bigger data set, so I can't say whether the containers are
short-lived.
I
Hi Madhav,
Could you share some more information here? When you say a few nodes are not
utilized, is it always the same nodes?
Also, how long does each of these containers run on average? Please make sure
you have provided a large enough split size to ensure the containers are not
short-lived.
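As a rough illustration of why split size drives container lifetime (the
~3-second map time is Madhav's figure from his reply above; the 2-second
launch overhead is an assumed number, not a measurement):

    public class ContainerOverheadEstimate {
        public static void main(String[] args) {
            double workSeconds    = 3.0;  // ~3 s of map work per 128 MB split
            double launchOverhead = 2.0;  // assumed container start-up cost (s)
            // Fraction of each container's lifetime spent starting up
            // rather than mapping:
            double wasted = launchOverhead / (workSeconds + launchOverhead);
            System.out.printf("overhead fraction: %.0f%%%n", wasted * 100); // 40%
            // With 512 MB splits the same container does ~12 s of work,
            // and the identical overhead drops to ~14% of its lifetime.
        }
    }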
Hi Hadoop users,
I am running an M/R job with an input file of 23 million records. I can see
that not all of our nodes are getting used.
What can I change to utilize all nodes?
Containers   Mem Used   Mem Avail   Vcores Used   Vcores Avail
8            11.25 GB   0 B         8             0
0            0 B        11.25 GB    0             8
0            0 B        11.25 GB    0             8
8            11.25 GB   0 B
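In case it helps with reproducing this, here is a sketch of pulling the same
per-node numbers programmatically (assuming the YarnClient API from
hadoop-yarn-client; the class name and output format are mine):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.NodeReport;
    import org.apache.hadoop.yarn.api.records.NodeState;
    import org.apache.hadoop.yarn.client.api.YarnClient;

    public class NodeUsageReport {
        public static void main(String[] args) throws Exception {
            YarnClient yarn = YarnClient.createYarnClient();
            yarn.init(new Configuration());
            yarn.start();
            // One line per running NodeManager: container count plus
            // memory used/total, the same figures as the table above.
            for (NodeReport n : yarn.getNodeReports(NodeState.RUNNING)) {
                System.out.printf("%s containers=%d usedMB=%d totalMB=%d%n",
                        n.getNodeId(), n.getNumContainers(),
                        n.getUsed() == null ? 0 : n.getUsed().getMemory(),
                        n.getCapability().getMemory());
            }
            yarn.stop();
        }
    }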