Questions with regards to Yarn/Hadoop

Omid Alipourfard Mon, 24 Aug 2015 18:54:07 -0700

Hi,

I am running the Terasort benchmark (10 GB, 25 reducers, 50 mappers) that
ships with Hadoop 2.7.1.  I am seeing some unexpected behavior from Yarn,
which I am hoping someone can shed some light on:


I have a cluster of three machines, each with 2 cores and 3.75 GB of RAM.
When I run the Terasort job, one of the machines sits idle, i.e., it shows
no substantial disk or CPU usage.  All three machines are capable of
executing jobs, and one of the machines is both a name node and a data
node.

On the other hand, running the same job on a cluster of three machines with
2 cores and 8 GB of RAM (per machine) utilizes all the machines.

Both setups use the same Hadoop configuration files; in both of them,
mapper tasks have 1 GB and reducer tasks have 2 GB of memory.
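
To make the memory numbers concrete, here is a rough sketch of how many
containers of each size fit on a node.  The container sizes are the ones
stated above; the per-node memory values (3072 MB and 8192 MB) and the
helper name containers_per_node are assumptions for illustration only and
are not taken from the actual configuration files.

```python
def containers_per_node(node_memory_mb, container_mb):
    """How many containers of container_mb fit in node_memory_mb."""
    return node_memory_mb // container_mb


MAP_MB = 1024     # 1 GB per mapper, as stated in this post
REDUCE_MB = 2048  # 2 GB per reducer, as stated in this post

# ASSUMED amounts of memory a node offers to Yarn; the real values depend
# on the configuration and are not known from this post.
for node_mb in (3072, 8192):
    print(node_mb,
          containers_per_node(node_mb, MAP_MB),
          containers_per_node(node_mb, REDUCE_MB))
```

Under these assumed values, the smaller node would hold at most three
mappers or one reducer at a time, while the larger one would hold eight
mappers or four reducers.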

I am guessing Yarn is not utilizing the machines correctly -- maybe because
of the available amount of RAM, but I am not sure how to verify this.
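
One possible way to check this, sketched here under assumptions, is to ask
the ResourceManager what each node has registered and how much of it is in
use while the job runs.  The sketch reads the ResourceManager REST endpoint
/ws/v1/cluster/nodes; the host name "resourcemanager" and port 8088 are
placeholders, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def summarize_nodes(payload):
    """Return (host, containers, used MB, available MB) for each node."""
    # The ResourceManager returns {"nodes": null} when no node is registered.
    nodes = (payload.get("nodes") or {}).get("node") or []
    return [
        (n["nodeHostName"], n["numContainers"],
         n["usedMemoryMB"], n["availMemoryMB"])
        for n in nodes
    ]


def fetch_nodes(rm_url):
    """Fetch the node report from the ResourceManager REST API."""
    with urlopen(rm_url + "/ws/v1/cluster/nodes") as resp:
        return json.load(resp)


# Example call (placeholder address, adjust to the actual cluster):
#   for row in summarize_nodes(fetch_nodes("http://resourcemanager:8088")):
#       print(row)
```

If the idle machine reports zero containers for the whole run while the
other two are full, that would at least confirm that no tasks are being
scheduled on it.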

Any thoughts on what the problem might be, or how to verify it, would be
appreciated.
Thanks,
Omid

P.S. I can also post any of the logs or configuration files.
