A dump of an 8 GB heap shows about 18 million instances of
org.apache.hadoop.yarn.proto.YarnProtos$ApplicationIdProto.

The counts are similar for ApplicationAttempt and ContainerId. All of them
seem to be referenced via
org.apache.hadoop.yarn.proto.YarnProtos$ContainerStatusProto, whose count
is also about 18 million.

On further debugging, looking at the CapacityScheduler code:

It seems to add duplicate UpdatedContainerInfo entries for the completed
containers. The same dump shows about 0.5 million UpdatedContainerInfo
objects.

This issue only surfaces when the scheduler thread cannot drain the
UpdatedContainerInfo objects fast enough, which happens only in a big
cluster.
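
To illustrate the suspected mechanism, here is a minimal sketch of a queue
that is filled on every heartbeat but drained only occasionally. It is not
the actual CapacityScheduler or RMNodeImpl code: the Update class, the
peakBacklog method and the drainEvery parameter are hypothetical names used
only for this illustration.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

public class DrainBacklogSketch {
    // Hypothetical stand-in for UpdatedContainerInfo; not the Hadoop class.
    static final class Update {
        final int completedContainers;
        Update(int completedContainers) {
            this.completedContainers = completedContainers;
        }
    }

    // Each simulated heartbeat enqueues one update. The consumer empties the
    // queue only once every `drainEvery` heartbeats. Returns the largest
    // number of updates that were queued at the same time.
    static int peakBacklog(int heartbeats, int drainEvery) {
        ConcurrentLinkedQueue<Update> queue = new ConcurrentLinkedQueue<>();
        int size = 0;
        int peak = 0;
        for (int hb = 1; hb <= heartbeats; hb++) {
            queue.add(new Update(1));
            size++;
            peak = Math.max(peak, size);
            if (hb % drainEvery == 0) {
                // Slow consumer finally catches up and drains everything.
                while (queue.poll() != null) {
                    size--;
                }
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        // Consumer keeps up with every heartbeat.
        System.out.println(peakBacklog(1000, 1));
        // Consumer runs only once per 100 heartbeats.
        System.out.println(peakBacklog(1000, 100));
    }
}
```

In this toy model the backlog grows in proportion to how far the consumer
lags behind, and every queued entry keeps its status objects reachable
until it is drained.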

Has anyone noticed the same? We are running Hadoop 2.6.0.

Sharad
