Hi all,

We have a Flink 1.6 streaming application running on Amazon EMR, with a
YARN session configured with 20 GB for the Task Manager, 2 GB for the Job
Manager, and 4 slots (matching the number of vCPUs), in detached mode. Each
Core Node has 4 vCores, 32 GB memory, and 32 GB disk, and each Task Node has
4 vCores, 8 GB memory, and 32 GB disk. We have auto-scaling for Core Nodes
based on HDFS Utilization and Capacity Remaining GB, as well as auto-scaling
for the Task Nodes based on YARN Available Memory and the number of Pending
Containers.
We've got Log Aggregation turned on as well. This runs well under normal
load for about a week, whereupon YARN can no longer fulfill the
resource requests from Flink, causing container requests to build up. Even
after the cluster scales up, the container requests don't seem to be fulfilled. I've
seen that it seems to start. Does anyone have a good guide to setting up a
streaming application on EMR with YARN?
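
As a rough sanity check on the sizing above, here is a minimal Python sketch
that computes an upper bound on how many Task Manager and Job Manager
containers each node type could host. It assumes YARN can allocate at most a
node's full physical memory (in practice the NodeManager is given less), and
the function and variable names are hypothetical, chosen only for
illustration.

```python
# Sizes in GB, taken from the setup described in this message.
TASK_MANAGER_GB = 20
JOB_MANAGER_GB = 2

# Physical memory per node type. Assumption: YARN can never allocate more
# than this; the real NodeManager limit on EMR is lower.
NODE_MEMORY_GB = {"core": 32, "task": 8}


def containers_that_fit(node_gb: int, container_gb: int) -> int:
    """Upper bound on how many containers of a given size fit on one node."""
    return node_gb // container_gb


def fit_report() -> dict:
    """Map each node type to the max number of containers it could host."""
    return {
        node: {
            "task_manager": containers_that_fit(mem, TASK_MANAGER_GB),
            "job_manager": containers_that_fit(mem, JOB_MANAGER_GB),
        }
        for node, mem in NODE_MEMORY_GB.items()
    }


if __name__ == "__main__":
    for node, fits in fit_report().items():
        print(node, fits)
```

With these numbers, an 8 GB Task Node cannot host a 20 GB Task Manager
container under any configuration, so adding Task Nodes alone would not
satisfy pending Task Manager requests. Whether that is the cause of the
behaviour described here is not confirmed.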

Thank you,
Austin Cawley-Edwards
