Hey all,

I am running into a problem where YARN kills containers for exceeding
their memory allocation (which is about 8 GB for the executor plus 6 GB of
overhead), and I noticed that those containers hold tons of
pyspark.daemon processes hogging memory. Here's a snippet from a container
with 97 pyspark.daemon processes. The total RSS usage across all of
them is 1,764,956 pages, i.e. about 6.7 GB assuming 4 KB pages (summed up
as sketched below the dump).

Any ideas what's happening here and how I can get the number of
pyspark.daemon processes back to a more reasonable count?

2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler
(Logging.scala:logInfo(59)) - Container marked as failed:
container_1421692415636_0052_01_000030. Exit status: 143. Diagnostics:
Container [pid=35211,containerID=container_1421692415636_0052_01_000030]
is running beyond physical memory limits. Current usage: 14.9 GB of
14.5 GB physical memory used; 41.3 GB of 72.5 GB virtual memory used.
Killing container.
Dump of the process-tree for container_1421692415636_0052_01_000030 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES)
FULL_CMD_LINE
        |- 54101 36625 36625 35211 (python) 78 1 332730368 16834 python -m
pyspark.daemon
        |- 52140 36625 36625 35211 (python) 58 1 332730368 16837 python -m
pyspark.daemon
        |- 36625 35228 36625 35211 (python) 65 604 331685888 17694 python -m
pyspark.daemon

        [...]


Full output here: https://gist.github.com/skrasser/e3e2ee8dede5ef6b082c
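
For reference, this is roughly how I added up those RSS numbers. It's just
a quick sketch: it assumes the dump lines are unwrapped, saved to a local
file (container_dump.txt is a placeholder name), and that the page size is
4 KB.

PAGE_SIZE = 4096  # bytes per page (assumption, typical on x86_64 Linux)

total_pages = 0
with open("container_dump.txt") as f:
    for line in f:
        fields = line.split()
        # Per the dump header, RSSMEM_USAGE(PAGES) is the 10th
        # whitespace-separated field on each "|-" process line.
        if fields[:1] == ["|-"] and "pyspark.daemon" in line:
            total_pages += int(fields[9])

print(f"{total_pages} pages = {total_pages * PAGE_SIZE / 2**30:.1f} GB")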

Thank you!
-Sven

-- 
http://sites.google.com/site/krasser/?utm_source=sig
