Hello,
I have a question about how LinuxResourceCalculatorPlugin calculates the
memory consumed by a process tree (it is calculated via the
ProcfsBasedProcessTree class). When we enable disk caching in Apache
Spark jobs running on a YARN cluster, the NodeManager starts killing
containers while they read the cached data, with "Container is
running beyond memory limits ...". The reason is that even when we enable
parsing of the smaps file
(yarn.nodemanager.container-monitor.procfs-tree.smaps-based-rss.enabled),
ProcfsBasedProcessTree counts mmapped read-only pages as consumed
by the process tree, while Spark uses FileChannel.map(MapMode.READ_ONLY)
to read the cached data. The JVM then appears to consume *a lot* more
memory than the configured heap size (and this cannot really be
controlled), but in my opinion this memory is not actually consumed by the
process: the pages are file-backed and clean, so the kernel can reclaim
them if needed. My question is: is there an explicit reason
why "Private_Clean" pages are counted as consumed by the process tree? I
patched ProcfsBasedProcessTree not to count them, but I don't
know if this is the "correct" solution.
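For illustration, here is a minimal, self-contained sketch of the kind of
read-only mapping Spark performs when reading a disk-cached block (the file
name, contents, and helper methods are made up for the example):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapReadOnlyDemo {

    // Map a file READ_ONLY, touch its pages, and return a checksum.
    // The touched pages are file-backed and clean, so the kernel can
    // evict and re-read them under memory pressure -- yet once touched
    // they show up in /proc/<pid>/smaps (e.g. as Private_Clean) and
    // are counted toward RSS by ProcfsBasedProcessTree.
    static int mapAndSum(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            int sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.get();   // each read faults mapped pages into RSS
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Create a tiny stand-in for a cached block, map it, and sum it.
    static int demo() {
        try {
            Path block = Files.createTempFile("cached-block", ".bin");
            Files.write(block, new byte[]{1, 2, 3, 4});
            int sum = mapAndSum(block);
            Files.deleteIfExists(block);
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 10
    }
}
```

None of this memory is allocated from the JVM heap, which is why the
container's observed RSS can far exceed -Xmx while nothing is actually
leaking.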
Thanks for any opinions,
cheers,
Jan