Hello,

I have a question about the way LinuxResourceCalculatorPlugin calculates the memory consumed by a process tree (it does so via the ProcfsBasedProcessTree class). When we enable disk caching in Apache Spark jobs running on a YARN cluster, the node manager starts killing containers while they read the cached data, with "Container is running beyond memory limits ...".

The reason is that even when smaps parsing is enabled (yarn.nodemanager.container-monitor.procfs-tree.smaps-based-rss.enabled), ProcfsBasedProcessTree counts mmapped read-only pages as memory consumed by the process tree, and Spark uses FileChannel.map(MapMode.READ_ONLY) to read the cached data. The JVM then appears to consume *a lot* more memory than the configured heap size (and this cannot really be controlled), but in my opinion that memory is not really consumed by the process: the kernel can reclaim those pages whenever it needs to.

My question is: is there any explicit reason why "Private_Clean" pages are counted as consumed by the process tree? I patched ProcfsBasedProcessTree not to count them, but I don't know whether that is the "correct" solution.
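To make the scenario concrete, here is a minimal, self-contained sketch (this is not the actual ProcfsBasedProcessTree code; the class name and file path are just illustrative) that maps a file read-only the way Spark reads cached blocks and then sums /proc/self/smaps twice, once counting Private_Clean pages and once ignoring them:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Illustration only -- not the real ProcfsBasedProcessTree code.
public class SmapsPrivateCleanDemo {

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "/tmp/cached-block.bin");
        if (!Files.exists(file)) {
            // Create a dummy "cached block" so the demo is self-contained.
            Files.write(file, new byte[16 * 1024 * 1024]);
        }

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The same kind of mapping Spark creates when reading disk-cached data.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // Touch every page so the mapping becomes resident and shows up in smaps.
            for (int i = 0; i < buf.limit(); i += 4096) {
                buf.get(i);
            }

            System.out.println("counting Private_Clean : " + sumPrivateKb(true) + " kB");
            System.out.println("ignoring Private_Clean : " + sumPrivateKb(false) + " kB");
        }
    }

    // Sums the private page counters from /proc/self/smaps; when
    // countPrivateClean is false, Private_Clean pages (e.g. resident pages
    // of a read-only file mapping) are left out of the total.
    private static long sumPrivateKb(boolean countPrivateClean) throws IOException {
        long totalKb = 0;
        List<String> lines = Files.readAllLines(Paths.get("/proc/self/smaps"));
        for (String line : lines) {
            if (line.startsWith("Private_Dirty:")
                    || (countPrivateClean && line.startsWith("Private_Clean:"))) {
                String[] fields = line.split("\\s+");
                totalKb += Long.parseLong(fields[1]);   // the value is in kB
            }
        }
        return totalKb;
    }
}

The difference between the two totals should be dominated by the resident pages of the mapped file (assuming no other process has it mapped), which is the same effect that pushes our containers over their limit; the second total is essentially what my patched ProcfsBasedProcessTree reports.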

Thanks for any opinions,
 cheers,
 Jan


