+1 for ExitOnOutOfMemoryError. +HeapDumpOnOutOfMemoryError may produce very large files. How do we clean up these old files? WDYT?
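
One possible way to handle the cleanup, as a rough sketch rather than anything IoTDB ships today: a small script, run periodically, that prunes old dumps. It assumes the dumps land in a dedicated directory (for example one set with `-XX:HeapDumpPath`) and keep the default `.hprof` extension. The function name `prune_heap_dumps` and the parameters `keep_newest` and `max_age_days` are hypothetical, chosen only for illustration.

```python
import time
from pathlib import Path


def prune_heap_dumps(dump_dir, keep_newest=2, max_age_days=7.0, now=None):
    """Delete old *.hprof files in dump_dir and return the deleted paths.

    The keep_newest most recent dumps are always kept; the remaining ones
    are deleted only if they are older than max_age_days.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    # Sort newest first by modification time.
    dumps = sorted(
        Path(dump_dir).glob("*.hprof"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    deleted = []
    for path in dumps[keep_newest:]:
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

Always keeping the newest few dumps means the most recent OOM can still be analyzed even if nobody looks at it for a while.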

Thanks
ZhangJian He

On Tue, 14 Nov 2023 at 12:04, Gaofei Cao <[email protected]> wrote:

> +1,
>
> The `-XX:+ExitOnOutOfMemoryError` parameter can avoid the loss of some
> key threads, so it will be beneficial to the system.
> If the IoTDB cluster is deployed on k8s, this parameter is even more
> indispensable, because k8s can rapidly dispatch another pod to replace
> the OOM node.
> Besides, I think we can document the usage of `-XX:+ExitOnOutOfMemoryError`
> and `-XX:+HeapDumpOnOutOfMemoryError` in the user/DBA manual, which is
> important for finding the root cause of an OOM.
>
> Best,
> ----------------------
> Gaofei Cao
>
> On Mon, 13 Nov 2023 at 19:52, Yuan Tian <[email protected]> wrote:
> >
> > Hi all,
> >
> > Recently, we found in some real user cases that when OOM occurs in the
> > DataNode process (we should ensure that OOM does not happen, but we all
> > know that bugs will always exist), some threads (e.g. RPC listening
> > threads) may exit unexpectedly, which may cause some strange things to
> > happen. For example, the heartbeat listening thread on the DataNode may
> > unexpectedly exit due to OOM. Even if the OOM then recovers on its own
> > (some large queries end, or some compaction tasks end), this thread will
> > never be restarted, leaving the DataNode in an unknown state because
> > the ConfigNode can no longer contact it via heartbeat.
> >
> > Therefore, we feel that OOM is a high-risk error, and we should let the
> > process exit directly to avoid the loss of some key threads.
> >
> > I also did an experiment and found that -XX:+ExitOnOutOfMemoryError and
> > -XX:+HeapDumpOnOutOfMemoryError do not conflict, which means that we can
> > keep both in the JVM args; when OOM happens, the JVM will first dump the
> > heap and then exit.
> >
> > I've made this change in my PR (https://github.com/apache/iotdb/pull/11531).
> >
> > What do you think?
> >
> > Best,
> > ----------------------
> > Yuan Tian
