[ https://issues.apache.org/jira/browse/IGNITE-6171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264634#comment-16264634 ]
Dmitriy Sorokin commented on IGNITE-6171:
-----------------------------------------

Anton Vinogradov, please review the new patch.

> Native facility to control excessive GC pauses
> ----------------------------------------------
>
>                 Key: IGNITE-6171
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6171
>             Project: Ignite
>          Issue Type: Task
>          Components: general
>    Affects Versions: 2.3
>            Reporter: Vladimir Ozerov
>            Assignee: Dmitriy Sorokin
>              Labels: iep-7, usability
>             Fix For: 2.4
>
>
> Ignite is a Java-based application. If a node experiences long GC pauses, it may
> negatively affect other nodes. We need a way to detect long GC pauses
> within the process and trigger some action in response, e.g. a node stop.
> This is a kind of Inception [1]: you need to realize that you are asleep
> while sleeping. Since all Java threads are blocked at a safepoint, we cannot use
> a Java thread to detect the JVM's own GC pause. Native threads should be used instead.
> Proposed solution:
> 1) Thread 1 periodically calls a dummy JNI method that returns the current time,
> and stores this time in a shared variable;
> 2) Thread 2 periodically checks that variable. If it has not
> changed for some time, we are most likely in a GC pause. Once a certain
> threshold is reached, trigger a compensating action, whether that is a
> warning, a process kill, or something else.
> Justification: crossing the native -> Java boundary involves safepoints. This
> way Thread 1 will be trapped if an STW pause is in progress. The Java method cannot
> be empty, because the JVM is smart enough to reduce it to a no-op.
> [1] http://www.imdb.com/title/tt1375666/

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
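
The shared-variable logic from the proposed solution can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the patch under review: the class name GcPauseWatchdog, its methods, and the 50 ms threshold are all hypothetical. It also replaces the native threads and the JNI call with ordinary Java threads, so the watchdog here is itself frozen during a stop-the-world pause and can only notice the stall after the pause ends. In the design described in the issue, the checking thread would be native for exactly that reason.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the heartbeat/watchdog idea; not Ignite code. */
public class GcPauseWatchdog {
    /** The "shared variable": timestamp of the last heartbeat, in nanoseconds. */
    private final AtomicLong lastBeatNanos = new AtomicLong(System.nanoTime());

    /** Heartbeat age above which we assume a long pause (assumed parameter). */
    private final long thresholdNanos;

    public GcPauseWatchdog(long thresholdMillis) {
        this.thresholdNanos = thresholdMillis * 1_000_000L;
    }

    /** Thread 1 body: publish the current time. The proposal routes this through JNI. */
    public void beat(long nowNanos) {
        lastBeatNanos.set(nowNanos);
    }

    /** Thread 2 check: true if the last heartbeat is older than the threshold. */
    public boolean isStalled(long nowNanos) {
        return nowNanos - lastBeatNanos.get() > thresholdNanos;
    }

    public static void main(String[] args) throws InterruptedException {
        GcPauseWatchdog watchdog = new GcPauseWatchdog(50);

        // Thread 1: updates the shared timestamp every 10 ms.
        Thread heartbeat = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                watchdog.beat(System.nanoTime());
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, "heartbeat");
        heartbeat.setDaemon(true);
        heartbeat.start();

        // Thread 2 (here: the main thread): polls the timestamp every 20 ms.
        for (int i = 0; i < 20; i++) {
            Thread.sleep(20);
            if (watchdog.isStalled(System.nanoTime())) {
                // Compensating action; the issue mentions a warning or a process kill.
                System.out.println("WARN: heartbeat stalled, possible long pause");
            }
        }
        heartbeat.interrupt();
    }
}
```

Passing the clock value into beat and isStalled keeps the threshold check deterministic and easy to verify without real pauses. Whether main prints a warning depends on scheduling and on any pauses that occur during the run.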