[ https://issues.apache.org/jira/browse/HADOOP-11361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365687#comment-15365687 ]
Vinayakumar B commented on HADOOP-11361: ---------------------------------------- bq. If one thread enters the above method with Time.now() - jmxCacheTS >= jmxCacheTTL being false, then getAllMetrics will continue to be false, then when updateAttrCache() is called, it will hit NULL lastRecs >From your earlier comment linked, if {{Time.now() - jmxCacheTS >= >jmxCacheTTL}} is false, doesnt it return from there itself? bq. It looks to me that it's actually not a race condition issue (otherwise please would you please explain what race condition you are seeing), Race condition is there between two threads calling {{updateJmxCache()}} at same time. {code} if (getAllMetrics) { MetricsCollectorImpl builder = new MetricsCollectorImpl(); getMetrics(builder, true); } synchronized(this) { updateAttrCache(); if (getAllMetrics) { updateInfoCache(); } jmxCacheTS = Time.now(); lastRecs = null; // in case regular interval update is not running lastRecsCleared = true; }{code} 1. Consider both threads t1 and t2 waiting at the entrance of {{synchronized(this) {}}, i.e. line no : 183. 2. t1 gets inside and makes the {{lastRecs = null}} and {{lastRecsCleared=true}} 3. now since t2 already passed the line of reloading the {{lastRecs}} in {{getMetrics(..)}} line, t2 will get NPE in {{updateAttrCache()}}. bq. Given that the code is doing lastRecsCleared = true;, the comment "Not making LastRecs null ..." is a bit contradictory. To make it less contradictory, how about renaming {{lastRecsCleared}} to something, {{reloadRecs}}? bq. . So seems to me that the solution would be to fix MetricsSourceAdapter#updateAttrCache to let it check whether lastRecs is NULL, and not to access it if so. This also works, since the {{lastRecs}} is used only to update {{attrCache}} and {{infoCache}}, avoiding updating these in case of {{lastRecs == null}} will not be a problem. > Fix a race condition in MetricsSourceAdapter.updateJmxCache > ----------------------------------------------------------- > > Key: HADOOP-11361 > URL: https://issues.apache.org/jira/browse/HADOOP-11361 > Project: Hadoop Common > Issue Type: Bug > Affects Versions: 2.4.1, 2.5.1, 2.6.0 > Reporter: Brahma Reddy Battula > Assignee: Brahma Reddy Battula > Attachments: HADOOP-111361-003.patch, HADOOP-11361-002.patch, > HADOOP-11361-004.patch, HADOOP-11361.patch, HDFS-7487.patch > > > {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateAttrCache(MetricsSourceAdapter.java:247) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:177) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getAttribute(MetricsSourceAdapter.java:102) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org