[ https://issues.apache.org/jira/browse/HADOOP-12593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032618#comment-15032618 ]
Colin Patrick McCabe commented on HADOOP-12593: ----------------------------------------------- [~gliptak], as the comment you pasted states, "there is only a single writer to thread-local StatisticsData objects." A single writer cannot race with itself, as [~sjlee0] commented. [~sjlee0] wrote: bq. I'd be the first to admit that the way it is written is not very clear or intentional, but t appears correct. The only exception to the single-writer rule is Statistics.rootData. However, that is also currently protected by the Statistics instance lock. Again, that could be made more explicit (e.g. sometimes synchronization is done in a couple of layers above in the call chain). Hmm. The declaration of {{rootData}} has a comment saying that it is protected by the Statistics lock. There's also a comment near {{reset}} talking about the locking, commenting that "Both reads and writes to rootData are done under the lock, so we're free to modify rootData from any thread that holds the lock". Are other places we could add comments to make it clearer? To be honest, I think the locking here is better documented than most places in our code base. I did a quick scan of our uses of {{volatile long}}, and the only place I can find that might be a problem is here: {{hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java}} might have a "lost update" here: {code} void benchmarkOne() throws IOException { for(int idx = 0; idx < opsPerThread; idx++) { if((localNumOpsExecuted+1) % statsOp.ugcRefreshCount == 0) refreshUserMappingsProto.refreshUserToGroupsMappings(); long stat = statsOp.executeOp(daemonId, idx, arg1); localNumOpsExecuted++; localCumulativeTime += stat; <=== should be using AtomicLong } } {code} {{localCumulativeTime}} should be an AtomicLong to prevent lost updates. Perhaps we should change this JIRA to be about fixing this issue. Or maybe we should close this and file the {{localCumulativeTime}} issue as a follow-on, to make it less confusing. [~steve_l], what do you think? > multiple "volatile long" field declarations exist in the Hadoop codebase > ------------------------------------------------------------------------ > > Key: HADOOP-12593 > URL: https://issues.apache.org/jira/browse/HADOOP-12593 > Project: Hadoop Common > Issue Type: Bug > Affects Versions: 2.8.0 > Reporter: Steve Loughran > Priority: Minor > > If you get your IDE to scan for "volatile long", you find 20-30 entries. > Volatile operations on `long` variables are not guaranteed to be atomic, so > these usages can be vulnerable to race conditions generating invalid data. > they need to be replaced by AtomicLong references, except in the specific > case that you want performance values for statistics, and are prepared to > take the risk -- This message was sent by Atlassian JIRA (v6.3.4#6332)