[ https://issues.apache.org/jira/browse/SPARK-21425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088920#comment-16088920 ]
Sean Owen commented on SPARK-21425:
-----------------------------------

I see, it's really Main. It looks like having the object declared as a static shared variable makes the difference, though in principle that wouldn't matter. It could be specific to this kind of setup, with local execution, but may still be an issue.

On the flip side, I don't see much downside to making writes on these thread-safe. If it's necessary for correctness, well, it's necessary. If it's not, then it doesn't create contention between threads, and at most this is paying a cost to acquire a lock to write (which might be elided, but probably not in this case). CollectionAccumulator is already very nearly thread safe anyway (minus setValue).

At the moment it seems like that would be good practice, but I am not 100% clear on why it was not created that way in the first place.

> LongAccumulator, DoubleAccumulator not threadsafe
> -------------------------------------------------
>
>                 Key: SPARK-21425
>                 URL: https://issues.apache.org/jira/browse/SPARK-21425
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Ryan Williams
>            Priority: Minor
>
> [AccumulatorV2 docs|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L42-L43] acknowledge that accumulators must be concurrent-read-safe, but afaict they must also be concurrent-write-safe.
> The same docs imply that {{Int}} and {{Long}} meet either/both of these criteria, when afaict they do not.
> Relatedly, the provided [LongAccumulator|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L291] and [DoubleAccumulator|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L370] are not thread-safe, and should be expected to behave undefinedly when multiple concurrent tasks on the same executor write to them.
> [Here is a repro repo|https://github.com/ryan-williams/spark-bugs/tree/accum] with some simple applications that demonstrate incorrect results from {{LongAccumulator}}'s.
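
To make the trade-off in the comment above concrete, here is a minimal sketch in plain Java. It does not use Spark at all: the names AccumulatorDemo, LongAcc, PlainLongAcc, SyncLongAcc and hammer are hypothetical and are not part of Spark's AccumulatorV2 API. It only illustrates the general point that an unsynchronized "sum += v" on a shared long is a read-modify-write that can lose updates when several threads write at once, while guarding the write with a lock gives the exact total.

```java
import java.util.ArrayList;
import java.util.List;

public class AccumulatorDemo {

    /** Hypothetical minimal accumulator interface; NOT Spark's AccumulatorV2. */
    interface LongAcc {
        void add(long v);
        long sum();
    }

    /** Unsynchronized write: "sum += v" is a separate load, add and store. */
    static final class PlainLongAcc implements LongAcc {
        private long sum = 0L;
        public void add(long v) { sum += v; }
        public long sum() { return sum; }
    }

    /** Same state, but reads and writes are guarded by the object's monitor. */
    static final class SyncLongAcc implements LongAcc {
        private long sum = 0L;
        public synchronized void add(long v) { sum += v; }
        public synchronized long sum() { return sum; }
    }

    /** Starts `threads` writers that each call add(1) `perThread` times, joins them, returns the sum. */
    static long hammer(LongAcc acc, int threads, int perThread) {
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    acc.add(1L);
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join() makes the workers' writes visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return acc.sum();
    }

    public static void main(String[] args) {
        int threads = 8;
        int perThread = 100_000;
        System.out.println("expected     = " + ((long) threads * perThread));
        System.out.println("plain        = " + hammer(new PlainLongAcc(), threads, perThread));
        System.out.println("synchronized = " + hammer(new SyncLongAcc(), threads, perThread));
    }
}
```

On a multi-core machine the "plain" figure will usually come out below the expected total, though a correct result on any given run is not ruled out; the "synchronized" figure always matches. Whether the same interleaving actually occurs inside a Spark executor depends on how tasks share accumulator instances, which this sketch does not model.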