, Alexander; dev@spark.apache.org
Subject: Re: Is breeze thread safe in Spark?
RJ, could you provide a code example that reproduces the bug you observed
in local testing? Breeze's += is not thread-safe. But in a Spark job, calls to
a resultHandler are synchronized:
https://github.com/apache/spark
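To make the "synchronized handler" point concrete, here is a minimal sketch in plain Java. It is not Spark's actual code: the class name, handleResult, snapshot, and run are all hypothetical names chosen for illustration. Each task builds its own partial result, and only the merge into the shared accumulator is guarded by a lock, so the in-place update is never executed by two threads at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Spark's or Breeze's implementation.
public class SynchronizedMergeSketch {
    private final double[] total;

    SynchronizedMergeSketch(int n) {
        total = new double[n];
    }

    // In-place merge (the analogue of +=), guarded by the object's lock
    // so only one thread mutates `total` at a time.
    synchronized void handleResult(double[] partial) {
        for (int i = 0; i < total.length; i++) {
            total[i] += partial[i];
        }
    }

    // Return a copy so callers never see the shared array directly.
    synchronized double[] snapshot() {
        return total.clone();
    }

    // Start `tasks` threads; each fills a thread-local array with 1.0
    // and hands it to the synchronized handler.
    static double[] run(int tasks, int n) {
        SynchronizedMergeSketch acc = new SynchronizedMergeSketch(n);
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < tasks; t++) {
            Thread th = new Thread(() -> {
                double[] partial = new double[n]; // owned by this thread only
                Arrays.fill(partial, 1.0);
                acc.handleResult(partial);
            });
            threads.add(th);
            th.start();
        }
        for (Thread th : threads) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return acc.snapshot();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(8, 3))); // [8.0, 8.0, 8.0]
    }
}
```

Without the synchronized keyword on handleResult, the same loop could lose updates when two threads interleave, which is the failure mode being asked about.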
Hi,
Is the Breeze library thread safe when called from Spark MLlib code, in the
case where native libraries for BLAS and LAPACK are used? Might it be an issue
when running Spark locally?
Best regards, Alexander
David,
Can you confirm that += is not thread safe but + is? I'm assuming +
allocates a new object for the write, while += doesn't.
Thanks!
RJ
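The distinction in the question above can be sketched in plain Java. This is an analogy using raw arrays, not Breeze's implementation, and the names plus and plusAssign are hypothetical stand-ins for + and +=. The allocating version writes only to a fresh array, while the in-place version writes into its left operand, which is what makes it unsafe if that operand is shared between threads.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Breeze's implementation.
public class VectorAddSketch {
    // Analogue of `a + b`: allocates a new array and leaves both inputs untouched.
    static double[] plus(double[] a, double[] b) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    // Analogue of `a += b`: overwrites `a` in place, with no new allocation.
    static void plusAssign(double[] a, double[] b) {
        for (int i = 0; i < a.length; i++) {
            a[i] += b[i];
        }
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0};
        double[] b = {10.0, 20.0};

        double[] c = plus(a, b);
        System.out.println(Arrays.toString(a)); // [1.0, 2.0]   (a is unchanged)
        System.out.println(Arrays.toString(c)); // [11.0, 22.0]

        plusAssign(a, b);
        System.out.println(Arrays.toString(a)); // [11.0, 22.0] (a was mutated)
    }
}
```

Note that the allocating form is only safe for concurrent use as long as no other thread is mutating its inputs at the same time.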
On Wed, Sep 3, 2014 at 2:50 PM, David Hall d...@cs.berkeley.edu wrote:
In general, in Breeze we allocate separate work arrays for each call to
Additionally, at the higher level, MLlib allocates separate Breeze
Vectors/Matrices on a per-executor basis. The only place I can think of
where data structures might be overwritten concurrently is in a
.aggregate() call, and these calls happen sequentially.
RJ - Do you have a JIRA reference for
Never filed a JIRA -- I actually forgot about it. Let me file one now.
On Wed, Sep 3, 2014 at 2:58 PM, Evan R. Sparks evan.spa...@gmail.com
wrote:
Additionally, at the higher level, MLlib allocates separate Breeze
Vectors/Matrices on a per-executor basis. The only place I can think of
mutating operations are not thread safe. Operations that don't mutate
should be thread safe. I can't speak to what Evan said, but I would guess
that the way they're using += should be safe.
On Wed, Sep 3, 2014 at 11:58 AM, RJ Nowling rnowl...@gmail.com wrote:
David,
Can you confirm that +=
Here's the JIRA:
https://issues.apache.org/jira/browse/SPARK-3384
Even if the current implementation uses += in a thread-safe manner, it is
easy to accidentally use += in a parallelized context. I suggest changing
all instances of += to +.
I would encourage others to
RJ, could you provide a code example that reproduces the bug you
observed in local testing? Breeze's += is not thread-safe. But in a
Spark job, calls to a resultHandler are synchronized:
What about the allocation of a new Breeze vector? Can it happen unsafely within
Spark (across several threads)?
Best regards, Alexander
On 03.09.2014, at 23:17, Xiangrui Meng men...@gmail.com wrote:
RJ, could you provide a code example that reproduces the bug you
observed in local testing?