[ 
https://issues.apache.org/jira/browse/SPARK-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206143#comment-14206143
 ] 

Andrew Ash commented on SPARK-572:
----------------------------------

Static mutable variables are now a standard way of having code run on a 
per-executor basis.

To run code per entry you can use map(), and per partition you can use 
mapPartitions(), but to run code per executor you need static variables or 
initializers.  For example, if you want to open a connection to another data 
storage system and write all of an executor's data into that system, a static 
connection object is the common way to do that.
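
A minimal sketch of that pattern, runnable without a cluster. All names here 
(FakeConnection, ExecutorConnection, writePartition, Demo) are hypothetical 
and only for illustration; FakeConnection stands in for a real storage client. 
The sketch assumes that a Scala object is initialized once per JVM, so on a 
cluster each executor JVM would end up with its own connection.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a client of an external storage system.
final class FakeConnection {
  private val written = ArrayBuffer.empty[Int]
  // Tasks on one executor may run concurrently, so guard the shared state.
  def write(record: Int): Unit = synchronized { written += record }
  def count: Int = synchronized { written.size }
}

// Static holder: the lazy val is initialized on first use, once per JVM.
object ExecutorConnection {
  val timesOpened = new AtomicInteger(0)
  lazy val connection: FakeConnection = {
    timesOpened.incrementAndGet()
    new FakeConnection
  }
}

object Demo {
  // Per-partition function. With Spark this could be passed to
  // rdd.foreachPartition; here it is called directly.
  def writePartition(records: Iterator[Int]): Unit =
    records.foreach(r => ExecutorConnection.connection.write(r))

  // Simulates three partitions processed inside a single JVM and returns
  // (number of times the connection was opened, records written so far).
  def run(): (Int, Int) = {
    val partitions = Seq(Seq(1, 2, 3), Seq(4, 5), Seq(6))
    partitions.foreach(p => writePartition(p.iterator))
    (ExecutorConnection.timesOpened.get, ExecutorConnection.connection.count)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Every partition handled in the same JVM reuses the one connection, which is 
the per-executor behavior described above.  Nothing in this sketch closes the 
connection; a real job would need its own cleanup, for example a JVM shutdown 
hook.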

I propose closing this ticket as "Won't Fix".  Using this technique can be 
confusing, but prohibiting it is difficult and would introduce additional 
roadblocks for Spark power users.

cc [~rxin]

> Forbid update of static mutable variables
> -----------------------------------------
>
>                 Key: SPARK-572
>                 URL: https://issues.apache.org/jira/browse/SPARK-572
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: tjhunter
>
> Consider the following piece of code:
> <pre>
> object Foo {
>  var xx = -1
>  def main() {
>    xx = 1
>    val sc = new SparkContext(...)
>    sc.broadcast(xx)
>    sc.parallelize(0 to 10).map(i=>{ ... xx ...})
>  }
> }
> </pre>
> Can you guess the value of xx? It is 1 when you use the local scheduler and 
> -1 when you use the mesos scheduler. Given the complications, it should 
> probably just be forbidden for now...



