[ https://issues.apache.org/jira/browse/SPARK-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116327#comment-14116327 ]
Sean Owen commented on SPARK-3326:
----------------------------------

The call to Foo.getSome() occurs remotely, on a different JVM with a different copy of your class. You may initialize your instance in the driver, but that leaves it uninitialized on the remote workers. You can initialize it in a static block (in Scala, the object's constructor body), or you can reference the value of Foo.getSome() directly in your map function so that the value itself is serialized in the closure. All that you send right now is a function that depends on what Foo.getSome() returns when it is called, not what it happens to return on the driver. Consider a broadcast variable if the value is large. If that is what is going on, then this is normal behavior.

> can't access a static variable after init in mapper
> ---------------------------------------------------
>
>          Key: SPARK-3326
>          URL: https://issues.apache.org/jira/browse/SPARK-3326
>      Project: Spark
>   Issue Type: Bug
>  Environment: CDH 5.1.0
>               Spark 1.0.0
>     Reporter: Gavin Zhang
>
> I wrote an object like:
>
>   object Foo {
>     private var bar: Bar = null
>     def init(bar: Bar): Unit = {
>       this.bar = bar
>     }
>     def getSome() = bar.someDef()
>   }
>
> In the Spark main def, I read some text from HDFS and initialize this object, and after that I call getSome(). I was successful with this code:
>
>   sc.textFile(args(0)).take(10).map(println(Foo.getSome()))
>
> However, when I changed it to write the output to HDFS, I found that the bar variable in the Foo object is null:
>
>   sc.textFile(args(0)).map(line => Foo.getSome()).saveAsTextFile(args(1))
>
> WHY?
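The remedies described above can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes a live SparkContext `sc`, the reporter's `Bar`/`someDef()` types, and a hypothetical `loadBar()` helper standing in for however `bar` is actually constructed.

```scala
// Option 1: initialize in the object body itself. An object's constructor
// body is the Scala analogue of a Java static initializer, so it runs on
// every JVM (driver and each executor) when Foo is first referenced there.
object Foo {
  private val bar: Bar = loadBar()   // hypothetical loader, replaces init()
  def getSome(): String = bar.someDef()
}

// Option 2: evaluate the value once on the driver and capture it in the
// closure; the captured value (not the Foo object) is what gets serialized
// and shipped to the workers.
val some = Foo.getSome()
sc.textFile(args(0))
  .map(line => some)
  .saveAsTextFile(args(1))

// Option 3: if the value is large, ship it once per executor as a
// broadcast variable instead of once per task.
val someB = sc.broadcast(Foo.getSome())
sc.textFile(args(0))
  .map(line => someB.value)
  .saveAsTextFile(args(1))
```

Options 2 and 3 both avoid calling Foo.getSome() on the workers at all, which is why they sidestep the uninitialized-object problem entirely.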