[ 
https://issues.apache.org/jira/browse/SPARK-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227319#comment-14227319
 ] 

Brennon York commented on SPARK-4170:
-------------------------------------

[~srowen] this looks less like a Spark bug than a consequence of how Scala 
defines {code}App{code}: behavior differs between runtime and compile time 
because {code}App{code} defers the object's body through the 
[{code}delayedInit{code} 
function|http://www.scala-lang.org/api/2.11.1/index.html#scala.App].
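
To make the delayed initialization concrete, here is a minimal sketch without 
Spark. It assumes Scala 2.x, where {code}App{code} extends 
{code}DelayedInit{code}; the object names are hypothetical and only for 
illustration.

```scala
// Assumes Scala 2.x: the body of an object extending App is stored by
// delayedInit and only executed when main() is invoked.
object DelayedInitDemo extends App {
  val str1 = "A"
  println("inside body: str1 = " + str1)
}

// Hypothetical driver that touches the field before and after main() runs.
object Probe {
  def main(args: Array[String]): Unit = {
    // The object is constructed here, but its body has not run yet,
    // so under Scala 2.x the field still holds its default value (null).
    println("before main: str1 = " + DelayedInitDemo.str1)

    // Running main() executes the deferred body and assigns the field.
    DelayedInitDemo.main(Array.empty[String])
    println("after main: str1 = " + DelayedInitDemo.str1)
  }
}
```

Any code that reaches the object without going through its {code}main{code} 
sees the uninitialized field, which is probably what happens when a closure 
referring to such a field is evaluated in another JVM.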

I tried to replicate the issue on my local machine in both the compile-time 
and runtime settings, and only the latter produced it (as expected from the 
Scala documentation). For the former I created a simple application, compiled 
it with sbt, and executed it; for the latter I set the code up within the 
{code}spark-shell{code} REPL. I'm wondering if we can close this issue and 
just add a bit of documentation noting that, even in simple Spark apps, 
extending the {code}App{code} trait results in delayed initialization and, 
likely, null values within that closure. 
Thoughts?
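
If we go the documentation route, the note could show the explicit 
{code}main{code} form next to the failing pattern. A rough sketch follows, 
assuming Spark is on the classpath and the master URL is supplied externally 
(for example by spark-submit); the object and application names are 
hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical rewrite of the reported example with an explicit main().
object DemoFixed {
  def main(args: Array[String]): Unit = {
    // Master URL is assumed to come from the launcher, not from this code.
    val conf = new SparkConf().setAppName("DemoFixed")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(List("A", "B", "C", "D"))
      // str1 is a local val, so the closure captures its value directly
      // instead of referring to a field of a delayed-initialized object.
      val str1 = "A"
      val rslt2 = rdd.filter(x => str1 != null && x != "A").count()
      println("DemoFixed: rslt2 = " + rslt2)
    } finally {
      sc.stop()
    }
  }
}
```

According to the report quoted below, the {code}main(){code} variant behaves 
as expected, so {code}rslt2{code} should match {code}rslt1{code} there.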

> Closure problems when running Scala app that "extends App"
> ----------------------------------------------------------
>
>                 Key: SPARK-4170
>                 URL: https://issues.apache.org/jira/browse/SPARK-4170
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.1.0
>            Reporter: Sean Owen
>            Priority: Minor
>
> Michael Albert noted this problem on the mailing list 
> (http://apache-spark-user-list.1001560.n3.nabble.com/BUG-when-running-as-quot-extends-App-quot-closures-don-t-capture-variables-td17675.html):
> {code}
> import org.apache.spark.{SparkConf, SparkContext}
>
> object DemoBug extends App {
>     val conf = new SparkConf()
>     val sc = new SparkContext(conf)
>     val rdd = sc.parallelize(List("A","B","C","D"))
>     val str1 = "A"
>     val rslt1 = rdd.filter(x => { x != "A" }).count
>     val rslt2 = rdd.filter(x => { str1 != null && x != "A" }).count
>     
>     println("DemoBug: rslt1 = " + rslt1 + " rslt2 = " + rslt2)
> }
> {code}
> This produces the output:
> {code}
> DemoBug: rslt1 = 3 rslt2 = 0
> {code}
> If instead there is a proper "main()", it works as expected.
> I also noticed this week that in a program which "extends App", some values 
> were inexplicably null in a closure. When changing to use main(), it was fine.
> I assume there is a problem with variables not being added to the closure 
> when main() doesn't appear in the standard way.



