[jira] [Commented] (SPARK-12760) inaccurate description for difference between local vs cluster mode in closure handling
[ https://issues.apache.org/jira/browse/SPARK-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110908#comment-15110908 ]

Apache Spark commented on SPARK-12760:
--------------------------------------

User 'mortada' has created a pull request for this issue:
https://github.com/apache/spark/pull/10867

> inaccurate description for difference between local vs cluster mode in
> closure handling
> -----------------------------------------------------------------------
>
>                 Key: SPARK-12760
>                 URL: https://issues.apache.org/jira/browse/SPARK-12760
>             Project: Spark
>          Issue Type: Bug
>          Components: Documentation
>            Reporter: Mortada Mehyar
>            Priority: Minor
>
> In the spark documentation there's an example for illustrating how `local`
> and `cluster` mode can differ
> http://spark.apache.org/docs/latest/programming-guide.html#example
> " In local mode with a single JVM, the above code will sum the values within
> the RDD and store it in counter. This is because both the RDD and the
> variable counter are in the same memory space on the driver node."
> However the above doesn't seem to be true. Even in `local` mode it seems like
> the counter value should still be 0, because the variable will be summed up
> in the executor memory space, but the final value in the driver memory space
> is still 0. I tested this snippet and verified that in `local` mode the value
> is indeed still 0.
> Is the doc wrong or perhaps I'm missing something the doc is trying to say?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
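The behaviour the reporter describes (the counter is summed in the executor's memory space while the driver's value stays 0) can be sketched without Spark at all. The following is a pure-Python simulation, not Spark code: `Counter`, `run_task`, and `simulate_foreach` are hypothetical names, and the deep copy is only an assumed stand-in for the closure being serialized and shipped to each task.

```python
import copy


class Counter:
    """Mutable holder standing in for the `counter` variable captured by the closure."""

    def __init__(self):
        self.value = 0


def run_task(closure_state, partition):
    """Hypothetical stand-in for an executor task; it only touches its own copy."""
    for x in partition:
        closure_state.value += x
    return closure_state.value


def simulate_foreach(driver_counter, data, num_partitions=2):
    """Give every simulated task a deep copy of the captured state (assumption:
    this mimics serializing the closure before it is sent to an executor)."""
    partitions = [data[i::num_partitions] for i in range(num_partitions)]
    task_totals = []
    for partition in partitions:
        shipped = copy.deepcopy(driver_counter)  # the task never sees the driver's object
        task_totals.append(run_task(shipped, partition))
    return task_totals


counter = Counter()
task_totals = simulate_foreach(counter, [1, 2, 3, 4, 5])
print("per-task totals:", task_totals)    # each copy was updated
print("driver counter:", counter.value)   # the original object was not
```

In this sketch every copy accumulates its partition's values, yet the driver-side `counter` is never modified, which matches the reporter's observation that the value is still 0. For a real global aggregation, the Spark programming guide points to an `Accumulator` rather than a captured variable.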
[jira] [Commented] (SPARK-12760) inaccurate description for difference between local vs cluster mode in closure handling
[ https://issues.apache.org/jira/browse/SPARK-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110814#comment-15110814 ]

Apache Spark commented on SPARK-12760:
--------------------------------------

User 'srowen' has created a pull request for this issue:
https://github.com/apache/spark/pull/10866
[jira] [Commented] (SPARK-12760) inaccurate description for difference between local vs cluster mode in closure handling
[ https://issues.apache.org/jira/browse/SPARK-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096661#comment-15096661 ]

Mortada Mehyar commented on SPARK-12760:
----------------------------------------

Hi, sure definitely. I'll create a PR and link to this JIRA ticket.