[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-10-27 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 @gatorsmile Any suggestion for this issue? Leave it as is for now?

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-10-27 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 Sure.

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-10-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18687 @viirya Could you close this PR?

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-08-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 ping @cloud-fan Please help review this too. Thanks.

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18687 This really depends on how you implement the global statement cache and management. All the compiled plans can be stored in the cache. The plans can be reused, if possible (the reused plans might

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-23 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 cc @cloud-fan Can you help review this too? Thanks.

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 There is another case we should fix. You can see that the storage level of ds2 is StorageLevel.NONE, but its executed plan is still the cached version. scala> Seq("1", "2").toDF().write.saveAsTab

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 @gatorsmile This raises another question: when a table is cached in one session and another session uncaches it, is the original table now cached or uncached?

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 Ok. Sounds reasonable. I'm preparing a new fix for the case.

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18687 My above example is pretty common in many Spark SQL use cases. Many users rely on it. As long as one table is cached in one session, the other sessions can use the cached table without reading th

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 @gatorsmile Thanks for reporting that. It is hard to argue that the reported case is semantically valid. Actually, ds1 and ds2 are two different Datasets. Semantically, you cache one Dataset, why

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18687 Thanks for fixing this, but this PR does not fix all the cases that are caused by our materialized plans in the QueryExecution. For example, ```Scala Seq("1", "2").toDF().write.saveAsTa

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 Btw, `QueryExecution.toRdd` executes the executed plan, so once it is materialized by the Dataset before persist, it still executes the uncached executed plan. There are a few places in Dataset calling

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-21 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18687 Should `Dataset` be thread-safe? cc @rxin @gatorsmile

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18687 cc @cloud-fan

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18687 Merged build finished. Test PASSed.

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18687 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79791/ Test PASSed.

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18687 **[Test build #79791 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79791/testReport)** for PR 18687 at commit [`3c16c3c`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18687 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79790/ Test FAILed.

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18687 Merged build finished. Test FAILed.

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18687 **[Test build #79790 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79790/testReport)** for PR 18687 at commit [`6b21c6b`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18687 **[Test build #79791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79791/testReport)** for PR 18687 at commit [`3c16c3c`](https://github.com/apache/spark/commit/3c

[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18687 **[Test build #79790 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79790/testReport)** for PR 18687 at commit [`6b21c6b`](https://github.com/apache/spark/commit/6b