[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-05-06 Thread RussellSpitzer
Github user RussellSpitzer commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-217483240 Sorry I forgot about this, I'll clean this up tomorrow and get it ready --- If your project is set up for it, you can reply to this email and have your reply app

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-05-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-217465276 ping @RussellSpitzer

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-04-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-212184245 Maybe we might have to correct the title just like the others `[SPARK-][SQL]` (this is described in https://cwiki.apache.org/confluence/display/SPARK/Contributi

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-24 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-174386527 no problem! Thank you :)

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-22 Thread RussellSpitzer
Github user RussellSpitzer commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-174113049 Haven't forgotten this; will have a new PR soon :)

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-17 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-172371851 @RussellSpitzer Thank you for the comment. I totally agree with you. As mentioned before, my only concern is that for ORC/Parquet, we will not be able to see push

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-15 Thread RussellSpitzer
Github user RussellSpitzer commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-172145152 I personally think the ambiguous `PUSHED_FILTERS` is more confusing. When we see a predicate there we have no idea whether or not it is a valid filter for the so

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-15 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-172140783 @RussellSpitzer Thanks for the change. I have thought about it again. My only concern is that `unhandledPredicates` actually contains filters that can be handled by the d

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-172139084 Merged build finished. Test FAILed.

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-172139079 **[Test build #49502 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49502/consoleFull)** for PR 10655 at commit [`5a0daf6`](https://g

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-172139086 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-172137996 **[Test build #49502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49502/consoleFull)** for PR 10655 at commit [`5a0daf6`](https://gi

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-15 Thread RussellSpitzer
Github user RussellSpitzer commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-172136871 @yhuai I removed the PushedFilters and added the other examples. We could re-add the "PushedFilters" if you like. I wasn't sure if you still wanted that. I'm sti

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170613555 Merged build finished. Test FAILed.

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170613562 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170613543 **[Test build #49155 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49155/consoleFull)** for PR 10655 at commit [`42cc0e7`](https://g

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170611622 **[Test build #49155 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49155/consoleFull)** for PR 10655 at commit [`42cc0e7`](https://gi

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread RussellSpitzer
Github user RussellSpitzer commented on a diff in the pull request: https://github.com/apache/spark/pull/10655#discussion_r49342098 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -114,6 +114,7 @@ private[sql] object PhysicalRDD { //

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170604593 ok to test

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170604576 @RussellSpitzer Thank you for the PR! The change looks good. Can you also try ORC and Parquet tables and attach the before/after change to the PR description?

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10655#discussion_r49341415 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -114,6 +114,7 @@ private[sql] object PhysicalRDD { // Metadata

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-11 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10655#discussion_r49341368 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -321,8 +321,8 @@ private[sql] object DataSourceSt

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-08 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170195683 OK I think I figured out why. "acc" is a boolean column.

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-08 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170195637 Thanks @RussellSpitzer. I will let @yhuai review and merge this. One question, do you know why the filter is "if (isnull(acc#2)) null else CASE 1000 WHEN 1 THEN ac

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-08 Thread RussellSpitzer
Github user RussellSpitzer commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-170064440 @rxin Added, basically I think the current "PushedFilters" list isn't very valuable if everything is listed there. So instead we should just list those filters w

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-07 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-169897099 Can you include a before/after change in the pull request description?

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-07 Thread RussellSpitzer
Github user RussellSpitzer commented on a diff in the pull request: https://github.com/apache/spark/pull/10655#discussion_r49153042 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -114,6 +114,7 @@ private[sql] object PhysicalRDD { //

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-07 Thread RussellSpitzer
Github user RussellSpitzer commented on a diff in the pull request: https://github.com/apache/spark/pull/10655#discussion_r49152876 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -321,8 +321,8 @@ private[sql] object Dat

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-07 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10655#discussion_r49151802 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -114,6 +114,7 @@ private[sql] object PhysicalRDD { // Metadata

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-07 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10655#discussion_r49150950 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -321,8 +321,8 @@ private[sql] object DataSourceSt

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-169862654 Can one of the admins verify this patch?

[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...

2016-01-07 Thread RussellSpitzer
GitHub user RussellSpitzer opened a pull request: https://github.com/apache/spark/pull/10655 SPARK-12639 SQL Improve Explain for Datasources with Handled Predicates SPARK-11661 Makes all predicates pushed down to underlying Datasources regardless of whether the source can handle
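
For readers skimming this thread, the distinction it keeps returning to (filters that are pushed to a data source versus filters the source actually handles) can be illustrated with a small standalone sketch. This is not Spark code and not the change made in the PR: `Filter`, `ToySource`, `unhandled_filters`, and `split_filters` are hypothetical names invented for illustration, and the rule that the toy source handles only equality filters is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Filter:
    """Hypothetical stand-in for a pushed-down predicate such as `a = 1`."""
    column: str
    op: str
    value: object


class ToySource:
    """Hypothetical data source that can evaluate only equality filters itself."""

    def unhandled_filters(self, filters: List[Filter]) -> List[Filter]:
        # Assumption for this sketch: anything that is not an equality test
        # is reported back as unhandled, so the engine must re-check it.
        return [f for f in filters if f.op != "="]


def split_filters(source: ToySource, pushed: List[Filter]) -> Tuple[List[Filter], List[Filter]]:
    """Split pushed filters into (handled by the source, still evaluated by the engine)."""
    unhandled = source.unhandled_filters(pushed)
    handled = [f for f in pushed if f not in unhandled]
    return handled, unhandled


# Both filters are "pushed", but only the equality filter is handled by the source.
pushed = [Filter("a", "=", 1), Filter("b", ">", 10)]
handled, unhandled = split_filters(ToySource(), pushed)
```

In this toy model, an explain output that listed every entry in `pushed` would not tell the reader which predicates the source really applies; listing `handled` separately would.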