[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14629 I haven't changed my mind on this. Let's close this one. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14629 @hvanhovell @rxin unless you've changed your stance a little on this, I think the conclusion is that it isn't worth changing this behavior, and we can close this, @WeichenXu123.
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14629 Interesting point, yeah. Normally in an RDBMS you have to write `COUNT(*)` or `COUNT(1)`, and the argument is useless anyway, so it would be nice not to have to provide an argument to select in this context. With `DataFrame`s, though, you should of course just call `count()` directly if that is what you want.
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14629 A `df.select()` without any columns is not useless IMO: You can still get a valid `count()` from a data frame.
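The point about `count()` can be illustrated with a toy model. This is only a sketch: the `Frame` case class and the `ZeroColumnSketch` object below are hypothetical stand-ins invented for illustration, not Spark's `DataFrame`. They assume a frame is a list of rows, so projecting away every column still leaves one (empty) row per input row.

```scala
// Toy model for illustration only; this is NOT Spark's DataFrame.
final case class Frame(columns: Seq[String], rows: Seq[Map[String, Any]]) {
  // Keep only the named columns in every row; an empty `names` keeps none.
  def select(names: String*): Frame =
    Frame(names, rows.map(row => row.filter { case (k, _) => names.contains(k) }))

  // The row count depends on the rows, not on how many columns each row has.
  def count(): Long = rows.size.toLong
}

object ZeroColumnSketch {
  def main(args: Array[String]): Unit = {
    val df = Frame(Seq("id", "name"), Seq(
      Map("id" -> 1, "name" -> "a"),
      Map("id" -> 2, "name" -> "b"),
      Map("id" -> 3, "name" -> "c")))
    // Selecting zero columns drops the schema but keeps every row.
    val empty = df.select()
    println(s"columns=${empty.columns.size}, rows=${empty.count()}")
  }
}
```

In this model the zero-column frame is well defined and its count matches the input, which is the sense in which an empty select is "not useless".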
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14629 MySQL does not allow a select with 0 columns, and I think `select()` is useless; no one would do such an operation. So would it be better to generate a compile error when code uses `df.select()`, since it is usually a coding mistake? Or is `df.select()` useful in some particular scenario?
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14629 Yes that's a good question. A 0-column DataFrame is valid, though that's a little different from being able to select 0 columns from a DataFrame. I don't have a database handy, but can you select no columns in any SQL syntax? Maybe best to emulate that?
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14629 Why do we want to enforce this? It is valid to have a DataFrame without any columns.
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14629 @srowen What do you think about this problem? I found that having two methods like `def select(cols: Column*)` and `def select(col: Column, cols: Column*)` makes calls ambiguous. I plan to rename `def select(cols: Column*)` to `def selectInternal(cols: Column*)` and mark `selectInternal` as `private[spark]`. Do you think that is reasonable?
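The rename proposed above can be sketched in isolation. Everything here is a hypothetical toy: `Column`, `Frame`, and `SelectSketch` are stand-ins written for illustration, not Spark's classes, and plain `private` is used where the comment proposes `private[spark]` because the sketch has no enclosing package. The idea is that once the zero-or-more variant no longer shares the name `select`, the only public overload requires a first column.

```scala
// Toy stand-ins for illustration only; these are NOT Spark's classes.
final case class Column(name: String)

final class Frame(val columns: Seq[Column]) {
  // Public entry point: the first column is mandatory, so `frame.select()`
  // does not type-check.
  def select(col: Column, cols: Column*): Frame =
    selectInternal(col +: cols: _*)

  // Stands in for the former public `select(cols: Column*)`. Renamed so it
  // no longer overloads `select`; it still accepts zero columns internally.
  private def selectInternal(cols: Column*): Frame = new Frame(cols)
}

object SelectSketch {
  def main(args: Array[String]): Unit = {
    val frame = new Frame(Seq(Column("a"), Column("b"), Column("c")))
    val projected = frame.select(Column("a"), Column("b"))
    println(projected.columns.map(_.name).mkString(","))
    // frame.select()  // would not compile: no overload takes zero arguments
  }
}
```

If both `select(cols: Column*)` and `select(col: Column, cols: Column*)` were kept under the same name, a call such as `frame.select(Column("a"))` would match both, which is the ambiguity the comment describes.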
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14629 Merged build finished. Test FAILed.
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14629 **[Test build #63724 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63724/consoleFull)** for PR 14629 at commit [`8060aa4`](https://github.com/apache/spark/commit/8060aa418138179286bd3d6eb64daac53610cadf). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14629 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63724/
[GitHub] spark issue #14629: [WIP][SPARK-17046][SQL] prevent user using dataframe.sel...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14629 **[Test build #63724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63724/consoleFull)** for PR 14629 at commit [`8060aa4`](https://github.com/apache/spark/commit/8060aa418138179286bd3d6eb64daac53610cadf).