[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user xwu0226 closed the pull request at: https://github.com/apache/spark/pull/12579 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user xwu0226 commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-218549665 @liancheng Thank you for the detailed explanation! Yes, if the goal is to make sure Spark SQL can handle the generated DDL, then we need to leave out some Hive features for now. I will close this PR.
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-218379687

Hey @xwu0226, sorry that I didn't explain why I opened another PR for the same issue; I was in a code rush for 2.0...

One of the considerations for all the native DDL commands is that we don't want these commands to rely on Hive anymore. This is because we'd like to remove the Hive dependency from Spark SQL core and gradually make Hive a separate data source in the future. This means we shouldn't add new code in places like `HiveClientImpl`. These new DDL commands should be implemented on top of interfaces like `CatalogTable`.

One apparent problem with this approach is that the current Spark SQL interfaces don't capture all of Hive's semantics. For example, some table metadata, like the skew spec, is not covered in `CatalogTable` yet. Our general strategies are:

1. For easy ones, like "owner" and "compressed" in #12844, we may just add them to the interface and leverage them.
2. For features that are not supported in Spark SQL, for example the skew spec, we can simply ignore them for now, since Spark can't handle them anyway.

There will be a follow-up of #12781 to add support for Hive tables. After offline discussion with @yhuai, we decided to add a flag in `CatalogTable` to indicate whether there is unrecognized metadata provided by the underlying external catalog that was not translated and included in `CatalogTable`. When `SHOW CREATE TABLE` is applied to a table containing such metadata, this flag will be set to true, and we can simply refuse to output anything by checking it. This makes sense because even if you add things like the skew spec to the result of `SHOW CREATE TABLE`, Spark can't handle the generated DDL statement.
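The flag-based refusal described in this comment could look roughly like the sketch below. It is a toy Python model, not Spark's code: the `CatalogTable` dataclass, its `has_unrecognized_metadata` field, and the `show_create_table` function are hypothetical names chosen for illustration, since the thread does not say what the flag is called or how the check is wired in.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CatalogTable:
    """Toy stand-in for a catalog entry; all field names are illustrative."""
    name: str
    columns: List[Tuple[str, str]]  # (column name, column type)
    provider: str = "parquet"
    options: Dict[str, str] = field(default_factory=dict)
    # Hypothetical flag: True when the external catalog holds metadata
    # (for example a skew spec) that was not translated into this object.
    has_unrecognized_metadata: bool = False


def show_create_table(table: CatalogTable) -> str:
    """Return a CREATE TABLE statement, or refuse if metadata was dropped."""
    if table.has_unrecognized_metadata:
        # Refuse rather than emit DDL that silently loses information.
        raise ValueError(
            f"Cannot generate DDL for {table.name}: the table has metadata "
            "that is not represented in the catalog entry"
        )
    cols = ", ".join(f"{name} {dtype}" for name, dtype in table.columns)
    ddl = f"CREATE TABLE {table.name} ({cols}) USING {table.provider}"
    if table.options:
        opts = ", ".join(f"{k} '{v}'" for k, v in sorted(table.options.items()))
        ddl += f" OPTIONS ({opts})"
    return ddl
```

Raising an error instead of emitting partial DDL mirrors the reasoning in the comment: a statement that omits the unsupported metadata would not recreate the same table.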
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user xwu0226 commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-218238618 @srowen Yes, for data source tables. This PR also includes the work for Hive-syntax DDL. I see that #12781 mentions there will be a follow-up PR taking care of the Hive-syntax DDL, so I am wondering whether I should continue with this PR. I can close this one if there is no need. Thanks!
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-218192270 @xwu0226 I think this is superseded by https://github.com/apache/spark/pull/12781 ?
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-216763938 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user xwu0226 commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-215303758 @yhuai @liancheng, I see that PR [#12734](https://github.com/apache/spark/pull/12734) takes care of the PARTITIONED BY and CLUSTERED BY (with SORTED BY) clauses for the CTAS syntax, but not for the non-CTAS syntax. I now need to change my PR to adapt to this change, which means that the generated DDL will be something like `create table t1 (c1 int, ...) using .. options (..) partitioned by (..) clustered by (...) sorted by (...) in ... buckets`. There won't be a "select clause" following it, since we do not have the original query, and such a generated statement will not run because [#12734](https://github.com/apache/spark/pull/12734) does not support it. Can we add a fake select clause with a warning message? Also, the `DataFrameWriter.saveAsTable` case is like CTAS. Can we then generate the DDL in regular CTAS syntax? This will change my current implementation in this PR. Please advise, thanks a lot!
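To make the clause order in that comment concrete, here is a minimal Python sketch of a renderer for the non-CTAS statement. The function `render_datasource_ddl` and its parameters are hypothetical and are not taken from the PR. The sketch omits the `OPTIONS (...)` clause for brevity and writes the bucket clause as `INTO n BUCKETS`, the spelling Spark's `CREATE TABLE` syntax uses, rather than the "in ... buckets" shorthand from the comment.

```python
from typing import List, Optional, Sequence, Tuple


def render_datasource_ddl(
    table: str,
    columns: Sequence[Tuple[str, str]],
    provider: str,
    partition_cols: Sequence[str] = (),
    bucket_cols: Sequence[str] = (),
    sort_cols: Sequence[str] = (),
    num_buckets: Optional[int] = None,
) -> str:
    """Render a non-CTAS CREATE TABLE ... USING statement (illustrative only)."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    parts: List[str] = [f"CREATE TABLE {table} ({cols})", f"USING {provider}"]
    if partition_cols:
        parts.append(f"PARTITIONED BY ({', '.join(partition_cols)})")
    if bucket_cols:
        # Bucketing needs a bucket count; sorting is only meaningful with it.
        if num_buckets is None:
            raise ValueError("num_buckets is required when bucket_cols is set")
        parts.append(f"CLUSTERED BY ({', '.join(bucket_cols)})")
        if sort_cols:
            parts.append(f"SORTED BY ({', '.join(sort_cols)})")
        parts.append(f"INTO {num_buckets} BUCKETS")
    return " ".join(parts)
```

Note that the output contains no trailing `SELECT`, which is exactly the gap the comment raises: at the time of this thread, #12734 accepted these clauses only in the CTAS form.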
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-214553367 retest this please
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user xwu0226 commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-214472079 @liancheng Thanks for triggering the test! I am looking into the test failure.
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-214466178 Build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-214465973 **[Test build #56899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56899/consoleFull)** for PR 12579 at commit [`13e9775`](https://github.com/apache/spark/commit/13e9775604f3365683bf2b0f3b35b80a30f05dd4). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-214466180 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56899/ Test FAILed.
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-214420652 **[Test build #56899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56899/consoleFull)** for PR 12579 at commit [`13e9775`](https://github.com/apache/spark/commit/13e9775604f3365683bf2b0f3b35b80a30f05dd4).
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-214419087 test this please
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user xwu0226 commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-213126393 @yhuai @andrewor14 Thanks!
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12579#issuecomment-213032974 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-14346][SQL] Show Create Table (Native)
GitHub user xwu0226 opened a pull request: https://github.com/apache/spark/pull/12579

[SPARK-14346][SQL] Show Create Table (Native)

This is a rebased version of [#12132](https://github.com/apache/spark/pull/12132) and [#12406](https://github.com/apache/spark/pull/12406)

## What changes were proposed in this pull request?

Allow users to issue the "`SHOW CREATE TABLE`" command natively in SparkSQL.

-- For tables that are created by Hive, this command will display the DDL in Hive syntax. If the syntax includes a `CLUSTERED BY`, `SKEWED BY` or `STORED BY` clause, there will be a warning message saying that this DDL is not supported in SparkSQL native DDL yet.
-- For tables that are created by datasource DDL, such as "`CREATE TABLE... USING ... OPTIONS (...)`", it will show the DDL in this syntax.
-- For tables that are created by the dataframe API, such as "`df.write.partitionBy(...).saveAsTable(...)`", the command currently displays DDL with the syntax "`CREATE TABLE.. USING...OPTIONS(...)`". However, this syntax loses the partitioning information. It is proposed to display the create table statement in the dataframe API format instead.

## How was this patch tested?

Unit tests are created.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xwu0226/spark show_create_table_3

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12579.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12579

commit 0ebb0142e13db3ce8fb474ee5682528b0f87d2d2 Author: xin Wu Date: 2016-04-02T01:46:16Z show create table DDL -- hive metastore table
commit 6d060be797d4127f0b86fa59c1bc848d75215533 Author: xin Wu Date: 2016-04-02T06:01:46Z update upon review
commit 2799672162d715b209cad9a5c103d6f09692d8dc Author: xin Wu Date: 2016-04-02T18:19:26Z ignoring sqlContext temp table and considering datasource table ddl
commit 98c020aa9a5374861d1470fa0c305148e8314ada Author: xin Wu Date: 2016-04-04T21:54:32Z fix scala style issue
commit efd889821bf84e328ef6dd8d0b6a645729248251 Author: xin Wu Date: 2016-04-04T22:40:26Z fix scala style issue in testcase
commit b370630f5827071bc5076e9b3fa9c92720b27eb2 Author: xin Wu Date: 2016-04-05T01:31:46Z fix testcase for test failure
commit 8cb7a7299df84f2608b91b092a7df6795b85d41e Author: xin Wu Date: 2016-04-06T18:12:07Z continue the database ddl generation
commit 8b67d22c5ed8fd6b309df772e4a372e741acf630 Author: xin Wu Date: 2016-04-08T20:57:12Z support datasource ddl
commit 9ab863fb7f8127d1acd083b1ba857f5c1fd2769c Author: xin Wu Date: 2016-04-08T22:04:05Z scala style fix
commit a40273c7989bebdf62b93ce6e604bb14cacce100 Author: xin Wu Date: 2016-04-13T22:54:16Z merge the code committed by CREATE TABLE native support
commit d214a3b0c54641a6234ba39eef82b2b8ac4c87dd Author: xin Wu Date: 2016-04-14T23:49:03Z rework show create ddl based on new native supported create table DDL work
commit 1680ea0403f0d29185d9a3f8f81d15599be81aac Author: xin Wu Date: 2016-04-14T23:51:03Z Merge branch 'show_create_table_1' into show_create_table_2
commit fa8373c3fd2d27cf2b3356ee0214c8e04dfc0f36 Author: xin Wu Date: 2016-04-15T02:03:41Z remove spaces
commit 5095b6c871de55e871c5ea606ade6ab0b2166627 Author: xin Wu Date: 2016-04-15T16:24:53Z update upon review - use visitTableIdentifier
commit 15f226c7d4f195947cbb1acc341eaaae4072d4a6 Author: xin Wu Date: 2016-04-20T18:28:29Z generate dataframe API create table for some datasource tables
commit 601867ae71cc370770deddd56cc8883b04dcf8ee Author: xin Wu Date: 2016-04-20T18:31:27Z synch up with master branch
commit 687f7aca56cf5c032ceac09c341b2dfd00129b8e Author: xin Wu Date: 2016-04-20T21:54:51Z update upon review
commit bf3512ba01e773a514350030cfa91087de10fc03 Author: xin Wu Date: 2016-04-20T22:07:55Z synch up with latest change
commit ca44d67584f358bd588743d33de2b7d689df584d Author: xin Wu Date: 2016-04-21T05:35:04Z synch up again
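The third case in the description, displaying a table created through the dataframe API in dataframe API form so that partitioning is not lost, might be sketched as below. This is only an illustration in Python: `render_save_as_table` is a hypothetical helper that is not part of the PR, and the `df` in its output is a placeholder for whatever DataFrame the user originally wrote.

```python
from typing import Sequence


def render_save_as_table(
    table: str,
    provider: str,
    partition_cols: Sequence[str] = (),
) -> str:
    """Render a DataFrameWriter call chain that would recreate the table.

    Hypothetical helper for illustration; `df` in the output is a placeholder.
    """
    call = f'df.write.format("{provider}")'
    if partition_cols:
        # Keep the partitioning information that a plain
        # CREATE TABLE ... USING ... OPTIONS (...) rendering would drop.
        args = ", ".join(f'"{col}"' for col in partition_cols)
        call += f".partitionBy({args})"
    return call + f'.saveAsTable("{table}")'
```

For a table partitioned by `year` and `month`, this yields `df.write.format("parquet").partitionBy("year", "month").saveAsTable("t1")`, which keeps the partition columns visible to the reader.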