[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11573 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-195597743 OK. Let's merge this first (to avoid conflicts caused by other commits). We can address the comments in a follow-up PR.
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55902999 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---

@@ -64,10 +83,86 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
         val tableIdent = extractTableIdent(nameParts)
         RefreshTable(tableIdent)

+      // CREATE DATABASE [IF NOT EXISTS] database_name [COMMENT database_comment]
+      // [LOCATION path] [WITH DBPROPERTIES (key1=val1, key2=val2, ...)];
+      case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: args) =>
+        val Seq(ifNotExists, dbLocation, databaseComment, dbprops) = getClauses(Seq(
+          "TOK_IFNOTEXISTS",
+          "TOK_DATABASELOCATION",
+          "TOK_DATABASECOMMENT",
+          "TOK_DATABASEPROPERTIES"), args)
+        val location = dbLocation.map {
+          case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val comment = databaseComment.map {
+          case Token("TOK_DATABASECOMMENT", Token(com, Nil) :: Nil) => unquoteString(com)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val props = dbprops.toSeq.flatMap {
+          case Token("TOK_DATABASEPROPERTIES", Token("TOK_DBPROPLIST", propList) :: Nil) =>
+            extractProps(propList, "TOK_TABLEPROPERTY")
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }.toMap
+        CreateDatabase(databaseName, ifNotExists.isDefined, location, comment, props)(node.source)
+
+      // CREATE [TEMPORARY] FUNCTION [db_name.]function_name AS class_name
+      // [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
+      case Token("TOK_CREATEFUNCTION", args) =>
+        // Example format:
+        //
+        //   TOK_CREATEFUNCTION
+        //   :- db_name
+        //   :- func_name
+        //   :- alias
+        //   +- TOK_RESOURCE_LIST
+        //      :- TOK_RESOURCE_URI
+        //      :  :- TOK_JAR
+        //      :  +- '/path/to/jar'
+        //      +- TOK_RESOURCE_URI
+        //         :- TOK_FILE
+        //         +- 'path/to/file'
+        val (funcNameArgs, otherArgs) = args.partition {
+          case Token("TOK_RESOURCE_LIST", _) => false
+          case Token("TOK_TEMPORARY", _) => false
+          case Token(_, Nil) => true
+          case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+        }
+        // If database name is specified, there are 3 tokens, otherwise 2.
+        val (funcName, alias) = funcNameArgs match {
+          case Token(dbName, Nil) :: Token(fname, Nil) :: Token(aname, Nil) :: Nil =>
+            (unquoteString(dbName) + "." + unquoteString(fname), unquoteString(aname))
+          case Token(fname, Nil) :: Token(aname, Nil) :: Nil =>
+            (unquoteString(fname), unquoteString(aname))
+          case _ =>
+            parseFailed("Invalid CREATE FUNCTION command", node)
+        }
+        // Extract other keywords, if they exist
+        val Seq(rList, temp) = getClauses(Seq("TOK_RESOURCE_LIST", "TOK_TEMPORARY"), otherArgs)
+        val resourcesMap = rList.toSeq.flatMap {
+          case Token("TOK_RESOURCE_LIST", resources) =>
+            resources.map {
+              case Token("TOK_RESOURCE_URI", rType :: Token(rPath, Nil) :: Nil) =>
+                val resourceType = rType match {
+                  case Token("TOK_JAR", Nil) => "jar"
+                  case Token("TOK_FILE", Nil) => "file"
+                  case Token("TOK_ARCHIVE", Nil) => "archive"
+                  case Token(f, _) => parseFailed(s"Unexpected resource format '$f'", node)
+                }
+                (resourceType, unquoteString(rPath))
+              case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+            }
+          case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+        }.toMap
+        CreateFunction(funcName, alias, resourcesMap, temp.isDefined)(node.source)

--- End diff -- Let's also have a test for this case.
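The token-count rule in the hunk above (three name tokens when a database qualifier is present, two otherwise) can be sketched in isolation. `parseFuncName` below is a hypothetical stand-in that works on plain strings rather than Spark's `ASTNode`:

```scala
// Hypothetical stand-in for the name-resolution step above: with a database
// qualifier there are three name tokens (db, function, class alias),
// otherwise two (function, class alias).
def parseFuncName(tokens: List[String]): (String, String) = tokens match {
  case db :: fname :: alias :: Nil => (db + "." + fname, alias)
  case fname :: alias :: Nil => (fname, alias)
  case _ => sys.error("Invalid CREATE FUNCTION command")
}

// CREATE FUNCTION mydb.myfunc AS 'com.example.MyUDF'
assert(parseFuncName(List("mydb", "myfunc", "com.example.MyUDF")) ==
  ("mydb.myfunc", "com.example.MyUDF"))
// CREATE FUNCTION myfunc AS 'com.example.MyUDF'
assert(parseFuncName(List("myfunc", "com.example.MyUDF")) ==
  ("myfunc", "com.example.MyUDF"))
```

A test along these lines (one case with a database qualifier, one without) would cover the reviewer's request.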
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-195592073 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52937/
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-195592071 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-195591633 **[Test build #52937 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52937/consoleFull)** for PR 11573 at commit [`bd91b0f`](https://github.com/apache/spark/commit/bd91b0f3c850da796b22c7e575658625d2486057). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-195570543 I have left a few comments. It is a good starting point. Thank you for working on this!
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55893285 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---

@@ -29,7 +29,26 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
   import ParserUtils._

   /** Check if a command should not be explained. */
-  protected def isNoExplainCommand(command: String): Boolean = "TOK_DESCTABLE" == command
+  protected def isNoExplainCommand(command: String): Boolean = {
+    "TOK_DESCTABLE" == command || "TOK_ALTERTABLE" == command
+  }
+
+  /**
+   * For each node, extract properties in the form of a list ['key1', 'key2', 'key3', 'value']
+   * into a pair (key1.key2.key3, value).
+   */

--- End diff -- Also, what is the difference between this method and `extractTableProps` defined for alter table?
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55892958 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---

@@ -0,0 +1,428 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.catalog.ExternalCatalog.TablePartitionSpec
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending, SortDirection}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(node: ASTNode): LogicalPlan = {
+    node.children match {
+      case (tabName @ Token("TOK_TABNAME", _)) :: otherNodes =>
+        val tableIdent = extractTableIdent(tabName)
+        val partSpec = getClauseOption("TOK_PARTSPEC", node.children).map(parsePartitionSpec)
+        matchAlterTableCommands(node, otherNodes, tableIdent, partSpec)
+      case _ =>
+        parseFailed("Could not parse ALTER TABLE command", node)
+    }
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+    cleanIdentifier(unquoteString(s))
+  }
+
+  /**
+   * Extract partition spec from the given [[ASTNode]] as a map, assuming it exists.
+   *
+   * Expected format:
+   *   +- TOK_PARTSPEC
+   *      :- TOK_PARTVAL
+   *      :  :- dt
+   *      :  +- '2008-08-08'
+   *      +- TOK_PARTVAL
+   *         :- country
+   *         +- 'us'
+   */
+  private def parsePartitionSpec(node: ASTNode): Map[String, String] = {
+    node match {
+      case Token("TOK_PARTSPEC", partitions) =>
+        partitions.map {
+          // Note: sometimes there's a "=", "<" or ">" between the key and the value

--- End diff -- Maybe it would be good to provide a concrete example? Is it caused by what a user types?
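The expected AST shape in that comment can be exercised with a self-contained model. This `Token` class is a hypothetical stand-in for Spark's `ASTNode`, used only to illustrate how `PARTITION (dt='2008-08-08', country='us')` becomes a `Map[String, String]`:

```scala
// Self-contained model of the partition-spec AST shape described above;
// Token here is a hypothetical stand-in for Spark's ASTNode.
case class Token(text: String, children: List[Token])

def parsePartitionSpec(node: Token): Map[String, String] = node match {
  case Token("TOK_PARTSPEC", partitions) =>
    partitions.map {
      case Token("TOK_PARTVAL", key :: value :: Nil) =>
        // Strip the surrounding single quotes from the value token.
        key.text -> value.text.stripPrefix("'").stripSuffix("'")
      case other => sys.error(s"Unexpected partition token: $other")
    }.toMap
  case _ => sys.error("Not a partition spec")
}

// PARTITION (dt='2008-08-08', country='us')
val spec = Token("TOK_PARTSPEC", List(
  Token("TOK_PARTVAL", List(Token("dt", Nil), Token("'2008-08-08'", Nil))),
  Token("TOK_PARTVAL", List(Token("country", Nil), Token("'us'", Nil)))))
assert(parsePartitionSpec(spec) == Map("dt" -> "2008-08-08", "country" -> "us"))
```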
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55892525 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---

@@ -29,7 +29,26 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
   import ParserUtils._

   /** Check if a command should not be explained. */
-  protected def isNoExplainCommand(command: String): Boolean = "TOK_DESCTABLE" == command
+  protected def isNoExplainCommand(command: String): Boolean = {
+    "TOK_DESCTABLE" == command || "TOK_ALTERTABLE" == command
+  }
+
+  /**
+   * For each node, extract properties in the form of a list ['key1', 'key2', 'key3', 'value']
+   * into a pair (key1.key2.key3, value).
+   */

--- End diff -- Maybe it is good to provide a concrete example as well?
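A concrete example of the documented behavior can be sketched without the Spark internals. `extractProp` below is a hypothetical helper, not the method under review: it takes the already-flattened list of strings, joins everything but the last element into a dotted key path, and pairs it with the last element as the value:

```scala
// Hedged sketch of the documented behavior: a property node flattened to
// ['key1', 'key2', 'key3', 'value'] becomes the pair ("key1.key2.key3", "value").
// extractProp is a hypothetical helper, not Spark's extractProps itself.
def extractProp(parts: List[String]): (String, String) =
  (parts.init.mkString("."), parts.last)

assert(extractProp(List("key1", "key2", "key3", "value")) == ("key1.key2.key3", "value"))
// A single-segment key works the same way.
assert(extractProp(List("mapred.reduce.tasks", "10")) == ("mapred.reduce.tasks", "10"))
```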
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55889875 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---

@@ -64,10 +83,86 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
         val tableIdent = extractTableIdent(nameParts)
         RefreshTable(tableIdent)

+      // CREATE DATABASE [IF NOT EXISTS] database_name [COMMENT database_comment]
+      // [LOCATION path] [WITH DBPROPERTIES (key1=val1, key2=val2, ...)];
+      case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: args) =>
+        val Seq(ifNotExists, dbLocation, databaseComment, dbprops) = getClauses(Seq(
+          "TOK_IFNOTEXISTS",
+          "TOK_DATABASELOCATION",
+          "TOK_DATABASECOMMENT",
+          "TOK_DATABASEPROPERTIES"), args)
+        val location = dbLocation.map {
+          case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val comment = databaseComment.map {
+          case Token("TOK_DATABASECOMMENT", Token(com, Nil) :: Nil) => unquoteString(com)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val props = dbprops.toSeq.flatMap {
+          case Token("TOK_DATABASEPROPERTIES", Token("TOK_DBPROPLIST", propList) :: Nil) =>
+            extractProps(propList, "TOK_TABLEPROPERTY")

--- End diff -- Maybe it would also be good to provide an example format like what we have for `TOK_CREATEFUNCTION`?
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55888485 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---

@@ -64,10 +83,86 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
         val tableIdent = extractTableIdent(nameParts)
         RefreshTable(tableIdent)

+      // CREATE DATABASE [IF NOT EXISTS] database_name [COMMENT database_comment]
+      // [LOCATION path] [WITH DBPROPERTIES (key1=val1, key2=val2, ...)];
+      case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: args) =>
+        val Seq(ifNotExists, dbLocation, databaseComment, dbprops) = getClauses(Seq(
+          "TOK_IFNOTEXISTS",
+          "TOK_DATABASELOCATION",
+          "TOK_DATABASECOMMENT",
+          "TOK_DATABASEPROPERTIES"), args)
+        val location = dbLocation.map {
+          case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val comment = databaseComment.map {
+          case Token("TOK_DATABASECOMMENT", Token(com, Nil) :: Nil) => unquoteString(com)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val props = dbprops.toSeq.flatMap {
+          case Token("TOK_DATABASEPROPERTIES", Token("TOK_DBPROPLIST", propList) :: Nil) =>
+            extractProps(propList, "TOK_TABLEPROPERTY")
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }.toMap
+        CreateDatabase(databaseName, ifNotExists.isDefined, location, comment, props)(node.source)
+
+      // CREATE [TEMPORARY] FUNCTION [db_name.]function_name AS class_name
+      // [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
+      case Token("TOK_CREATEFUNCTION", args) =>
+        // Example format:
+        //
+        //   TOK_CREATEFUNCTION
+        //   :- db_name
+        //   :- func_name
+        //   :- alias
+        //   +- TOK_RESOURCE_LIST
+        //      :- TOK_RESOURCE_URI
+        //      :  :- TOK_JAR
+        //      :  +- '/path/to/jar'
+        //      +- TOK_RESOURCE_URI
+        //         :- TOK_FILE
+        //         +- 'path/to/file'
+        val (funcNameArgs, otherArgs) = args.partition {
+          case Token("TOK_RESOURCE_LIST", _) => false
+          case Token("TOK_TEMPORARY", _) => false
+          case Token(_, Nil) => true
+          case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+        }
+        // If database name is specified, there are 3 tokens, otherwise 2.
+        val (funcName, alias) = funcNameArgs match {
+          case Token(dbName, Nil) :: Token(fname, Nil) :: Token(aname, Nil) :: Nil =>
+            (unquoteString(dbName) + "." + unquoteString(fname), unquoteString(aname))
+          case Token(fname, Nil) :: Token(aname, Nil) :: Nil =>
+            (unquoteString(fname), unquoteString(aname))
+          case _ =>
+            parseFailed("Invalid CREATE FUNCTION command", node)
+        }
+        // Extract other keywords, if they exist
+        val Seq(rList, temp) = getClauses(Seq("TOK_RESOURCE_LIST", "TOK_TEMPORARY"), otherArgs)
+        val resourcesMap = rList.toSeq.flatMap {
+          case Token("TOK_RESOURCE_LIST", resources) =>
+            resources.map {
+              case Token("TOK_RESOURCE_URI", rType :: Token(rPath, Nil) :: Nil) =>
+                val resourceType = rType match {
+                  case Token("TOK_JAR", Nil) => "jar"
+                  case Token("TOK_FILE", Nil) => "file"
+                  case Token("TOK_ARCHIVE", Nil) => "archive"
+                  case Token(f, _) => parseFailed(s"Unexpected resource format '$f'", node)
+                }
+                (resourceType, unquoteString(rPath))
+              case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+            }
+          case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+        }.toMap
+        CreateFunction(funcName, alias, resourcesMap, temp.isDefined)(node.source)

--- End diff -- What will this map look like if I have `USING JAR 'jar1', JAR 'jar2', ...`?
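The reviewer's question is pointed: the flattened `(resourceType, path)` pairs go through `.toMap`, and Scala's `toMap` keeps only the last entry per key. A minimal sketch of that behavior (the literal paths are illustrative):

```scala
// Sketch of the .toMap behavior behind the question above: with
// USING JAR 'jar1', JAR 'jar2' the flattened pairs share the key "jar",
// so only the last entry per key survives -- 'jar1' is silently dropped.
val resources = Seq("jar" -> "jar1", "jar" -> "jar2")
val resourcesMap = resources.toMap
assert(resourcesMap == Map("jar" -> "jar2"))
```

So as written, a function registered with multiple jars would keep only one of them; a `Seq[(String, String)]` (or a multimap) would preserve all resources.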
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-195520611 **[Test build #52937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52937/consoleFull)** for PR 11573 at commit [`bd91b0f`](https://github.com/apache/spark/commit/bd91b0f3c850da796b22c7e575658625d2486057).
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55867164 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala ---

@@ -24,12 +27,15 @@ package org.apache.spark.sql.execution.datasources
  *
  * @param numBuckets number of buckets.
  * @param bucketColumnNames the names of the columns that used to generate the bucket id.
- * @param sortColumnNames the names of the columns that used to sort data in each bucket.
+ * @param sortColumns (name, sort direction) of columns that used to sort data in each bucket.
  */
 private[sql] case class BucketSpec(
     numBuckets: Int,
     bucketColumnNames: Seq[String],
-    sortColumnNames: Seq[String])
+    sortColumns: Seq[(String, SortDirection)]) {

--- End diff -- It is great to have `SortDirection` defined explicitly, so when we create it we know what the ordering will be. However, this change implies that we can use `descending` as the order, which is actually not allowed, because `UnsafeKVExternalSorter` only sorts keys in ascending order (this sorter is used in `DynamicPartitionWriterContainer` when we generate bucketed files). So I think for now it is better to revert this change. In the future, we can revisit it when we store the sort direction in the metastore and can actually sort rows in a bucket file in descending order.
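If the `sortColumns` field were kept, the constraint the reviewer describes could at least be made explicit. The sketch below is hypothetical (it mirrors the names in the diff but is not Spark's actual class): it rejects descending order at construction time, since the bucketed-output writer can only sort ascending:

```scala
// Hypothetical sketch of the reviewer's point: if BucketSpec carries a sort
// direction, descending must be rejected up front, because the writer path
// (UnsafeKVExternalSorter) only sorts keys in ascending order.
sealed trait SortDirection
case object Ascending extends SortDirection
case object Descending extends SortDirection

case class BucketSpec(
    numBuckets: Int,
    bucketColumnNames: Seq[String],
    sortColumns: Seq[(String, SortDirection)]) {
  // Fail fast instead of silently writing wrongly-ordered bucket files.
  require(sortColumns.forall(_._2 == Ascending),
    "descending sort order in bucketed output is not supported")
}

// Ascending is accepted; descending fails at construction time.
val ok = BucketSpec(4, Seq("id"), Seq("name" -> Ascending))
val rejected =
  try { BucketSpec(4, Seq("id"), Seq("name" -> Descending)); false }
  catch { case _: IllegalArgumentException => true }
assert(rejected)
```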
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194613303 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52770/
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194613298 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194612768 **[Test build #52770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52770/consoleFull)** for PR 11573 at commit [`37854fc`](https://github.com/apache/spark/commit/37854fccec3af29f15acfdd4693a532b6f862469). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194600792 Also cc @rxin @yhuai.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194590031 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194590032 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52773/
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194589783 **[Test build #52773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52773/consoleFull)** for PR 11573 at commit [`6ad8dd5`](https://github.com/apache/spark/commit/6ad8dd5f5294fb7c9424651920706645d1dba8f9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194555700 **[Test build #52773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52773/consoleFull)** for PR 11573 at commit [`6ad8dd5`](https://github.com/apache/spark/commit/6ad8dd5f5294fb7c9424651920706645d1dba8f9).
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194550969 **[Test build #52770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52770/consoleFull)** for PR 11573 at commit [`37854fc`](https://github.com/apache/spark/commit/37854fccec3af29f15acfdd4693a532b6f862469).
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194549041 @hvanhovell I believe as of the latest commit I've addressed all of your comments. This is ready from my side so I've removed the WIP tag. PTAL.
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194189126 @hvanhovell I've addressed most of your comments, except a couple where I said I would fix them later. Separately, I've also significantly cleaned up the logic and signatures in the alter table code, so you should have a look at that as well. It's probably quite different from what it looked like when you last reviewed this patch.
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55487582 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala --- @@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly val tableIdent = extractTableIdent(nameParts) RefreshTable(tableIdent) + case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: createDatabaseArgs) => +val Seq( + allowExisting, + dbLocation, + databaseComment, + dbprops) = getClauses(Seq( + "TOK_IFNOTEXISTS", + "TOK_DATABASELOCATION", + "TOK_DATABASECOMMENT", + "TOK_DATABASEPROPERTIES"), createDatabaseArgs) +val location = dbLocation.map { + case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc) +} +val comment = databaseComment.map { + case Token("TOK_DATABASECOMMENT", Token(comment, Nil) :: Nil) => unquoteString(comment) +} +val props: Map[String, String] = dbprops.toSeq.flatMap { + case Token("TOK_DATABASEPROPERTIES", propList) => +propList.flatMap(extractProps(_, "TOK_DBPROPLIST", "TOK_TABLEPROPERTY")) +}.toMap +CreateDatabase(databaseName, allowExisting.isDefined, location, comment, props)(node.source) + + case Token("TOK_CREATEFUNCTION", func :: as :: createFuncArgs) => --- End diff -- will fix later
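The clause-extraction pattern discussed in this diff can be sketched in isolation. The following is a simplified, self-contained model: `Ast`, `getClauses`, and `unquoteString` here are hypothetical stand-ins for Catalyst's `ASTNode` and `ParserUtils` helpers, not the real Spark APIs.

```scala
// Minimal stand-in for Catalyst's ASTNode (hypothetical, for illustration only).
case class Ast(name: String, children: List[Ast] = Nil)

object ClauseDemo {
  // For each requested clause name, return the matching child node, if present.
  // This mirrors the shape of the getClauses call in the CREATE DATABASE case:
  // optional clauses may appear in any order, and absent ones come back as None.
  def getClauses(names: Seq[String], args: Seq[Ast]): Seq[Option[Ast]] =
    names.map(name => args.find(_.name == name))

  // Strip single quotes from a quoted literal, as unquoteString does for
  // LOCATION paths and COMMENT strings.
  def unquoteString(s: String): String = s.stripPrefix("'").stripSuffix("'")
}
```

With this model, a `CREATE DATABASE ... IF NOT EXISTS ... COMMENT 'a comment'` node destructures the same way as in the quoted parser: `val Seq(ifNotExists, dbLocation, comment) = ClauseDemo.getClauses(Seq("TOK_IFNOTEXISTS", "TOK_DATABASELOCATION", "TOK_DATABASECOMMENT"), args)`, with the missing LOCATION clause coming back as `None`.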
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55487513 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala --- @@ -0,0 +1,410 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command + +import scala.collection.mutable.ArrayBuffer + +import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending} +import org.apache.spark.sql.catalyst.parser._ +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.execution.datasources._ +import org.apache.spark.sql.types.StructType + + +/** + * Helper object to parse alter table commands. + */ +object AlterTableCommandParser { + import ParserUtils._ + + /** + * Parse the given node assuming it is an alter table command. 
+ */ + def parse(v1: ASTNode): LogicalPlan = { +v1.children match { + case (tabName @ Token("TOK_TABNAME", _)) :: restNodes => +val tableIdent: TableIdentifier = extractTableIdent(tabName) +val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children) +val partition = partitionSpec.flatMap(parsePartitionSpec) +matchAlterTableCommands(v1, restNodes, tableIdent, partition) + case _ => +throw new AnalysisException(s"Could not parse alter table command: '${v1.text}'") +} + } + + private def cleanAndUnquoteString(s: String): String = { +cleanIdentifier(unquoteString(s)) + } + + private def parsePartitionSpec(node: ASTNode): Option[Map[String, Option[String]]] = { +node match { + case Token("TOK_PARTSPEC", partitions) => +val spec = partitions.map { + case Token("TOK_PARTVAL", ident :: constant :: Nil) => +(cleanAndUnquoteString(ident.text), Some(cleanAndUnquoteString(constant.text))) + case Token("TOK_PARTVAL", ident :: Nil) => +(cleanAndUnquoteString(ident.text), None) +}.toMap +Some(spec) + case _ => None +} + } + + private def extractTableProps(node: ASTNode): Map[String, Option[String]] = { +node match { + case Token("TOK_TABLEPROPERTIES", propsList) => +propsList.flatMap { + case Token("TOK_TABLEPROPLIST", props) => +props.map { + case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", Nil) :: Nil) => +val k = cleanAndUnquoteString(key.text) +(k, None) + case Token("TOK_TABLEPROPERTY", key :: value :: Nil) => +val k = cleanAndUnquoteString(key.text) +val v = cleanAndUnquoteString(value.text) +(k, Some(v)) +} +}.toMap + case _ => +throw new AnalysisException( + s"Expected table properties in alter table command: '${node.text}'") +} + } + + // TODO: This method is massive. Break it down. Also, add some comments... 
+ private def matchAlterTableCommands( + node: ASTNode, + nodes: Seq[ASTNode], + tableIdent: TableIdentifier, + partition: Option[Map[String, Option[String]]]): LogicalPlan = { +nodes match { + case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ => +val renamedTable = getClause("TOK_TABNAME", renameArgs) +val renamedTableIdent: TableIdentifier = extractTableIdent(renamedTable) +AlterTableRename(tableIdent, renamedTableIdent)(node.source) + + case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ => +val setTableProperties = extractTableProps(args.head) +AlterTableSetProperties( + tableIdent, +
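The partition-spec handling in the quoted `AlterTableCommandParser` can be illustrated with a small self-contained sketch. `Tok` is a hypothetical substitute for Catalyst's `ASTNode`; the match shape follows `parsePartitionSpec` from the diff above, under the assumption that partition values are single-quoted literals.

```scala
// Hypothetical stand-in for Catalyst's ASTNode.
case class Tok(text: String, children: List[Tok] = Nil)

object PartitionSpecDemo {
  private def unquote(s: String): String = s.stripPrefix("'").stripSuffix("'")

  // TOK_PARTSPEC(TOK_PARTVAL(ident [, constant])*) becomes a map from
  // partition column to optional value, as in parsePartitionSpec:
  // PARTITION (ds='2016-03-11', hr) -> Map("ds" -> Some("2016-03-11"), "hr" -> None)
  def parse(node: Tok): Option[Map[String, Option[String]]] = node match {
    case Tok("TOK_PARTSPEC", partitions) =>
      Some(partitions.map {
        case Tok("TOK_PARTVAL", ident :: constant :: Nil) =>
          (unquote(ident.text), Some(unquote(constant.text)))
        case Tok("TOK_PARTVAL", ident :: Nil) =>
          (unquote(ident.text), None)
      }.toMap)
    case _ => None // not a partition spec; caller falls back to flatMap(None)
  }
}
```

Returning `Option` lets the caller chain it through `partitionSpec.flatMap(...)` exactly as `parse` does in the quoted code, so a missing `TOK_PARTSPEC` clause simply yields `None` rather than an error.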
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55487572 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala --- @@ -29,7 +32,8 @@ package org.apache.spark.sql.execution.datasources private[sql] case class BucketSpec( numBuckets: Int, bucketColumnNames: Seq[String], -sortColumnNames: Seq[String]) +sortColumnNames: Seq[String], +sortDirections: Seq[SortDirection] = Nil) --- End diff -- will fix later
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194188020 **[Test build #52739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52739/consoleFull)** for PR 11573 at commit [`6538829`](https://github.com/apache/spark/commit/6538829fea21463826970997cf05eaedfcadb100).
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55487318 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala --- (diff context identical to the quote above)
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55487276 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala --- + case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ => --- End diff -- @viirya can you answer this?
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-194186989 **[Test build #52737 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52737/consoleFull)** for PR 11573 at commit [`f54501f`](https://github.com/apache/spark/commit/f54501f6c244c8e37b332991fd22617ec1ef1e76).
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55486952 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala --- + allowExisting, --- End diff -- I've renamed it.
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55486925 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala --- + private def cleanAndUnquoteString(s: String): String = { +cleanIdentifier(unquoteString(s)) --- End diff -- if you look at `ParserUtils` that's done in `cleanIdentifier`
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55352952 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala --- + allowExisting, --- End diff -- When I see a parameter named `allowExisting`, I read it as saying that the existence of the current table is allowed, and consequently that this create table command is going to overwrite the existing table. Why not name it `ifNotExists`?
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55337949 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- +/** + * A DDL command expected to be run in the underlying system without Spark parsing the --- End diff -- By this point the query has already been parsed; it should read something like "...without Spark executing the query..."
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55337179 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala --- @@ -29,7 +32,8 @@ package org.apache.spark.sql.execution.datasources private[sql] case class BucketSpec( numBuckets: Int, bucketColumnNames: Seq[String], -sortColumnNames: Seq[String]) +sortColumnNames: Seq[String], +sortDirections: Seq[SortDirection] = Nil) --- End diff -- I still think sortColumnNames and sortDirections should be combined into a seq of pairs.
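The suggestion in this comment — folding the two parallel sequences into one sequence of pairs — could look like the sketch below. This is an illustration of the proposed shape only, not the actual Spark class; `SortDirection` is modeled locally rather than imported from Catalyst.

```scala
// Local model of Catalyst's SortDirection, for illustration only.
sealed trait SortDirection
case object Ascending extends SortDirection
case object Descending extends SortDirection

// Proposed shape: each sort column carries its own direction, so the two
// sequences can never fall out of sync or end up with different lengths.
case class BucketSpec(
    numBuckets: Int,
    bucketColumnNames: Seq[String],
    sortColumns: Seq[(String, SortDirection)] = Nil)
```

A caller would then write `BucketSpec(8, Seq("id"), Seq("ts" -> Descending))`, and code that previously had to zip `sortColumnNames` with `sortDirections` (and handle a length mismatch) can pattern-match directly on the pairs.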
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55336977 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala --- (diff context identical to the quote above)
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55336439 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala --- @@ -0,0 +1,410 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command + +import scala.collection.mutable.ArrayBuffer + +import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending} +import org.apache.spark.sql.catalyst.parser._ +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.execution.datasources._ +import org.apache.spark.sql.types.StructType + + +/** + * Helper object to parse alter table commands. + */ +object AlterTableCommandParser { + import ParserUtils._ + + /** + * Parse the given node assuming it is an alter table command. 
+ */ + def parse(v1: ASTNode): LogicalPlan = { +v1.children match { + case (tabName @ Token("TOK_TABNAME", _)) :: restNodes => +val tableIdent: TableIdentifier = extractTableIdent(tabName) +val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children) +val partition = partitionSpec.flatMap(parsePartitionSpec) +matchAlterTableCommands(v1, restNodes, tableIdent, partition) + case _ => +throw new AnalysisException(s"Could not parse alter table command: '${v1.text}'") +} + } + + private def cleanAndUnquoteString(s: String): String = { +cleanIdentifier(unquoteString(s)) + } + + private def parsePartitionSpec(node: ASTNode): Option[Map[String, Option[String]]] = { +node match { + case Token("TOK_PARTSPEC", partitions) => +val spec = partitions.map { + case Token("TOK_PARTVAL", ident :: constant :: Nil) => +(cleanAndUnquoteString(ident.text), Some(cleanAndUnquoteString(constant.text))) + case Token("TOK_PARTVAL", ident :: Nil) => +(cleanAndUnquoteString(ident.text), None) +}.toMap +Some(spec) + case _ => None +} + } + + private def extractTableProps(node: ASTNode): Map[String, Option[String]] = { +node match { + case Token("TOK_TABLEPROPERTIES", propsList) => +propsList.flatMap { + case Token("TOK_TABLEPROPLIST", props) => +props.map { + case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", Nil) :: Nil) => +val k = cleanAndUnquoteString(key.text) +(k, None) + case Token("TOK_TABLEPROPERTY", key :: value :: Nil) => +val k = cleanAndUnquoteString(key.text) +val v = cleanAndUnquoteString(value.text) +(k, Some(v)) +} +}.toMap + case _ => +throw new AnalysisException( + s"Expected table properties in alter table command: '${node.text}'") +} + } + + // TODO: This method is massive. Break it down. Also, add some comments... 
+ private def matchAlterTableCommands( + node: ASTNode, + nodes: Seq[ASTNode], + tableIdent: TableIdentifier, + partition: Option[Map[String, Option[String]]]): LogicalPlan = { +nodes match { + case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ => +val renamedTable = getClause("TOK_TABNAME", renameArgs) +val renamedTableIdent: TableIdentifier = extractTableIdent(renamedTable) +AlterTableRename(tableIdent, renamedTableIdent)(node.source) + + case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ => +val setTableProperties = extractTableProps(args.head) +AlterTableSetProperties( + tableIdent, +
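The `TOK_PARTSPEC` handling quoted in the diff above can be sketched against a toy AST. The `Node` and `Token` definitions below are illustration-only stand-ins for the real `ASTNode`/`Token` in `org.apache.spark.sql.catalyst.parser`, which carry extra state (positions, source text) not reproduced here:

```scala
// Toy stand-ins for the parser's ASTNode/Token, for illustration only.
case class Node(text: String, children: List[Node])

object Token {
  // Allows patterns like Token("TOK_PARTSPEC", children) in match expressions.
  def unapply(n: Node): Option[(String, List[Node])] = Some((n.text, n.children))
}

// Sketch of the TOK_PARTSPEC case from the diff: every TOK_PARTVAL child
// becomes one (partition column -> optional value) entry; any other node
// yields None, mirroring the original's fall-through case.
def parsePartitionSpec(node: Node): Option[Map[String, Option[String]]] = node match {
  case Token("TOK_PARTSPEC", partitions) =>
    Some(partitions.map {
      case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
        (ident.text, Some(constant.text))
      case Token("TOK_PARTVAL", ident :: Nil) =>
        (ident.text, None)
    }.toMap)
  case _ => None
}
```

For example, `PARTITION (ds='2016-03-10', hr)` arrives as a `TOK_PARTSPEC` node with two `TOK_PARTVAL` children and maps to `Map("ds" -> Some("2016-03-10"), "hr" -> None)`.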
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55336099 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55335658 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala --- @@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly val tableIdent = extractTableIdent(nameParts) RefreshTable(tableIdent) + case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: createDatabaseArgs) => +val Seq( + allowExisting, --- End diff -- I don't quite understand your question. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55335240 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---
+ case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
--- End diff -- Why the wildcards at the end of every pattern match? Do we expect additional input which we can safely ignore?
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55335131 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55335079 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55334443 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55333881 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55333529 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---
+ private def cleanAndUnquoteString(s: String): String = {
+   cleanIdentifier(unquoteString(s))
--- End diff -- I think unquoteString does the cleaning (i.e. stripping leading and trailing quotes) for us.
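The quote-stripping behavior under discussion can be illustrated with a minimal sketch. This `unquoteString` is a hypothetical reconstruction for illustration only; the real helpers (`unquoteString`, `cleanIdentifier`) live in `ParserUtils` and their exact behavior is not reproduced here:

```scala
// Hypothetical sketch: strip exactly one pair of matching single or double
// quotes; leave anything else (including unbalanced quotes) untouched.
def unquoteString(s: String): String =
  if (s.length >= 2 && s.head == s.last && (s.head == '\'' || s.head == '"'))
    s.substring(1, s.length - 1)
  else s
```

Under this sketch, `unquoteString("'foo'")` yields `foo` and a bare `foo` passes through unchanged, which is the "cleaning" the comment above refers to.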
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r5511 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala --- @@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly val tableIdent = extractTableIdent(nameParts) RefreshTable(tableIdent) + case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: createDatabaseArgs) => +val Seq( + allowExisting, + dbLocation, + databaseComment, + dbprops) = getClauses(Seq( + "TOK_IFNOTEXISTS", + "TOK_DATABASELOCATION", + "TOK_DATABASECOMMENT", + "TOK_DATABASEPROPERTIES"), createDatabaseArgs) +val location = dbLocation.map { + case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc) +} +val comment = databaseComment.map { + case Token("TOK_DATABASECOMMENT", Token(comment, Nil) :: Nil) => unquoteString(comment) +} +val props: Map[String, String] = dbprops.toSeq.flatMap { + case Token("TOK_DATABASEPROPERTIES", propList) => +propList.flatMap(extractProps(_, "TOK_DBPROPLIST", "TOK_TABLEPROPERTY")) +}.toMap +CreateDatabase(databaseName, allowExisting.isDefined, location, comment, props)(node.source) + + case Token("TOK_CREATEFUNCTION", func :: as :: createFuncArgs) => --- End diff -- I am not sure this pattern works for functions with a dotted name. For example,

create function s.t as 'foo.bar.func'

produces the following AST:

TOK_CREATEFUNCTION 1, 0, 10, 16
:- s 1, 4, 4, 16
:- t 1, 6, 6, 18
+- 'foo.bar.func' 1, 10, 10, 23

Notice how the name is split into two top-level ASTNodes. I think we need to change the parser rule here.
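One way to cope with the AST shape described above is to partition the top-level children into name parts and everything else, then rejoin the name with dots. The predicate below is an assumption for illustration (bare identifiers vs. `TOK_*` clauses and quoted literals), not Spark's actual rule; `Node` is an illustration-only stand-in for `ASTNode`:

```scala
// Toy stand-in for the parser's ASTNode, for illustration only.
case class Node(text: String, children: List[Node])

// Sketch: the dotted function name arrives as several bare top-level children
// ("s", "t"); the remaining children are TOK_* clauses or quoted literals.
// Collect the bare identifiers back into a dotted name.
def splitCreateFunctionArgs(args: List[Node]): (String, List[Node]) = {
  val (nameParts, rest) = args.partition { n =>
    !n.text.startsWith("TOK_") && !n.text.startsWith("'")
  }
  (nameParts.map(_.text).mkString("."), rest)
}
```

On the example AST, the children `s`, `t`, `'foo.bar.func'` would split into the name `s.t` and a remainder holding the quoted class name.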
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55332214

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala --- (same hunk as quoted above)
+      case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: createDatabaseArgs) =>
+        val Seq(
+          allowExisting,
--- End diff --

Isn't `allowExisting` the exact opposite of `IF NOT EXISTS`? I am asking because I have a similar case in my PR.
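The naming question above can be made concrete with a toy version of the clause lookup. The `clause` helper here is a simplified assumption standing in for the real `getClauses`, which returns one `Option` per requested token. Semantically the flag is not the opposite of `IF NOT EXISTS` (tolerating an existing database is exactly what `IF NOT EXISTS` requests), but naming it after the clause reads more directly:

```scala
// Simplified stand-in for getClauses: Some(...) iff the clause token appears.
def clause(name: String, args: Seq[String]): Option[String] = args.find(_ == name)

val args = Seq("TOK_IFNOTEXISTS", "TOK_DATABASELOCATION")

// Named after the SQL clause: defined exactly when IF NOT EXISTS was written.
val ifNotExists: Boolean = clause("TOK_IFNOTEXISTS", args).isDefined

// Same truth value (IF NOT EXISTS means an existing database is tolerated),
// but the name no longer mirrors the clause, which is the source of confusion.
val allowExisting: Boolean = ifNotExists
```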
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55324768 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -0,0 +1,197 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command + +import org.apache.spark.Logging +import org.apache.spark.sql.{Row, SQLContext} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference} +import org.apache.spark.sql.execution.datasources.BucketSpec +import org.apache.spark.sql.types._ + + +/** + * A DDL command expected to be run in the underlying system without Spark parsing the + * query text. 
+ */ +abstract class NativeDDLCommands(val sql: String) extends RunnableCommand { + + override def run(sqlContext: SQLContext): Seq[Row] = { +sqlContext.runNativeSql(sql) + } + + override val output: Seq[Attribute] = { +Seq(AttributeReference("result", StringType, nullable = false)()) + } + +} + +case class CreateDatabase( +databaseName: String, +allowExisting: Boolean, +path: Option[String], +comment: Option[String], +props: Map[String, String])(sql: String) + extends NativeDDLCommands(sql) with Logging + +case class CreateFunction( +functionName: String, +alias: String, +resourcesMap: Map[String, String], +isTemp: Boolean)(sql: String) + extends NativeDDLCommands(sql) with Logging + +case class AlterTableRename( +tableName: TableIdentifier, +renameTableName: TableIdentifier)(sql: String) + extends NativeDDLCommands(sql) with Logging + +case class AlterTableSetProperties( +tableName: TableIdentifier, +setProperties: Map[String, Option[String]])(sql: String) + extends NativeDDLCommands(sql) with Logging + +case class AlterTableDropProperties( +tableName: TableIdentifier, +dropProperties: Map[String, Option[String]], +allowExisting: Boolean)(sql: String) + extends NativeDDLCommands(sql) with Logging + +case class AlterTableSerDeProperties( +tableName: TableIdentifier, +serdeClassName: Option[String], +serdeProperties: Option[Map[String, Option[String]]], +partition: Option[Map[String, Option[String]]])(sql: String) + extends NativeDDLCommands(sql) with Logging + +case class AlterTableStoreProperties( +tableName: TableIdentifier, +buckets: Option[BucketSpec], +// TODO: use `clustered` and `sorted` instead for simplicity +noClustered: Boolean, +noSorted: Boolean)(sql: String) + extends NativeDDLCommands(sql) with Logging + +case class AlterTableSkewed( +tableName: TableIdentifier, +skewedCols: Seq[String], +skewedValues: Seq[Seq[String]], +storedAsDirs: Boolean, +notSkewed: Boolean, +// TODO: what?? 
+notStoredAsDirs: Boolean)(sql: String) --- End diff -- `notStoredAsDirs` is used when the command is `ALTER TABLE table_name NOT STORED AS DIRECTORIES`; when it is true, `storedAsDirs` is of course false. `storedAsDirs` is used with the skewed columns and values arguments. I think we can use just one variable for both. When the command is `ALTER TABLE table_name NOT SKEWED`, the values of `notStoredAsDirs` and `storedAsDirs` don't matter.
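viirya's suggestion to use one variable for the storage flags could take the shape of a small algebraic type. This is a sketch of one possible design, not what the PR settled on; its point is that the contradictory boolean combinations discussed in review become unrepresentable.

```scala
// One possible consolidation (a sketch, not the PR's final shape): model the
// three ALTER TABLE skew forms explicitly instead of with three independent
// booleans whose values can contradict each other.
sealed trait SkewSpec
case object NotSkewed extends SkewSpec              // ALTER TABLE t NOT SKEWED
case object NotStoredAsDirectories extends SkewSpec // ALTER TABLE t NOT STORED AS DIRECTORIES
case class Skewed(                                  // ALTER TABLE t SKEWED BY ... [STORED AS DIRECTORIES]
    cols: Seq[String],
    values: Seq[Seq[String]],
    storedAsDirs: Boolean) extends SkewSpec

// A skewed table without directory storage is now a single unambiguous value;
// there is no separate notStoredAsDirs flag left to disagree with it.
val spec: SkewSpec = Skewed(
  Seq("dt", "country"),
  Seq(Seq("2008-08-08", "us"), Seq("2009-09-09", "uk")),
  storedAsDirs = false)
```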
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55324206

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- (same file as quoted above)
+case class AlterTableStoreProperties(
+    tableName: TableIdentifier,
+    buckets: Option[BucketSpec],
+    // TODO: use `clustered` and `sorted` instead for simplicity
+    noClustered: Boolean,
+    noSorted: Boolean)(sql: String)
--- End diff --

Just because the corresponding tokens are `TOK_NOT_CLUSTERED` and `TOK_NOT_SORTED`. We can use positive names here.
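Flipping the parser's negative tokens into positively named fields, as suggested, is mechanical. The helper below is hypothetical (the real code pattern matches on AST nodes, not strings), but the token names come from the quoted diff:

```scala
// Sketch: the Hive grammar emits negative tokens, yet the command's fields
// can still carry the positive reading the reviewers asked for.
def parseAlterStoreClause(tokens: Seq[String]): (Boolean, Boolean) = {
  val clustered = !tokens.contains("TOK_NOT_CLUSTERED")
  val sorted = !tokens.contains("TOK_NOT_SORTED")
  (clustered, sorted)
}
```

A `NOT CLUSTERED` clause then yields `clustered = false` directly, with no double negation for the reader to untangle.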
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-193637484 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-193637485 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52630/
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-193637270 **[Test build #52630 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52630/consoleFull)** for PR 11573 at commit [`a663b5c`](https://github.com/apache/spark/commit/a663b5ccca7718b41df8b8ede462ac6c9e0b8e8f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-193610715 Note: the only changes I made on top of #11048 are addressing the outstanding comments in that patch and some minor cleanups. It's entirely possible that there are still things missing or incorrect, given the original patch was not reviewed completely yet. @hvanhovell @yhuai PTAL.
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11573#issuecomment-193609366 **[Test build #52630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52630/consoleFull)** for PR 11573 at commit [`a663b5c`](https://github.com/apache/spark/commit/a663b5ccca7718b41df8b8ede462ac6c9e0b8e8f).
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55316476

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- (same file as quoted above)
+case class AlterTableSkewed(
+    tableName: TableIdentifier,
+    skewedCols: Seq[String],
+    skewedValues: Seq[Seq[String]],
+    storedAsDirs: Boolean,
+    notSkewed: Boolean,
+    // TODO: what??
+    notStoredAsDirs: Boolean)(sql: String)
--- End diff --

@viirya I'm really confused about this flag. We have another one called `storedAsDirs` and this one says `notStoredAsDirs`. It would seem that one is always the opposite of the other, but that's not actually the case in tests, e.g.

```
// DDLCommandSuite, "alter table: skewed"
val expected3 = AlterTableSkewed(
  TableIdentifier("table_name", None),
  Seq("dt", "country"),
  Seq(List("2008-08-08", "us"), List("2009-09-09", "uk")),
  storedAsDirs = false,
  notSkewed = false,
  notStoredAsDirs = false)(sql3)
```

Seems contradictory?
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/11573#discussion_r55316403

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- (same file as quoted above)
+case class AlterTableStoreProperties(
+    tableName: TableIdentifier,
+    buckets: Option[BucketSpec],
+    // TODO: use `clustered` and `sorted` instead for simplicity
+    noClustered: Boolean,
+    noSorted: Boolean)(sql: String)
--- End diff --

@viirya was there any reason why these have to be negative? It's much easier to understand if they're positive, i.e. `clustered` and `sorted`.
[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...
GitHub user andrewor14 opened a pull request: https://github.com/apache/spark/pull/11573

[SPARK-13139][SQL] Parse Hive DDL commands ourselves

## What changes were proposed in this pull request?

This patch is ported over from @viirya's changes in #11048. Currently for most DDLs we just pass the query text directly to Hive. Instead, we should parse these commands ourselves and in the future (not part of this patch) use the `HiveCatalog` to process these DDLs. This is a pretext to merging `SQLContext` and `HiveContext`.

Note: as of this patch we still pass the query text to Hive. The difference is that we now parse the commands ourselves, so in the future we can just use our own catalog.

## How was this patch tested?

Jenkins, plus a new `DDLCommandSuite`, which comprises about 40% of the changes here.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark parser-plus-plus

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11573.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #11573

commit fc3c1684ad8e24ab9b05f0f7e02659ea2e365ebd
Author: Andrew Or
Date: 2016-03-04T22:20:07Z
    Move things into new ParserUtils object

commit adcb561a6bcb91a946448b8a5601155c9d714675
Author: Andrew Or
Date: 2016-03-07T22:40:48Z
    Merge branch 'master' of github.com:apache/spark into parser-plus-plus

commit 010afddf40d776b3009cee98057e74d499c45012
Author: Andrew Or
Date: 2016-03-07T23:56:09Z
    Port over viirya's changes in #11048

commit 007907460d72a3aa82b222567c08589c62efb614
Author: Andrew Or
Date: 2016-03-08T04:43:31Z
    Address comments from #11408 + fix style

commit 02de9b771cc74abb6f1971081411bd6d8c4c4b5f
Author: Andrew Or
Date: 2016-03-08T04:45:43Z
    Merge branch 'master' of github.com:apache/spark into parser-plus-plus

    Conflicts:
        sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala

commit 3766f83bb5551c0fd978117c3f41efb99257c984
Author: Andrew Or
Date: 2016-03-08T04:49:22Z
    Minor fixes
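The core pattern the PR introduces, parsing the DDL into a typed command while still replaying the original SQL text at execution, can be sketched in isolation. Everything below is a simplified stand-alone illustration: the class is singular and `runNativeSql` is injected as a plain function rather than coming from `SQLContext`.

```scala
// Simplified sketch of the NativeDDLCommands pattern from the diff: the
// command becomes a typed case class Spark can inspect, while execution
// still hands the original text to the underlying system untouched.
abstract class NativeDDLCommand(val sql: String) {
  def run(runNativeSql: String => Seq[String]): Seq[String] = runNativeSql(sql)
}

// The second parameter list keeps the raw SQL out of the case class's
// equality and pattern matching, mirroring the `(sql: String)` trick
// used throughout the PR's command classes.
case class CreateDatabase(
    databaseName: String,
    ifNotExists: Boolean)(sql: String) extends NativeDDLCommand(sql)

val cmd = CreateDatabase("db1", ifNotExists = true)("CREATE DATABASE IF NOT EXISTS db1")

// Spark-side code can now read structured fields off the command...
val name = cmd.databaseName

// ...while execution is still deferred to the native system verbatim.
val result = cmd.run(text => Seq(s"ran: $text"))
```

This is exactly the "pretext" the description mentions: once commands are typed, a future change can route them to Spark's own catalog instead of replaying the text against Hive.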