[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11573


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-195597743
  
OK. Let's merge this first (to avoid conflicts caused by other commits). We can 
address the comments in a follow-up PR. 





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55902999
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,10 +83,86 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
         val tableIdent = extractTableIdent(nameParts)
         RefreshTable(tableIdent)
 
+      // CREATE DATABASE [IF NOT EXISTS] database_name [COMMENT database_comment]
+      // [LOCATION path] [WITH DBPROPERTIES (key1=val1, key2=val2, ...)];
+      case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: args) =>
+        val Seq(ifNotExists, dbLocation, databaseComment, dbprops) = getClauses(Seq(
+          "TOK_IFNOTEXISTS",
+          "TOK_DATABASELOCATION",
+          "TOK_DATABASECOMMENT",
+          "TOK_DATABASEPROPERTIES"), args)
+        val location = dbLocation.map {
+          case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val comment = databaseComment.map {
+          case Token("TOK_DATABASECOMMENT", Token(com, Nil) :: Nil) => unquoteString(com)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val props = dbprops.toSeq.flatMap {
+          case Token("TOK_DATABASEPROPERTIES", Token("TOK_DBPROPLIST", propList) :: Nil) =>
+            extractProps(propList, "TOK_TABLEPROPERTY")
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }.toMap
+        CreateDatabase(databaseName, ifNotExists.isDefined, location, comment, props)(node.source)
+
+      // CREATE [TEMPORARY] FUNCTION [db_name.]function_name AS class_name
+      // [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
+      case Token("TOK_CREATEFUNCTION", args) =>
+        // Example format:
+        //
+        //   TOK_CREATEFUNCTION
+        //     :- db_name
+        //     :- func_name
+        //     :- alias
+        //     +- TOK_RESOURCE_LIST
+        //        :- TOK_RESOURCE_URI
+        //        :  :- TOK_JAR
+        //        :  +- '/path/to/jar'
+        //        +- TOK_RESOURCE_URI
+        //           :- TOK_FILE
+        //           +- 'path/to/file'
+        val (funcNameArgs, otherArgs) = args.partition {
+          case Token("TOK_RESOURCE_LIST", _) => false
+          case Token("TOK_TEMPORARY", _) => false
+          case Token(_, Nil) => true
+          case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+        }
+        // If database name is specified, there are 3 tokens, otherwise 2.
+        val (funcName, alias) = funcNameArgs match {
+          case Token(dbName, Nil) :: Token(fname, Nil) :: Token(aname, Nil) :: Nil =>
+            (unquoteString(dbName) + "." + unquoteString(fname), unquoteString(aname))
+          case Token(fname, Nil) :: Token(aname, Nil) :: Nil =>
+            (unquoteString(fname), unquoteString(aname))
+          case _ =>
+            parseFailed("Invalid CREATE FUNCTION command", node)
+        }
+        // Extract other keywords, if they exist
+        val Seq(rList, temp) = getClauses(Seq("TOK_RESOURCE_LIST", "TOK_TEMPORARY"), otherArgs)
+        val resourcesMap = rList.toSeq.flatMap {
+          case Token("TOK_RESOURCE_LIST", resources) =>
+            resources.map {
+              case Token("TOK_RESOURCE_URI", rType :: Token(rPath, Nil) :: Nil) =>
+                val resourceType = rType match {
+                  case Token("TOK_JAR", Nil) => "jar"
+                  case Token("TOK_FILE", Nil) => "file"
+                  case Token("TOK_ARCHIVE", Nil) => "archive"
+                  case Token(f, _) => parseFailed(s"Unexpected resource format '$f'", node)
+                }
+                (resourceType, unquoteString(rPath))
+              case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+            }
+          case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+        }.toMap
+        CreateFunction(funcName, alias, resourcesMap, temp.isDefined)(node.source)
--- End diff --

Let's also have a test for this case.
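As a starting point for such a test, the name/alias extraction in the diff can be exercised in isolation. The sketch below uses a hypothetical `Leaf` class standing in for Spark's token nodes (it is not Spark's actual `Token` extractor): three leaf tokens mean a database-qualified function name, two mean a bare one.

```scala
// Hypothetical stand-in for a leaf AST token (not Spark's actual classes).
case class Leaf(text: String)

// Mirrors the funcName/alias pattern match from the diff above.
def extractFuncName(args: List[Leaf]): (String, String) = args match {
  case Leaf(db) :: Leaf(fname) :: Leaf(aname) :: Nil => (db + "." + fname, aname) // db_name.func_name AS alias
  case Leaf(fname) :: Leaf(aname) :: Nil => (fname, aname)                        // func_name AS alias
  case _ => sys.error("Invalid CREATE FUNCTION command")
}

println(extractFuncName(List(Leaf("mydb"), Leaf("myfunc"), Leaf("com.example.MyUDF"))))
```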



[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-195592073
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52937/
Test PASSed.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-195592071
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-195591633
  
**[Test build #52937 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52937/consoleFull)** for PR 11573 at commit [`bd91b0f`](https://github.com/apache/spark/commit/bd91b0f3c850da796b22c7e575658625d2486057).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-195570543
  
I have left a few comments. It is a good starting point. Thank you for 
working on this!





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55893285
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -29,7 +29,26 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
   import ParserUtils._
 
   /** Check if a command should not be explained. */
-  protected def isNoExplainCommand(command: String): Boolean = "TOK_DESCTABLE" == command
+  protected def isNoExplainCommand(command: String): Boolean = {
+    "TOK_DESCTABLE" == command || "TOK_ALTERTABLE" == command
+  }
+
+  /**
+   * For each node, extract properties in the form of a list ['key1', 'key2', 'key3', 'value']
+   * into a pair (key1.key2.key3, value).
+   */
--- End diff --

Also, what is the difference between this method and `extractTableProps` 
defined for alter table?





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55892958
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala ---
@@ -0,0 +1,428 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.catalog.ExternalCatalog.TablePartitionSpec
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending, SortDirection}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(node: ASTNode): LogicalPlan = {
+    node.children match {
+      case (tabName @ Token("TOK_TABNAME", _)) :: otherNodes =>
+        val tableIdent = extractTableIdent(tabName)
+        val partSpec = getClauseOption("TOK_PARTSPEC", node.children).map(parsePartitionSpec)
+        matchAlterTableCommands(node, otherNodes, tableIdent, partSpec)
+      case _ =>
+        parseFailed("Could not parse ALTER TABLE command", node)
+    }
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+    cleanIdentifier(unquoteString(s))
+  }
+
+  /**
+   * Extract partition spec from the given [[ASTNode]] as a map, assuming it exists.
+   *
+   * Expected format:
+   *   +- TOK_PARTSPEC
+   *      :- TOK_PARTVAL
+   *      :  :- dt
+   *      :  +- '2008-08-08'
+   *      +- TOK_PARTVAL
+   *         :- country
+   *         +- 'us'
+   */
+  private def parsePartitionSpec(node: ASTNode): Map[String, String] = {
+    node match {
+      case Token("TOK_PARTSPEC", partitions) =>
+        partitions.map {
+          // Note: sometimes there's a "=", "<" or ">" between the key and the value
--- End diff --

Maybe it would be good to provide a concrete example? Is it caused by what a 
user types?
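For the simple case without comparison operators, the expected TOK_PARTSPEC tree shown in the diff maps to a plain key/value map. The sketch below uses a hypothetical `Node` class (not Hive's actual `ASTNode`) to illustrate that shape:

```scala
// Hypothetical stand-in for an AST node (not Hive's ASTNode).
case class Node(text: String, children: List[Node] = Nil)

// Each TOK_PARTVAL child carries a column name and a quoted literal value.
def parsePartitionSpec(spec: Node): Map[String, String] =
  spec.children.collect {
    case Node("TOK_PARTVAL", Node(key, Nil) :: Node(value, Nil) :: Nil) =>
      key -> value.stripPrefix("'").stripSuffix("'")
  }.toMap

// The tree from the doc comment: PARTITION (dt='2008-08-08', country='us')
val spec = Node("TOK_PARTSPEC", List(
  Node("TOK_PARTVAL", List(Node("dt"), Node("'2008-08-08'"))),
  Node("TOK_PARTVAL", List(Node("country"), Node("'us'")))))

println(parsePartitionSpec(spec))
```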





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55892525
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -29,7 +29,26 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
   import ParserUtils._
 
   /** Check if a command should not be explained. */
-  protected def isNoExplainCommand(command: String): Boolean = "TOK_DESCTABLE" == command
+  protected def isNoExplainCommand(command: String): Boolean = {
+    "TOK_DESCTABLE" == command || "TOK_ALTERTABLE" == command
+  }
+
+  /**
+   * For each node, extract properties in the form of a list ['key1', 'key2', 'key3', 'value']
+   * into a pair (key1.key2.key3, value).
+   */
--- End diff --

Maybe it would be good to provide a concrete example as well?
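One concrete example of the behavior the doc comment describes: the list `['key1', 'key2', 'key3', 'value']` becomes the pair `("key1.key2.key3", "value")`. The one-liner below is a hedged sketch of that transformation, not Spark's actual `extractProps` helper:

```scala
// Sketch only: join all but the last element with '.', pair with the last.
def extractProp(parts: List[String]): (String, String) =
  (parts.init.mkString("."), parts.last)

println(extractProp(List("key1", "key2", "key3", "value")))
```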





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55889875
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,10 +83,86 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
         val tableIdent = extractTableIdent(nameParts)
         RefreshTable(tableIdent)
 
+      // CREATE DATABASE [IF NOT EXISTS] database_name [COMMENT database_comment]
+      // [LOCATION path] [WITH DBPROPERTIES (key1=val1, key2=val2, ...)];
+      case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: args) =>
+        val Seq(ifNotExists, dbLocation, databaseComment, dbprops) = getClauses(Seq(
+          "TOK_IFNOTEXISTS",
+          "TOK_DATABASELOCATION",
+          "TOK_DATABASECOMMENT",
+          "TOK_DATABASEPROPERTIES"), args)
+        val location = dbLocation.map {
+          case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val comment = databaseComment.map {
+          case Token("TOK_DATABASECOMMENT", Token(com, Nil) :: Nil) => unquoteString(com)
+          case _ => parseFailed("Invalid CREATE DATABASE command", node)
+        }
+        val props = dbprops.toSeq.flatMap {
+          case Token("TOK_DATABASEPROPERTIES", Token("TOK_DBPROPLIST", propList) :: Nil) =>
+            extractProps(propList, "TOK_TABLEPROPERTY")
--- End diff --

Maybe it would also be good to provide an example format like what we have 
for `TOK_CREATEFUNCTION`?





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55888485
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,10 +83,86 @@ private[sql] class SparkQl(conf: ParserConf = 
SimpleParserConf()) extends Cataly
 val tableIdent = extractTableIdent(nameParts)
 RefreshTable(tableIdent)
 
+  // CREATE DATABASE [IF NOT EXISTS] database_name [COMMENT 
database_comment]
+  // [LOCATION path] [WITH DBPROPERTIES (key1=val1, key2=val2, ...)];
+  case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: args) =>
+val Seq(ifNotExists, dbLocation, databaseComment, dbprops) = 
getClauses(Seq(
+  "TOK_IFNOTEXISTS",
+  "TOK_DATABASELOCATION",
+  "TOK_DATABASECOMMENT",
+  "TOK_DATABASEPROPERTIES"), args)
+val location = dbLocation.map {
+  case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => 
unquoteString(loc)
+  case _ => parseFailed("Invalid CREATE DATABASE command", node)
+}
+val comment = databaseComment.map {
+  case Token("TOK_DATABASECOMMENT", Token(com, Nil) :: Nil) => 
unquoteString(com)
+  case _ => parseFailed("Invalid CREATE DATABASE command", node)
+}
+val props = dbprops.toSeq.flatMap {
+  case Token("TOK_DATABASEPROPERTIES", Token("TOK_DBPROPLIST", 
propList) :: Nil) =>
+extractProps(propList, "TOK_TABLEPROPERTY")
+  case _ => parseFailed("Invalid CREATE DATABASE command", node)
+}.toMap
+CreateDatabase(databaseName, ifNotExists.isDefined, location, 
comment, props)(node.source)
+
+  // CREATE [TEMPORARY] FUNCTION [db_name.]function_name AS class_name
+  // [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 
'file_uri'] ];
+  case Token("TOK_CREATEFUNCTION", args) =>
+// Example format:
+//
+//   TOK_CREATEFUNCTION
+// :- db_name
+// :- func_name
+// :- alias
+// +- TOK_RESOURCE_LIST
+//:- TOK_RESOURCE_URI
+//:  :- TOK_JAR
+//:  +- '/path/to/jar'
+//+- TOK_RESOURCE_URI
+//   :- TOK_FILE
+//   +- 'path/to/file'
+val (funcNameArgs, otherArgs) = args.partition {
+  case Token("TOK_RESOURCE_LIST", _) => false
+  case Token("TOK_TEMPORARY", _) => false
+  case Token(_, Nil) => true
+  case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+}
+// If database name is specified, there are 3 tokens, otherwise 2.
+val (funcName, alias) = funcNameArgs match {
+  case Token(dbName, Nil) :: Token(fname, Nil) :: Token(aname, 
Nil) :: Nil =>
+(unquoteString(dbName) + "." + unquoteString(fname), 
unquoteString(aname))
+  case Token(fname, Nil) :: Token(aname, Nil) :: Nil =>
+(unquoteString(fname), unquoteString(aname))
+  case _ =>
+parseFailed("Invalid CREATE FUNCTION command", node)
+}
+// Extract other keywords, if they exist
+val Seq(rList, temp) = getClauses(Seq("TOK_RESOURCE_LIST", 
"TOK_TEMPORARY"), otherArgs)
+val resourcesMap = rList.toSeq.flatMap {
+  case Token("TOK_RESOURCE_LIST", resources) =>
+resources.map {
+  case Token("TOK_RESOURCE_URI", rType :: Token(rPath, Nil) :: 
Nil) =>
+val resourceType = rType match {
+  case Token("TOK_JAR", Nil) => "jar"
+  case Token("TOK_FILE", Nil) => "file"
+  case Token("TOK_ARCHIVE", Nil) => "archive"
+  case Token(f, _) => parseFailed(s"Unexpected resource 
format '$f'", node)
+}
+(resourceType, unquoteString(rPath))
+  case _ => parseFailed("Invalid CREATE FUNCTION command", 
node)
+}
+  case _ => parseFailed("Invalid CREATE FUNCTION command", node)
+}.toMap
+CreateFunction(funcName, alias, resourcesMap, 
temp.isDefined)(node.source)
--- End diff --

What will this map look like if I have `USING JAR 'jar1', JAR 'jar2', ...`?
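This question can be checked directly: the diff builds `resourcesMap` by converting a sequence of (resourceType, path) pairs with `.toMap`, and because the map is keyed by resource type alone, duplicate `JAR` entries collapse so only the last path survives. A minimal demonstration of that `.toMap` behavior:

```scala
// Mirrors how resourcesMap is built: pairs keyed by resource type, then .toMap.
// With USING JAR 'jar1', JAR 'jar2', FILE 'f1' the two jar entries collide
// and only the last one ('jar2') is kept -- 'jar1' is silently dropped.
val resources = Seq(("jar", "jar1"), ("jar", "jar2"), ("file", "f1"))
val resourcesMap = resources.toMap

println(resourcesMap)  // Map(jar -> jar2, file -> f1)
```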



[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-195520611
  
**[Test build #52937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52937/consoleFull)** for PR 11573 at commit [`bd91b0f`](https://github.com/apache/spark/commit/bd91b0f3c850da796b22c7e575658625d2486057).





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-11 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55867164
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala ---
@@ -24,12 +27,15 @@ package org.apache.spark.sql.execution.datasources
  *
  * @param numBuckets number of buckets.
  * @param bucketColumnNames the names of the columns that used to generate the bucket id.
- * @param sortColumnNames the names of the columns that used to sort data in each bucket.
+ * @param sortColumns (name, sort direction) of columns that used to sort data in each bucket.
  */
 private[sql] case class BucketSpec(
     numBuckets: Int,
     bucketColumnNames: Seq[String],
-    sortColumnNames: Seq[String])
+    sortColumns: Seq[(String, SortDirection)]) {
--- End diff --

It is great to have `SortDirection` defined explicitly, so when we create it 
we know what the order will be.

However, this change implies that we can use `descending` as the order, 
which is actually not allowed because `UnsafeKVExternalSorter` only sorts keys 
in ascending order (this sorter is used in `DynamicPartitionWriterContainer` 
when we generate bucketing files). So, I think for now it is better to revert 
this change. In the future, we can revisit it when we store the sort direction 
in the metastore and can actually sort rows in a bucket file in descending 
order.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194613303
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52770/
Test PASSed.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194613298
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194612768
  
**[Test build #52770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52770/consoleFull)** for PR 11573 at commit [`37854fc`](https://github.com/apache/spark/commit/37854fccec3af29f15acfdd4693a532b6f862469).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194600792
  
Also cc @rxin @yhuai.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194590031
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194590032
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52773/
Test PASSed.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194589783
  
**[Test build #52773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52773/consoleFull)** for PR 11573 at commit [`6ad8dd5`](https://github.com/apache/spark/commit/6ad8dd5f5294fb7c9424651920706645d1dba8f9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194555700
  
**[Test build #52773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52773/consoleFull)** for PR 11573 at commit [`6ad8dd5`](https://github.com/apache/spark/commit/6ad8dd5f5294fb7c9424651920706645d1dba8f9).





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194550969
  
**[Test build #52770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52770/consoleFull)** for PR 11573 at commit [`37854fc`](https://github.com/apache/spark/commit/37854fccec3af29f15acfdd4693a532b6f862469).





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194549041
  
@hvanhovell I believe as of the latest commit I've addressed all of your 
comments. This is ready from my side so I've removed the WIP tag. PTAL.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194189126
  
@hvanhovell I've addressed most of your comments, except a couple that I said I would fix later. Separately, I've also significantly cleaned up the logic and signatures in the alter table code, so you should take a look at that as well. It's probably quite different from what it looked like when you last reviewed this patch.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55487582
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = 
SimpleParserConf()) extends Cataly
 val tableIdent = extractTableIdent(nameParts)
 RefreshTable(tableIdent)
 
+  case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: 
createDatabaseArgs) =>
+val Seq(
+  allowExisting,
+  dbLocation,
+  databaseComment,
+  dbprops) = getClauses(Seq(
+  "TOK_IFNOTEXISTS",
+  "TOK_DATABASELOCATION",
+  "TOK_DATABASECOMMENT",
+  "TOK_DATABASEPROPERTIES"), createDatabaseArgs)
+val location = dbLocation.map {
+  case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc)
+}
+val comment = databaseComment.map {
+  case Token("TOK_DATABASECOMMENT", Token(comment, Nil) :: Nil) => unquoteString(comment)
+}
+val props: Map[String, String] = dbprops.toSeq.flatMap {
+  case Token("TOK_DATABASEPROPERTIES", propList) =>
+    propList.flatMap(extractProps(_, "TOK_DBPROPLIST", "TOK_TABLEPROPERTY"))
+}.toMap
+CreateDatabase(databaseName, allowExisting.isDefined, location, comment, props)(node.source)
+
+  case Token("TOK_CREATEFUNCTION", func :: as :: createFuncArgs) =>
--- End diff --

will fix later
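For readers following along, the clause-lookup pattern in the `TOK_CREATEDATABASE` case above can be sketched in a few lines. `Token` and `getClauses` here are simplified stand-ins for Spark's parser classes, not the real API (the real `getClauses` is stricter about unexpected clauses): each optional DDL clause is looked up by token name and may be absent.

```scala
// Simplified stand-in for the parser's AST node type.
case class Token(name: String, children: Seq[Token] = Nil)

// For each expected clause name, return the matching child node, if present.
def getClauses(names: Seq[String], nodes: Seq[Token]): Seq[Option[Token]] =
  names.map(name => nodes.find(_.name == name))

// A CREATE DATABASE with IF NOT EXISTS and a COMMENT, but no LOCATION.
val createDatabaseArgs = Seq(
  Token("TOK_IFNOTEXISTS"),
  Token("TOK_DATABASECOMMENT", Seq(Token("'a test db'"))))

val Seq(allowExisting, dbLocation, databaseComment) = getClauses(
  Seq("TOK_IFNOTEXISTS", "TOK_DATABASELOCATION", "TOK_DATABASECOMMENT"),
  createDatabaseArgs)

// allowExisting.isDefined encodes IF NOT EXISTS; absent clauses come back None.
println(allowExisting.isDefined)  // true
println(dbLocation.isDefined)     // false
```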





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55487513
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+val renamedTable = getClause("TOK_TABNAME", renameArgs)
+val renamedTableIdent: TableIdentifier = 
extractTableIdent(renamedTable)
+AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+  case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+val setTableProperties = extractTableProps(args.head)
+AlterTableSetProperties(
+  tableIdent,
+  

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55487572
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala 
---
@@ -29,7 +32,8 @@ package org.apache.spark.sql.execution.datasources
 private[sql] case class BucketSpec(
 numBuckets: Int,
 bucketColumnNames: Seq[String],
-sortColumnNames: Seq[String])
+sortColumnNames: Seq[String],
+sortDirections: Seq[SortDirection] = Nil)
--- End diff --

will fix later





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194188020
  
**[Test build #52739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52739/consoleFull)** for PR 11573 at commit [`6538829`](https://github.com/apache/spark/commit/6538829fea21463826970997cf05eaedfcadb100).





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55487318
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+val renamedTable = getClause("TOK_TABNAME", renameArgs)
+val renamedTableIdent: TableIdentifier = 
extractTableIdent(renamedTable)
+AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+  case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+val setTableProperties = extractTableProps(args.head)
+AlterTableSetProperties(
+  tableIdent,
+  

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55487276
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
--- End diff --

@viirya can you answer this?




[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-194186989
  
**[Test build #52737 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52737/consoleFull)** for PR 11573 at commit [`f54501f`](https://github.com/apache/spark/commit/f54501f6c244c8e37b332991fd22617ec1ef1e76).





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55486952
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = 
SimpleParserConf()) extends Cataly
 val tableIdent = extractTableIdent(nameParts)
 RefreshTable(tableIdent)
 
+  case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: 
createDatabaseArgs) =>
+val Seq(
+  allowExisting,
--- End diff --

I've renamed it.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55486925
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
--- End diff --

if you look at `ParserUtils` that's done in `cleanIdentifier`





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55352952
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = 
SimpleParserConf()) extends Cataly
 val tableIdent = extractTableIdent(nameParts)
 RefreshTable(tableIdent)
 
+  case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: 
createDatabaseArgs) =>
+val Seq(
+  allowExisting,
--- End diff --

When I see a parameter named `allowExisting`, I read it as saying the existence of the current table is allowed, and consequently that the create table command is going to overwrite the existing table. Why not name it `ifNotExists`?
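The naming point can be made concrete with a small sketch. This uses a hypothetical, heavily simplified command class (not Spark's actual `CreateDatabase`): naming the flag `ifNotExists` mirrors the SQL clause it models, so its effect on conflict handling is unambiguous — the command never silently overwrites.

```scala
// Hypothetical simplified DDL command; the flag mirrors IF NOT EXISTS in SQL.
case class CreateDatabase(name: String, ifNotExists: Boolean)

// Creating a database that already exists is an error only when
// IF NOT EXISTS was absent; with the clause present it is a no-op.
def isError(existing: Set[String], cmd: CreateDatabase): Boolean =
  existing.contains(cmd.name) && !cmd.ifNotExists

val dbs = Set("default")
println(isError(dbs, CreateDatabase("default", ifNotExists = true)))   // false: no-op
println(isError(dbs, CreateDatabase("default", ifNotExists = false)))  // true: conflict
```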





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55337949
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.Logging
+import org.apache.spark.sql.{Row, SQLContext}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Attribute, 
AttributeReference}
+import org.apache.spark.sql.execution.datasources.BucketSpec
+import org.apache.spark.sql.types._
+
+
+/**
+ * A DDL command expected to be run in the underlying system without Spark 
parsing the
--- End diff --

The query has already been parsed, so it should read something like "...without Spark executing the query..."





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55337179
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala 
---
@@ -29,7 +32,8 @@ package org.apache.spark.sql.execution.datasources
 private[sql] case class BucketSpec(
 numBuckets: Int,
 bucketColumnNames: Seq[String],
-sortColumnNames: Seq[String])
+sortColumnNames: Seq[String],
+sortDirections: Seq[SortDirection] = Nil)
--- End diff --

I still think sortColumnNames and sortDirections should be combined into a 
seq of pairs.
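A minimal sketch of that suggestion, under the assumption that we are free to reshape the case class (names here are illustrative; Spark's actual `BucketSpec` kept a different shape): replacing the parallel `sortColumnNames` / `sortDirections` sequences with one `Seq` of (column, direction) pairs makes it impossible for the two to drift out of sync.

```scala
// Simplified stand-ins for Catalyst's sort-direction types.
sealed trait SortDirection
case object Ascending extends SortDirection
case object Descending extends SortDirection

// Reshaped spec: each sort column carries its own direction.
case class BucketSpec(
    numBuckets: Int,
    bucketColumnNames: Seq[String],
    sortColumns: Seq[(String, SortDirection)])

val spec = BucketSpec(8, Seq("user_id"), Seq(("ts", Ascending), ("name", Descending)))
// Direction travels with its column; no index bookkeeping across two Seqs.
println(spec.sortColumns.map { case (c, d) => s"$c $d" }.mkString(", "))
// prints "ts Ascending, name Descending"
```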





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55336977
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+        val tableIdent: TableIdentifier = extractTableIdent(tabName)
+        val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+        val partition = partitionSpec.flatMap(parsePartitionSpec)
+        matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+      case _ =>
+        throw new AnalysisException(s"Could not parse alter table command: '${v1.text}'")
+    }
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+    cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, Option[String]]] = {
+    node match {
+      case Token("TOK_PARTSPEC", partitions) =>
+        val spec = partitions.map {
+          case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+            (cleanAndUnquoteString(ident.text), Some(cleanAndUnquoteString(constant.text)))
+          case Token("TOK_PARTVAL", ident :: Nil) =>
+            (cleanAndUnquoteString(ident.text), None)
+        }.toMap
+        Some(spec)
+      case _ => None
+    }
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, Option[String]] = {
+    node match {
+      case Token("TOK_TABLEPROPERTIES", propsList) =>
+        propsList.flatMap {
+          case Token("TOK_TABLEPROPLIST", props) =>
+            props.map {
+              case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", Nil) :: Nil) =>
+                val k = cleanAndUnquoteString(key.text)
+                (k, None)
+              case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+                val k = cleanAndUnquoteString(key.text)
+                val v = cleanAndUnquoteString(value.text)
+                (k, Some(v))
+            }
+        }.toMap
+      case _ =>
+        throw new AnalysisException(
+          s"Expected table properties in alter table command: '${node.text}'")
+    }
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some comments...
+  private def matchAlterTableCommands(
+      node: ASTNode,
+      nodes: Seq[ASTNode],
+      tableIdent: TableIdentifier,
+      partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+    nodes match {
+      case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+        val renamedTable = getClause("TOK_TABNAME", renameArgs)
+        val renamedTableIdent: TableIdentifier = extractTableIdent(renamedTable)
+        AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+      case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+        val setTableProperties = extractTableProps(args.head)
+        AlterTableSetProperties(
+          tableIdent,
+          
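The quoted parsePartitionSpec above maps a TOK_PARTSPEC subtree to a partition-spec map, with None for partition columns that have no value. A minimal, self-contained sketch of that mapping — using a simplified Token case class as a stand-in for the real ASTNode/Token types, not the actual parser API:

```scala
// Simplified stand-in for the real ASTNode/Token types.
case class Token(text: String, children: List[Token])

// Same shape as the quoted parsePartitionSpec, minus quote/identifier cleaning.
def parsePartitionSpec(node: Token): Option[Map[String, Option[String]]] =
  node match {
    case Token("TOK_PARTSPEC", partitions) =>
      Some(partitions.map {
        case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
          (ident.text, Some(constant.text))           // PARTITION (ds='...')
        case Token("TOK_PARTVAL", ident :: Nil) =>
          (ident.text, None)                          // PARTITION (hr) -- no value
      }.toMap)
    case _ => None
  }

// AST for: PARTITION (ds='2016-03-11', hr)
val partSpec = Token("TOK_PARTSPEC", List(
  Token("TOK_PARTVAL", List(Token("ds", Nil), Token("2016-03-11", Nil))),
  Token("TOK_PARTVAL", List(Token("hr", Nil)))))

assert(parsePartitionSpec(partSpec) ==
  Some(Map("ds" -> Some("2016-03-11"), "hr" -> None)))
```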

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55336439
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+val renamedTable = getClause("TOK_TABNAME", renameArgs)
+val renamedTableIdent: TableIdentifier = 
extractTableIdent(renamedTable)
+AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+  case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+val setTableProperties = extractTableProps(args.head)
+AlterTableSetProperties(
+  tableIdent,
+  

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55336099
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+val renamedTable = getClause("TOK_TABNAME", renameArgs)
+val renamedTableIdent: TableIdentifier = 
extractTableIdent(renamedTable)
+AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+  case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+val setTableProperties = extractTableProps(args.head)
+AlterTableSetProperties(
+  tableIdent,
+  

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55335658
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
     val tableIdent = extractTableIdent(nameParts)
     RefreshTable(tableIdent)
 
+      case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: createDatabaseArgs) =>
+        val Seq(
+          allowExisting,
--- End diff --

I don't quite understand your question — could you clarify?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55335240
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
--- End diff --

Why the wildcards at the end of every pattern match? Do we expect additional
input that we can safely ignore?
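The concern here is that a trailing `:: _` matches only the head of the node list and silently discards any remaining sibling nodes. A minimal sketch of that behavior, using plain strings instead of AST nodes:

```scala
// A trailing `:: _` wildcard matches the list head and ignores the rest,
// so unexpected extra sibling nodes would pass through without an error.
val nodes = List("TOK_ALTERTABLE_RENAME", "TOK_UNEXPECTED_EXTRA")

val matched = nodes match {
  case "TOK_ALTERTABLE_RENAME" :: _ => true   // extra element silently ignored
  case _                            => false
}

assert(matched)  // matches despite the trailing unexpected node
```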



[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55335131
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+val renamedTable = getClause("TOK_TABNAME", renameArgs)
+val renamedTableIdent: TableIdentifier = 
extractTableIdent(renamedTable)
+AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+  case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+val setTableProperties = extractTableProps(args.head)
+AlterTableSetProperties(
+  tableIdent,
+  

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55335079
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+val renamedTable = getClause("TOK_TABNAME", renameArgs)
+val renamedTableIdent: TableIdentifier = 
extractTableIdent(renamedTable)
+AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+  case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+val setTableProperties = extractTableProps(args.head)
+AlterTableSetProperties(
+  tableIdent,
+  

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55334443
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+val renamedTable = getClause("TOK_TABNAME", renameArgs)
+val renamedTableIdent: TableIdentifier = 
extractTableIdent(renamedTable)
+AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+  case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+val setTableProperties = extractTableProps(args.head)
+AlterTableSetProperties(
+  tableIdent,
+  

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55333881
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: 
'${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
+  }
+
+  private def parsePartitionSpec(node: ASTNode): Option[Map[String, 
Option[String]]] = {
+node match {
+  case Token("TOK_PARTSPEC", partitions) =>
+val spec = partitions.map {
+  case Token("TOK_PARTVAL", ident :: constant :: Nil) =>
+(cleanAndUnquoteString(ident.text), 
Some(cleanAndUnquoteString(constant.text)))
+  case Token("TOK_PARTVAL", ident :: Nil) =>
+(cleanAndUnquoteString(ident.text), None)
+}.toMap
+Some(spec)
+  case _ => None
+}
+  }
+
+  private def extractTableProps(node: ASTNode): Map[String, 
Option[String]] = {
+node match {
+  case Token("TOK_TABLEPROPERTIES", propsList) =>
+propsList.flatMap {
+  case Token("TOK_TABLEPROPLIST", props) =>
+props.map {
+  case Token("TOK_TABLEPROPERTY", key :: Token("TOK_NULL", 
Nil) :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+(k, None)
+  case Token("TOK_TABLEPROPERTY", key :: value :: Nil) =>
+val k = cleanAndUnquoteString(key.text)
+val v = cleanAndUnquoteString(value.text)
+(k, Some(v))
+}
+}.toMap
+  case _ =>
+throw new AnalysisException(
+  s"Expected table properties in alter table command: 
'${node.text}'")
+}
+  }
+
+  // TODO: This method is massive. Break it down. Also, add some 
comments...
+  private def matchAlterTableCommands(
+  node: ASTNode,
+  nodes: Seq[ASTNode],
+  tableIdent: TableIdentifier,
+  partition: Option[Map[String, Option[String]]]): LogicalPlan = {
+nodes match {
+  case rename @ Token("TOK_ALTERTABLE_RENAME", renameArgs) :: _ =>
+val renamedTable = getClause("TOK_TABNAME", renameArgs)
+val renamedTableIdent: TableIdentifier = 
extractTableIdent(renamedTable)
+AlterTableRename(tableIdent, renamedTableIdent)(node.source)
+
+  case Token("TOK_ALTERTABLE_PROPERTIES", args) :: _ =>
+val setTableProperties = extractTableProps(args.head)
+AlterTableSetProperties(
+  tableIdent,
+  

[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55333529
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommandParser.scala
 ---
@@ -0,0 +1,410 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending}
+import org.apache.spark.sql.catalyst.parser._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.types.StructType
+
+
+/**
+ * Helper object to parse alter table commands.
+ */
+object AlterTableCommandParser {
+  import ParserUtils._
+
+  /**
+   * Parse the given node assuming it is an alter table command.
+   */
+  def parse(v1: ASTNode): LogicalPlan = {
+v1.children match {
+  case (tabName @ Token("TOK_TABNAME", _)) :: restNodes =>
+val tableIdent: TableIdentifier = extractTableIdent(tabName)
+val partitionSpec = getClauseOption("TOK_PARTSPEC", v1.children)
+val partition = partitionSpec.flatMap(parsePartitionSpec)
+matchAlterTableCommands(v1, restNodes, tableIdent, partition)
+  case _ =>
+throw new AnalysisException(s"Could not parse alter table command: '${v1.text}'")
+}
+  }
+
+  private def cleanAndUnquoteString(s: String): String = {
+cleanIdentifier(unquoteString(s))
--- End diff --

I think `unquoteString` already does the cleaning (i.e. stripping leading and trailing quotes) for us.
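For context, here is a minimal sketch of how such helpers could behave (hypothetical simplified versions, not the actual `ParserUtils` code, which may handle more cases such as escapes):

```scala
// Hypothetical, simplified stand-ins for the parser helpers under discussion.
def unquoteString(s: String): String =
  if (s.length >= 2 &&
      ((s.head == '\'' && s.last == '\'') || (s.head == '"' && s.last == '"'))) {
    s.substring(1, s.length - 1)  // strip matching single or double quotes
  } else s

// Strips surrounding backticks from an identifier, e.g. `col` -> col.
def cleanIdentifier(ident: String): String =
  if (ident.length >= 2 && ident.head == '`' && ident.last == '`') {
    ident.substring(1, ident.length - 1)
  } else ident

def cleanAndUnquoteString(s: String): String = cleanIdentifier(unquoteString(s))
```

Under this model the two helpers strip different delimiters (quotes vs backticks), so the composition is not redundant unless `unquoteString` also handles backticks.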


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r5511
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
 val tableIdent = extractTableIdent(nameParts)
 RefreshTable(tableIdent)
 
+  case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: createDatabaseArgs) =>
+val Seq(
+  allowExisting,
+  dbLocation,
+  databaseComment,
+  dbprops) = getClauses(Seq(
+  "TOK_IFNOTEXISTS",
+  "TOK_DATABASELOCATION",
+  "TOK_DATABASECOMMENT",
+  "TOK_DATABASEPROPERTIES"), createDatabaseArgs)
+val location = dbLocation.map {
+  case Token("TOK_DATABASELOCATION", Token(loc, Nil) :: Nil) => unquoteString(loc)
+}
+val comment = databaseComment.map {
+  case Token("TOK_DATABASECOMMENT", Token(comment, Nil) :: Nil) => unquoteString(comment)
+}
+val props: Map[String, String] = dbprops.toSeq.flatMap {
+  case Token("TOK_DATABASEPROPERTIES", propList) =>
+propList.flatMap(extractProps(_, "TOK_DBPROPLIST", "TOK_TABLEPROPERTY"))
+}.toMap
+CreateDatabase(databaseName, allowExisting.isDefined, location, comment, props)(node.source)
+
+  case Token("TOK_CREATEFUNCTION", func :: as :: createFuncArgs) =>
--- End diff --

I am not sure this pattern works for functions with a dotted name. For 
example

create function s.t as 'foo.bar.func'

Produces the following AST:

TOK_CREATEFUNCTION 1, 0, 10, 16
:- s 1, 4, 4, 16
:- t 1, 6, 6, 18
+- 'foo.bar.func' 1, 10, 10, 23

Notice how the name is split into two top-level ASTNodes.

I think we need to change the parser rule here.
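One way to cope with that AST shape (a sketch with a hypothetical simplified token type, not the actual `ASTNode` API) is to rejoin all leading children into the dotted name:

```scala
// Hypothetical simplified token, standing in for the real ASTNode class.
case class Tok(text: String)

// CREATE FUNCTION s.t AS 'foo.bar.func' yields the name parts `s` and `t`
// as separate top-level children, followed by the quoted class name, so we
// rejoin every child before the quoted literal into a dotted function name.
def parseFunctionName(children: List[Tok]): (String, String) = {
  val (nameParts, rest) = children.span(t => !t.text.startsWith("'"))
  val functionName = nameParts.map(_.text).mkString(".")
  val classAlias = rest.head.text.stripPrefix("'").stripSuffix("'")
  (functionName, classAlias)
}
```

Changing the parser rule to emit a single `TOK_TABNAME`-style node for the function name would make this reassembly unnecessary.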





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-08 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55332214
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -64,6 +85,53 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
 val tableIdent = extractTableIdent(nameParts)
 RefreshTable(tableIdent)
 
+  case Token("TOK_CREATEDATABASE", Token(databaseName, Nil) :: createDatabaseArgs) =>
+val Seq(
+  allowExisting,
--- End diff --

Isn't `allowExisting` the exact opposite of `IF NOT EXISTS`? I am asking because I have a similar case in my PR.
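Judging from the quoted parser code, the flag is derived directly from the optional clause, so it has the same polarity as `IF NOT EXISTS` rather than the opposite. A sketch (hypothetical helper name):

```scala
// Hypothetical sketch: `allowExisting` is true exactly when the
// TOK_IFNOTEXISTS clause was present, i.e. an already-existing database
// is tolerated instead of raising an error.
def deriveAllowExisting(ifNotExistsClause: Option[String]): Boolean =
  ifNotExistsClause.isDefined
```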





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55324768
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.Logging
+import org.apache.spark.sql.{Row, SQLContext}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
+import org.apache.spark.sql.execution.datasources.BucketSpec
+import org.apache.spark.sql.types._
+
+
+/**
+ * A DDL command expected to be run in the underlying system without Spark parsing the
+ * query text.
+ */
+abstract class NativeDDLCommands(val sql: String) extends RunnableCommand {
+
+  override def run(sqlContext: SQLContext): Seq[Row] = {
+sqlContext.runNativeSql(sql)
+  }
+
+  override val output: Seq[Attribute] = {
+Seq(AttributeReference("result", StringType, nullable = false)())
+  }
+
+}
+
+case class CreateDatabase(
+databaseName: String,
+allowExisting: Boolean,
+path: Option[String],
+comment: Option[String],
+props: Map[String, String])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class CreateFunction(
+functionName: String,
+alias: String,
+resourcesMap: Map[String, String],
+isTemp: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableRename(
+tableName: TableIdentifier,
+renameTableName: TableIdentifier)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSetProperties(
+tableName: TableIdentifier,
+setProperties: Map[String, Option[String]])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableDropProperties(
+tableName: TableIdentifier,
+dropProperties: Map[String, Option[String]],
+allowExisting: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSerDeProperties(
+tableName: TableIdentifier,
+serdeClassName: Option[String],
+serdeProperties: Option[Map[String, Option[String]]],
+partition: Option[Map[String, Option[String]]])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableStoreProperties(
+tableName: TableIdentifier,
+buckets: Option[BucketSpec],
+// TODO: use `clustered` and `sorted` instead for simplicity
+noClustered: Boolean,
+noSorted: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSkewed(
+tableName: TableIdentifier,
+skewedCols: Seq[String],
+skewedValues: Seq[Seq[String]],
+storedAsDirs: Boolean,
+notSkewed: Boolean,
+// TODO: what??
+notStoredAsDirs: Boolean)(sql: String)
--- End diff --

`notStoredAsDirs` is used when the command is `ALTER TABLE table_name NOT STORED AS DIRECTORIES`. When it is true, `storedAsDirs` is of course false.

`storedAsDirs` is used together with the skewed columns and values.

I think we can use just one variable for both.

When the command is `ALTER TABLE table_name NOT SKEWED`, the values of `notStoredAsDirs` and `storedAsDirs` don't matter.




[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55324206
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.Logging
+import org.apache.spark.sql.{Row, SQLContext}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
+import org.apache.spark.sql.execution.datasources.BucketSpec
+import org.apache.spark.sql.types._
+
+
+/**
+ * A DDL command expected to be run in the underlying system without Spark parsing the
+ * query text.
+ */
+abstract class NativeDDLCommands(val sql: String) extends RunnableCommand {
+
+  override def run(sqlContext: SQLContext): Seq[Row] = {
+sqlContext.runNativeSql(sql)
+  }
+
+  override val output: Seq[Attribute] = {
+Seq(AttributeReference("result", StringType, nullable = false)())
+  }
+
+}
+
+case class CreateDatabase(
+databaseName: String,
+allowExisting: Boolean,
+path: Option[String],
+comment: Option[String],
+props: Map[String, String])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class CreateFunction(
+functionName: String,
+alias: String,
+resourcesMap: Map[String, String],
+isTemp: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableRename(
+tableName: TableIdentifier,
+renameTableName: TableIdentifier)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSetProperties(
+tableName: TableIdentifier,
+setProperties: Map[String, Option[String]])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableDropProperties(
+tableName: TableIdentifier,
+dropProperties: Map[String, Option[String]],
+allowExisting: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSerDeProperties(
+tableName: TableIdentifier,
+serdeClassName: Option[String],
+serdeProperties: Option[Map[String, Option[String]]],
+partition: Option[Map[String, Option[String]]])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableStoreProperties(
+tableName: TableIdentifier,
+buckets: Option[BucketSpec],
+// TODO: use `clustered` and `sorted` instead for simplicity
+noClustered: Boolean,
+noSorted: Boolean)(sql: String)
--- End diff --

Just because the corresponding tokens are `TOK_NOT_CLUSTERED` and `TOK_NOT_SORTED`. We can use positive names here.
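A sketch of the suggested flip to positive names (hypothetical field and helper names, not part of the patch):

```scala
// Hypothetical sketch: derive positive `clustered`/`sorted` flags from the
// negative parser tokens, as suggested in the review.
case class StoreProps(clustered: Boolean, sorted: Boolean)

def storePropsFromTokens(tokens: Set[String]): StoreProps =
  StoreProps(
    clustered = !tokens.contains("TOK_NOT_CLUSTERED"),
    sorted = !tokens.contains("TOK_NOT_SORTED"))
```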





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-193637484
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-193637485
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52630/
Test PASSed.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-193637270
  
**[Test build #52630 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52630/consoleFull)** for PR 11573 at commit [`a663b5c`](https://github.com/apache/spark/commit/a663b5ccca7718b41df8b8ede462ac6c9e0b8e8f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-193610715
  
Note: The only changes I made on top of #11048 are addressing the outstanding comments in that patch and some minor cleanups. It's entirely possible that there are still things missing or incorrect, given that the original patch was not reviewed completely yet.

@hvanhovell @yhuai PTAL.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11573#issuecomment-193609366
  
**[Test build #52630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52630/consoleFull)** for PR 11573 at commit [`a663b5c`](https://github.com/apache/spark/commit/a663b5ccca7718b41df8b8ede462ac6c9e0b8e8f).





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55316476
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.Logging
+import org.apache.spark.sql.{Row, SQLContext}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
+import org.apache.spark.sql.execution.datasources.BucketSpec
+import org.apache.spark.sql.types._
+
+
+/**
+ * A DDL command expected to be run in the underlying system without Spark parsing the
+ * query text.
+ */
+abstract class NativeDDLCommands(val sql: String) extends RunnableCommand {
+
+  override def run(sqlContext: SQLContext): Seq[Row] = {
+sqlContext.runNativeSql(sql)
+  }
+
+  override val output: Seq[Attribute] = {
+Seq(AttributeReference("result", StringType, nullable = false)())
+  }
+
+}
+
+case class CreateDatabase(
+databaseName: String,
+allowExisting: Boolean,
+path: Option[String],
+comment: Option[String],
+props: Map[String, String])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class CreateFunction(
+functionName: String,
+alias: String,
+resourcesMap: Map[String, String],
+isTemp: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableRename(
+tableName: TableIdentifier,
+renameTableName: TableIdentifier)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSetProperties(
+tableName: TableIdentifier,
+setProperties: Map[String, Option[String]])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableDropProperties(
+tableName: TableIdentifier,
+dropProperties: Map[String, Option[String]],
+allowExisting: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSerDeProperties(
+tableName: TableIdentifier,
+serdeClassName: Option[String],
+serdeProperties: Option[Map[String, Option[String]]],
+partition: Option[Map[String, Option[String]]])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableStoreProperties(
+tableName: TableIdentifier,
+buckets: Option[BucketSpec],
+// TODO: use `clustered` and `sorted` instead for simplicity
+noClustered: Boolean,
+noSorted: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSkewed(
+tableName: TableIdentifier,
+skewedCols: Seq[String],
+skewedValues: Seq[Seq[String]],
+storedAsDirs: Boolean,
+notSkewed: Boolean,
+// TODO: what??
+notStoredAsDirs: Boolean)(sql: String)
--- End diff --

@viirya I'm really confused about this flag. We have another one called 
`storedAsDirs` and this one says `notStoredAsDirs`. It would seem that one is 
always the opposite of the other, but that's not actually the case in tests, 
e.g.

```
// DDLCommandSuite, "alter table: skewed"
val expected3 = AlterTableSkewed(
  TableIdentifier("table_name", None),
  Seq("dt", "country"),
  Seq(List("2008-08-08", "us"), List("2009-09-09", "uk")),
  storedAsDirs = false,
  notSkewed = false,
  notStoredAsDirs = false)(sql3)
```
seems contradictory?
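One way to make the contradictory combinations unrepresentable (a sketch with hypothetical names, not part of this patch) is to model the mutually exclusive alternatives as an ADT instead of three independent booleans:

```scala
// Hypothetical ADT: each ALTER TABLE ... SKEWED variant becomes its own
// case, so flags like notSkewed/notStoredAsDirs cannot contradict each other.
sealed trait SkewSpec
case object NotSkewed extends SkewSpec
case object NotStoredAsDirectories extends SkewSpec
case class Skewed(
    cols: Seq[String],
    values: Seq[Seq[String]],
    storedAsDirs: Boolean) extends SkewSpec
```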




[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11573#discussion_r55316403
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.Logging
+import org.apache.spark.sql.{Row, SQLContext}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
+import org.apache.spark.sql.execution.datasources.BucketSpec
+import org.apache.spark.sql.types._
+
+
+/**
+ * A DDL command expected to be run in the underlying system without Spark parsing the
+ * query text.
+ */
+abstract class NativeDDLCommands(val sql: String) extends RunnableCommand {
+
+  override def run(sqlContext: SQLContext): Seq[Row] = {
+sqlContext.runNativeSql(sql)
+  }
+
+  override val output: Seq[Attribute] = {
+Seq(AttributeReference("result", StringType, nullable = false)())
+  }
+
+}
+
+case class CreateDatabase(
+databaseName: String,
+allowExisting: Boolean,
+path: Option[String],
+comment: Option[String],
+props: Map[String, String])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class CreateFunction(
+functionName: String,
+alias: String,
+resourcesMap: Map[String, String],
+isTemp: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableRename(
+tableName: TableIdentifier,
+renameTableName: TableIdentifier)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSetProperties(
+tableName: TableIdentifier,
+setProperties: Map[String, Option[String]])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableDropProperties(
+tableName: TableIdentifier,
+dropProperties: Map[String, Option[String]],
+allowExisting: Boolean)(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableSerDeProperties(
+tableName: TableIdentifier,
+serdeClassName: Option[String],
+serdeProperties: Option[Map[String, Option[String]]],
+partition: Option[Map[String, Option[String]]])(sql: String)
+  extends NativeDDLCommands(sql) with Logging
+
+case class AlterTableStoreProperties(
+tableName: TableIdentifier,
+buckets: Option[BucketSpec],
+// TODO: use `clustered` and `sorted` instead for simplicity
+noClustered: Boolean,
+noSorted: Boolean)(sql: String)
--- End diff --

@viirya was there any reason why these have to be negative? It's much 
easier to understand if it's positive, i.e. `clustered` and `sorted`.





[GitHub] spark pull request: [SPARK-13139][SQL] Parse Hive DDL commands our...

2016-03-07 Thread andrewor14
GitHub user andrewor14 opened a pull request:

https://github.com/apache/spark/pull/11573

[SPARK-13139][SQL] Parse Hive DDL commands ourselves

## What changes were proposed in this pull request?

This patch is ported over from @viirya's changes in #11048. Currently for most DDLs we just pass the query text directly to Hive. Instead, we should parse these commands ourselves and in the future (not part of this patch) use the `HiveCatalog` to process these DDLs. This is a precursor to merging `SQLContext` and `HiveContext`.

Note: As of this patch we still pass the query text to Hive. The difference 
is that we now parse the commands ourselves so in the future we can just use 
our own catalog.

## How was this patch tested?

Jenkins, plus the new `DDLCommandSuite`, which comprises about 40% of the changes here.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andrewor14/spark parser-plus-plus

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11573.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11573


commit fc3c1684ad8e24ab9b05f0f7e02659ea2e365ebd
Author: Andrew Or 
Date:   2016-03-04T22:20:07Z

Move things into new ParserUtils object

commit adcb561a6bcb91a946448b8a5601155c9d714675
Author: Andrew Or 
Date:   2016-03-07T22:40:48Z

Merge branch 'master' of github.com:apache/spark into parser-plus-plus

commit 010afddf40d776b3009cee98057e74d499c45012
Author: Andrew Or 
Date:   2016-03-07T23:56:09Z

Port over viirya's changes in #11048

commit 007907460d72a3aa82b222567c08589c62efb614
Author: Andrew Or 
Date:   2016-03-08T04:43:31Z

Address comments from #11408 + fix style

commit 02de9b771cc74abb6f1971081411bd6d8c4c4b5f
Author: Andrew Or 
Date:   2016-03-08T04:45:43Z

Merge branch 'master' of github.com:apache/spark into parser-plus-plus

Conflicts:

sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala

commit 3766f83bb5551c0fd978117c3f41efb99257c984
Author: Andrew Or 
Date:   2016-03-08T04:49:22Z

Minor fixes



