[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user DazhuangSu commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r132893104 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,55 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def isRangeComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(isRangeComparison)) { + val partitionSet = specs.flatMap { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)).map(_.spec) +if (partitions.isEmpty && !ifExists) { + throw new AnalysisException(s"There is no partition for ${spec.sql}") +} +partitions + }.distinct + catalog.dropPartitions( +table.identifier, partitionSet, ignoreIfNotExists = ifExists, purge = purge) +} else { + val normalizedSpecs = specs.map { expr => +val spec = splitConjunctivePredicates(expr).map { + case BinaryComparison(AttributeReference(name, _, _, _), right) => name -> right.toString +}.toMap +PartitioningUtils.normalizePartitionSpec( + spec, + table.partitionColumnNames, + table.identifier.quotedString, + resolver) + } + catalog.dropPartitions( +table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +} --- End diff -- The function named isRangeComparison treats BinaryComparison and EqualTo as 2 different scenarios. For BinaryComparison, it lists partitions by filter which are defined by expressions. But for EqualTo, it gets TablePartitionSpec through PartitioningUtils.normalizePartitionSpec using spec(Map[String, String]). Why donât we just treat EqualTo as a special case for BinaryComparison and let catalog.listPartitionsByFilter do the work to get TablePartitionSpec. In my test, this can solve the problem when using int literal after patching this pr. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15704 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87914133 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -243,7 +243,7 @@ partitionSpec ; partitionVal -: identifier (EQ constant)? +: expression --- End diff -- It's removed now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87910722 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,58 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def isRangeComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(isRangeComparison)) { + val partitionSet = scala.collection.mutable.Set.empty[CatalogTablePartition] + specs.foreach { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) +if (partitions.nonEmpty) { + partitionSet ++= partitions +} else if (!ifExists) { + throw new AnalysisException(s"There is no partition for ${spec.sql}") +} + } + catalog.dropPartitions(table.identifier, partitionSet.map(_.spec).toSeq, +ignoreIfNotExists = ifExists, purge = purge) +} else { + val normalizedSpecs = specs.map { expr => +val spec = splitConjunctivePredicates(expr).map { + case BinaryComparison(left, right) => --- End diff -- Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87910682 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -215,8 +215,14 @@ case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] { if (overwrite.enabled) { val deletedPartitions = initialMatchingPartitions.toSet -- updatedPartitions if (deletedPartitions.nonEmpty) { + import org.apache.spark.sql.catalyst.expressions._ + val expressions = deletedPartitions.map { specs => +specs.map { case (key, value) => + EqualTo(AttributeReference(key, StringType)(), Literal.create(value, StringType)) +}.reduceLeft(org.apache.spark.sql.catalyst.expressions.And) --- End diff -- Yep. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87892226 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -215,8 +215,14 @@ case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] { if (overwrite.enabled) { val deletedPartitions = initialMatchingPartitions.toSet -- updatedPartitions if (deletedPartitions.nonEmpty) { + import org.apache.spark.sql.catalyst.expressions._ + val expressions = deletedPartitions.map { specs => +specs.map { case (key, value) => + EqualTo(AttributeReference(key, StringType)(), Literal.create(value, StringType)) +}.reduceLeft(org.apache.spark.sql.catalyst.expressions.And) --- End diff -- just `And`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87895663 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,58 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def isRangeComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(isRangeComparison)) { + val partitionSet = scala.collection.mutable.Set.empty[CatalogTablePartition] + specs.foreach { spec => --- End diff -- use `flatMap` and `distinct`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87896496 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,108 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country = '$country', quarter = '$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + // According to the declarative partition spec definitions, this drops the union of target + // partitions without exceptions. Hive raises exceptions because it handle them sequentially. --- End diff -- NIT: handle -> handles --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87895874 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,58 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def isRangeComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(isRangeComparison)) { + val partitionSet = scala.collection.mutable.Set.empty[CatalogTablePartition] + specs.foreach { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) +if (partitions.nonEmpty) { + partitionSet ++= partitions +} else if (!ifExists) { + throw new AnalysisException(s"There is no partition for ${spec.sql}") +} + } + catalog.dropPartitions(table.identifier, partitionSet.map(_.spec).toSeq, +ignoreIfNotExists = ifExists, purge = purge) +} else { + val normalizedSpecs = specs.map { expr => +val spec = splitConjunctivePredicates(expr).map { + case BinaryComparison(left, right) => --- End diff -- Use pattern match on left? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87896147 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -243,7 +243,7 @@ partitionSpec ; partitionVal -: identifier (EQ constant)? +: expression --- End diff -- You could also remove the `partitionVal` rule --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87688739 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,66 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def isRangeComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(isRangeComparison)) { + if (!ifExists) { +// Prevent query execution if one of partition specs is invalid. +specs.foreach { spec => + val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) --- End diff -- Yep, correct! Thank you so much, @viirya . Then, I'll update the PR like that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87688495 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,66 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def isRangeComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(isRangeComparison)) { + if (!ifExists) { +// Prevent query execution if one of partition specs is invalid. +specs.foreach { spec => + val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) --- End diff -- I think it is good. Actually the partitions dropped in the end should be the same. The difference is only if an exception is thrown or not, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87684919 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,111 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country = '$country', quarter = '$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '3')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` --- End diff -- Thanks. I fixed that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87683697 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,111 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country = '$country', quarter = '$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '3')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` --- End diff -- Incorrect comment: `PARTITION (quarter <= '2')` -> `PARTITION (quarter <= '3')`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87683492 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,66 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def isRangeComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(isRangeComparison)) { + if (!ifExists) { +// Prevent query execution if one of partition specs is invalid. +specs.foreach { spec => + val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) --- End diff -- Can we not list partitions twice? It might be time consuming task. We can keep listed partitions and drop it in later block. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87559593 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,102 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '2')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` + // already removes all partitions. --- End diff -- I'll update PR like the above. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87558776 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,102 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '2')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` + // already removes all partitions. --- End diff -- For Case 1, we had better prevent query execution. I think that is consistent with equal only spec. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87556548 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,102 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '2')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` + // already removes all partitions. --- End diff -- Case 1 is also interesting. Because the first spec is valid but the second is not. Should we run spec 1 and throw exception for spec 2? Or disallow all of them? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87555750 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,102 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '2')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` + // already removes all partitions. --- End diff -- I mean case 2. Current approach will drop the first spec successfully and then the second spec will fail. I am not sure if this behavior is consistent with Hive or not. For the case of dropping partitions with only equal to spec, the current behavior is collecting distinct matching partitions. So the overlapping specs will not be a problem. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87555430 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,102 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '2')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` + // already removes all partitions. --- End diff -- To prevent Case 1, we need to run `listPartitionsByFilter` twice. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87554964 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,102 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '2')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` + // already removes all partitions. --- End diff -- There are two cases, so I confused them. 1. If there is only one partition `quarter=3`, `PARTITION (quater <= '2')` should not be allowed here. 2. There is only one partition `quarter=1', `PARTITION (quarter <= 4)` will succeed and `PARTITION (quater <= '2')` will raise exception on runtime. You mean Case 1, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87554053 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,102 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '2')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` + // already removes all partitions. --- End diff -- Hi, @viirya . I knew that discussion, but I think this is consistent. The previous code checks the partition existence before dropping. Here, before dropping, both `quarter <= 4` and `quater <= '2'` partition spec is valid because they have their target partitions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87549789 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -225,6 +226,102 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '1')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 2), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: Nil) + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (quarter <= 4), PARTITION (quarter <= '2')") + }.getMessage + // `PARTITION (quarter <= '2')` should raises exceptions because `PARTITION (quarter <= 4)` + // already removes all partitions. --- End diff -- As we have discussed before, this behavior may not the same as ALTER TABLE DROP PARTITION with only equal to spec. Should we make them consistent? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87549043 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -226,6 +227,63 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") --- End diff -- I added the case by updating the existing testcases. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r87546041 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -226,6 +227,63 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") --- End diff -- To add that, we should make another testcases because the remaining partitions are not enough to test that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86393914 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -243,7 +243,7 @@ partitionSpec ; partitionVal -: identifier (EQ constant)? +: identifier (comparisonOperator constant)? --- End diff -- @dongjoon-hyun that is an interesting point. We might need to take a step back here. The only thing I want to prevent from happening is that we are going to recreate an expression hierarchy, and not reuse what is already there. That being said, things must remain concise. So if it takes a large amount of time or code, then we should reconsider. In this case I would like to try the expression approach first if you don't mind. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86389182 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { --- End diff -- Perfect! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86386317 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(hasNonEqualToComparison)) { + specs.foreach { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) +if (partitions.nonEmpty) { + catalog.dropPartitions( +table.identifier, partitions.map(_.spec), ignoreIfNotExists = ifExists, purge = purge) +} else if (!ifExists) { + throw new AnalysisException(s"There is no partition for $spec.") +} + } +} else { + val normalizedSpecs = specs.map { expr => +val spec = splitConjunctivePredicates(expr) + .map(_.asInstanceOf[BinaryComparison]) --- End diff -- Yep. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86385850 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,41 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { --- End diff -- Yep. I will add that. Also, add similar one to `visitPartitionSpec`, too. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86385701 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { --- End diff -- isRangeComparison? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86385537 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -243,7 +243,7 @@ partitionSpec ; partitionVal -: identifier (EQ constant)? +: identifier (comparisonOperator constant)? --- End diff -- You mean that we change this as an expression and validate differently case by case. For example, - `ADD PARTITION` should validate `column=value`. - `DROP PARTITION` should validate `column comparisonOperator constant` except NSEQ. Did I understand correctly? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86377116 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(hasNonEqualToComparison)) { + specs.foreach { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) +if (partitions.nonEmpty) { + catalog.dropPartitions( +table.identifier, partitions.map(_.spec), ignoreIfNotExists = ifExists, purge = purge) +} else if (!ifExists) { + throw new AnalysisException(s"There is no partition for $spec.") +} + } +} else { + val normalizedSpecs = specs.map { expr => +val spec = splitConjunctivePredicates(expr) + .map(_.asInstanceOf[BinaryComparison]) --- End diff -- Just use a pattern match here! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86378445 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => --- End diff -- > rant: > this code here to me is evidence that we should actually analyze these commands. The analyzer should throw out invalid commands, instead of doing it all during run(). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86376440 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,41 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { --- End diff -- Modify this. Let the AST builder make the expression(s) and validate them here. You can also introduce the AttributeReference here. For example: ```scala val parts = ctx.partitionVal.asScala.map { pVal => expression(pVal) match { case cmp @ BinaryComparison(UnresolvedAttribute(name), constant: Literal) => cmp.withNewChildren(Seq(AttributeReference(name, StringType)(), constant)) case _ => throw new ParseException(..., ctx) } } ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86374321 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -243,7 +243,7 @@ partitionSpec ; partitionVal -: identifier (EQ constant)? +: identifier (comparisonOperator constant)? --- End diff -- Just make this an expression and check if expressions makes sense in the `AstBuilder`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86296370 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,40 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { +val parts = ctx.partitionVal.asScala.map { pVal => + val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + validate(operator.nonEmpty && !operator.get.equals("<=>"), +"'<=>' operator is not allowed in partition specification.", ctx) --- End diff -- Ah. I missed this. Move the check of "<=>" to match should avoid this... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86296115 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(hasNonEqualToComparison)) { + specs.foreach { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) +if (partitions.nonEmpty) { + catalog.dropPartitions( +table.identifier, partitions.map(_.spec), ignoreIfNotExists = ifExists, purge = purge) +} else if (!ifExists) { + throw new AnalysisException(s"There is no partition for $spec.") --- End diff -- https://github.com/apache/hive/blob/345353c0ea5d3ddda9f6d89cbf8cd0e92726fcb6/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2288 I think it should be "Partition or table doesn't exist." --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86295656 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(hasNonEqualToComparison)) { + specs.foreach { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) +if (partitions.nonEmpty) { + catalog.dropPartitions( +table.identifier, partitions.map(_.spec), ignoreIfNotExists = ifExists, purge = purge) +} else if (!ifExists) { + throw new AnalysisException(s"There is no partition for $spec.") --- End diff -- Please check the Hive's error message and maybe we can improve this --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86292070 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,40 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { +val parts = ctx.partitionVal.asScala.map { pVal => + val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + validate(operator.nonEmpty && !operator.get.equals("<=>"), +"'<=>' operator is not allowed in partition specification.", ctx) --- End diff -- `quater` has no operator. So, it will not reach here due to `if (operator.isDefined) {`. To make it sure, I'll add the testcase. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86291888 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -193,6 +193,9 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { ctx: PartitionSpecContext): Map[String, Option[String]] = withOrigin(ctx) { val parts = ctx.partitionVal.asScala.map { pVal => val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + validate(operator.isEmpty || operator.get.equals("="), +"Only '=' operator is allowed for this partition specification.", ctx) --- End diff -- Please improve the message to reflect the first case `operator.isEmpty`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86291769 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,40 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { +val parts = ctx.partitionVal.asScala.map { pVal => + val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + validate(operator.nonEmpty && !operator.get.equals("<=>"), +"'<=>' operator is not allowed in partition specification.", ctx) --- End diff -- Please add the above test case into the negative test cases. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86291653 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,40 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { +val parts = ctx.partitionVal.asScala.map { pVal => + val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + validate(operator.nonEmpty && !operator.get.equals("<=>"), +"'<=>' operator is not allowed in partition specification.", ctx) --- End diff -- Actually, this is wrong. ```Scala sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter)") ``` The above statement also matches this case, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86291176 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined --- End diff -- Sorry, I deleted my previous comment. I just realized it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86291122 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -29,9 +29,10 @@ import org.apache.hadoop.mapred.{FileInputFormat, JobConf} import org.apache.spark.sql.{AnalysisException, Row, SparkSession} import org.apache.spark.sql.catalyst.TableIdentifier import org.apache.spark.sql.catalyst.analysis.Resolver -import org.apache.spark.sql.catalyst.catalog.{CatalogDatabase, CatalogTable, CatalogTablePartition, CatalogTableType, SessionCatalog} +import org.apache.spark.sql.catalyst.catalog._ import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec -import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference} +import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference, BinaryComparison, + EqualTo, Expression, PredicateHelper} --- End diff -- Yep. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86291115 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined --- End diff -- There is `And` expression. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86291078 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined --- End diff -- If we just read the function name, the logics should be like: ```Scala expr.find(e => !e.isInstanceOf[EqualTo]).isDefined ``` Right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86290924 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(hasNonEqualToComparison)) { + specs.foreach { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) +if (partitions.nonEmpty) { + catalog.dropPartitions( +table.identifier, partitions.map(_.spec), ignoreIfNotExists = ifExists, purge = purge) +} else if (!ifExists) { + throw new AnalysisException(s"There is no partition for ${spec}.") --- End diff -- Yep. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86290826 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,40 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { +val parts = ctx.partitionVal.asScala.map { pVal => + val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + validate(operator.nonEmpty && !operator.get.equals("<=>"), +"'<=>' operator is not allowed in partition specification.", ctx) --- End diff -- Yep. Moved. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86290671 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -29,9 +29,10 @@ import org.apache.hadoop.mapred.{FileInputFormat, JobConf} import org.apache.spark.sql.{AnalysisException, Row, SparkSession} import org.apache.spark.sql.catalyst.TableIdentifier import org.apache.spark.sql.catalyst.analysis.Resolver -import org.apache.spark.sql.catalyst.catalog.{CatalogDatabase, CatalogTable, CatalogTablePartition, CatalogTableType, SessionCatalog} +import org.apache.spark.sql.catalyst.catalog._ import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec -import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference} +import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference, BinaryComparison, + EqualTo, Expression, PredicateHelper} --- End diff -- ```Scala import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference, BinaryComparison} import org.apache.spark.sql.catalyst.expressions.{EqualTo, Expression, PredicateHelper} ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86289128 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,56 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasNonEqualToComparison(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.foreach { expr => + expr.references.foreach { attr => +if (!table.partitionColumnNames.exists(resolver(_, attr.name))) { + throw new AnalysisException(s"${attr.name} is not a valid partition column " + +s"in table ${table.identifier.quotedString}.") +} + } } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(hasNonEqualToComparison)) { + specs.foreach { spec => +val partitions = catalog.listPartitionsByFilter(table.identifier, Seq(spec)) +if (partitions.nonEmpty) { + catalog.dropPartitions( +table.identifier, partitions.map(_.spec), ignoreIfNotExists = ifExists, purge = purge) +} else if (!ifExists) { + throw new AnalysisException(s"There is no partition for ${spec}.") --- End diff -- Nit: `${spec}` -> `$spec` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86288958 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -226,6 +227,79 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR', quarter > '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=CA/quarter=1") :: +Row("country=CA/quarter=2") :: +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR'), PARTITION (quarter <= '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= 3), PARTITION (quarter >= '4')") + checkAnswer(sql("SHOW PARTITIONS sales"), Nil) +} + } + + test("SPARK-17732: Error handling for drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + val m = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (unknown = 'KR')") + }.getMessage + assert(m.contains("unknown is not a valid partition column in table")) + + val m2 = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (unknown < 'KR')") + }.getMessage + assert(m2.contains("unknown is not a valid partition column in table")) + + val m3 = intercept[AnalysisException] { +sql("ALTER TABLE sales DROP PARTITION (unknown <=> 'KR')") + }.getMessage + assert(m3.contains("'<=>' operator is not allowed in partition specification")) --- End diff -- One more negative case? How about `unknown <=> upper('KR')`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86288781 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,40 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { +val parts = ctx.partitionVal.asScala.map { pVal => + val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + validate(operator.nonEmpty && !operator.get.equals("<=>"), +"'<=>' operator is not allowed in partition specification.", ctx) --- End diff -- Why not adding it inside the following `match block`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86282010 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -226,6 +227,63 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") --- End diff -- I mean something like PARTITON (quarter <= '2'), PARTITION (quarter >= '4'). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86242187 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -193,6 +193,9 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { ctx: PartitionSpecContext): Map[String, Option[String]] = withOrigin(ctx) { val parts = ctx.partitionVal.asScala.map { pVal => val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + validate(operator.isEmpty || operator.get.equals("="), +"Only '=' partition specification is allowed.", ctx) --- End diff -- I revised it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86239543 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,53 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasComplexExpr(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.flatMap(splitConjunctivePredicates).map { --- End diff -- Yep. That's much better. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86237888 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,38 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { +val parts = ctx.partitionVal.asScala.map { pVal => + val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + if (operator.isDefined) { +val left = AttributeReference(name, DataTypes.StringType)() +val right = expression(pVal.constant) +val operator = pVal.comparisonOperator().getChild(0).asInstanceOf[TerminalNode] +operator.getSymbol.getType match { + case SqlBaseParser.EQ => +EqualTo(left, right) + case SqlBaseParser.NEQ | SqlBaseParser.NEQJ => +Not(EqualTo(left, right)) + case SqlBaseParser.LT => +LessThan(left, right) + case SqlBaseParser.LTE => +LessThanOrEqual(left, right) + case SqlBaseParser.GT => +GreaterThan(left, right) + case SqlBaseParser.GTE => +GreaterThanOrEqual(left, right) --- End diff -- Yep. Validation is added. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86238278 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,53 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasComplexExpr(expr: Expression): Boolean = { --- End diff -- Maybe, `hasNonEqualToComparison` is better? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86076296 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -226,6 +227,63 @@ class HiveDDLSuite } } + test("SPARK-17732: Drop partitions by filter") { +withTable("sales") { + sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") + + for (country <- Seq("US", "CA", "KR")) { +for (quarter <- 1 to 4) { + sql(s"ALTER TABLE sales ADD PARTITION (country='$country', quarter='$quarter')") +} + } + + sql("ALTER TABLE sales DROP PARTITION (country < 'KR')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=1") :: +Row("country=KR/quarter=2") :: +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=1") :: +Row("country=US/quarter=2") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (quarter <= '2')") + checkAnswer(sql("SHOW PARTITIONS sales"), +Row("country=KR/quarter=3") :: +Row("country=KR/quarter=4") :: +Row("country=US/quarter=3") :: +Row("country=US/quarter=4") :: Nil) + + sql("ALTER TABLE sales DROP PARTITION (country='KR', quarter='4')") + sql("ALTER TABLE sales DROP PARTITION (country='US', quarter='3')") --- End diff -- Let's add a test for dropping multiple partition specs? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86075964 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,53 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasComplexExpr(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.flatMap(splitConjunctivePredicates).map { + case BinaryComparison(AttributeReference(key, _, _, _), _) => +table.partitionColumnNames.find(resolver(_, key)).getOrElse { + throw new AnalysisException( +s"$key is not a valid partition column in table ${table.identifier.quotedString}.") +} } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(hasComplexExpr)) { + val partitions = catalog.listPartitionsByFilter(table.identifier, specs) + if (partitions.nonEmpty) { +catalog.dropPartitions( + table.identifier, partitions.map(_.spec), ignoreIfNotExists = ifExists, purge = purge) + } else if (!ifExists) { +throw new AnalysisException(specs.toString) --- End diff -- This might be not clear enough. Add a short error message? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86075872 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,53 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasComplexExpr(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.flatMap(splitConjunctivePredicates).map { --- End diff -- I think we don't need to do `splitConjunctivePredicates`. Just iterates each attribute in every spec expression's `references` and do the following resolving check, should be enough. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86075107 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,53 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasComplexExpr(expr: Expression): Boolean = { --- End diff -- The function name looks confusing. Actually they are not more complex operators, are they? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86074588 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -204,6 +207,38 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging { } /** + * Create a partition filter specification. + */ + def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) { +val parts = ctx.partitionVal.asScala.map { pVal => + val name = pVal.identifier.getText + val operator = Option(pVal.comparisonOperator).map(_.getText) + if (operator.isDefined) { +val left = AttributeReference(name, DataTypes.StringType)() +val right = expression(pVal.constant) +val operator = pVal.comparisonOperator().getChild(0).asInstanceOf[TerminalNode] +operator.getSymbol.getType match { + case SqlBaseParser.EQ => +EqualTo(left, right) + case SqlBaseParser.NEQ | SqlBaseParser.NEQJ => +Not(EqualTo(left, right)) + case SqlBaseParser.LT => +LessThan(left, right) + case SqlBaseParser.LTE => +LessThanOrEqual(left, right) + case SqlBaseParser.GT => +GreaterThan(left, right) + case SqlBaseParser.GTE => +GreaterThanOrEqual(left, right) --- End diff -- Failed to match `SqlBaseParser.NSEQ` might cause runtime error. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r86015221 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -418,27 +419,53 @@ case class AlterTableRenamePartitionCommand( */ case class AlterTableDropPartitionCommand( tableName: TableIdentifier, -specs: Seq[TablePartitionSpec], +specs: Seq[Expression], ifExists: Boolean, purge: Boolean) - extends RunnableCommand { + extends RunnableCommand with PredicateHelper { + + private def hasComplexExpr(expr: Expression): Boolean = { +expr.find(e => e.isInstanceOf[BinaryComparison] && !e.isInstanceOf[EqualTo]).isDefined + } override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val table = catalog.getTableMetadata(tableName) +val resolver = sparkSession.sessionState.conf.resolver DDLUtils.verifyAlterTableType(catalog, table, isView = false) DDLUtils.verifyPartitionProviderIsHive(sparkSession, table, "ALTER TABLE DROP PARTITION") -val normalizedSpecs = specs.map { spec => - PartitioningUtils.normalizePartitionSpec( -spec, -table.partitionColumnNames, -table.identifier.quotedString, -sparkSession.sessionState.conf.resolver) +specs.flatMap(splitConjunctivePredicates).map { + case BinaryComparison(AttributeReference(key, _, _, _), _) => +table.partitionColumnNames.find(resolver(_, key)).getOrElse { + throw new AnalysisException( +s"$key is not a valid partition column in table ${table.identifier.quotedString}.") +} } -catalog.dropPartitions( - table.identifier, normalizedSpecs, ignoreIfNotExists = ifExists, purge = purge) +if (specs.exists(hasComplexExpr)) { --- End diff -- `listPartitionsByFilter` is not supported in `InMemoryCatalog`. So, we should use this only when it is needed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r85984522 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -631,6 +631,42 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat client.dropPartitions(db, table, parts.map(lowerCasePartitionSpec), ignoreIfNotExists, purge) } + override def dropPartitionsByFilter( --- End diff -- Thank you for review, @viirya and @hvanhovell . Sure, no problem. I just thought we need to have this in `ExternalCatalog` before `Catalog Federation ( SPARK-15777 )`. I will remove those stuff. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r85963632 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -631,6 +631,42 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat client.dropPartitions(db, table, parts.map(lowerCasePartitionSpec), ignoreIfNotExists, purge) } + override def dropPartitionsByFilter( --- End diff -- +1 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15704#discussion_r85873381 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -631,6 +631,42 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat client.dropPartitions(db, table, parts.map(lowerCasePartitionSpec), ignoreIfNotExists, purge) } + override def dropPartitionsByFilter( --- End diff -- As we already have `listPartitionsByFilter`, do we need another `dropPartitionsByFilter`? Looks like there are many duplicate codes between them. We can combine `listPartitionsByFilter` and `dropPartitions` do the same thing, instead of adding new API like this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15704: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION sho...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/15704 [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should support comparators ## What changes were proposed in this pull request? This PR aims to support `comparators`, e.g. '<', '<=', '>', '>=', again in Apache Spark 2.0 for backward compatibility. **Spark 1.6.2** ``` scala scala> sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") res0: org.apache.spark.sql.DataFrame = [result: string] scala> sql("ALTER TABLE sales DROP PARTITION (country < 'KR')") res1: org.apache.spark.sql.DataFrame = [result: string] ``` **Spark 2.0** ``` scala scala> sql("CREATE TABLE sales(id INT) PARTITIONED BY (country STRING, quarter STRING)") res0: org.apache.spark.sql.DataFrame = [] scala> sql("ALTER TABLE sales DROP PARTITION (country < 'KR')") org.apache.spark.sql.catalyst.parser.ParseException: mismatched input '<' expecting {')', ','}(line 1, pos 42) ``` After this PR, it's supported. ## How was this patch tested? Pass the Jenkins test with a newly added testcase. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-17732-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15704.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15704 commit 84f2315501b2f34d247e8750d1e01fff6ff9fb55 Author: Dongjoon Hyun Date: 2016-10-31T00:46:49Z [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should support comparators --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org