[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r330373508 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ## @@ -2012,6 +2012,26 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton { } } + test("SPARK-28084 check for case insensitive property of partition column name in load command") { +withTempDir { dir => + val path = dir.toURI.toString.stripSuffix("/") + val dirPath = dir.getAbsoluteFile + Files.append("1", new File(dirPath, "part-r-11"), StandardCharsets.UTF_8) + withTable("part_table") { +withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { + sql( +""" + |CREATE TABLE part_table (c STRING) + |PARTITIONED BY (d STRING) +""".stripMargin) + sql("LOAD DATA LOCAL INPATH '$path/part-r-11' " + Review comment: yeah, I hadn't noticed that the interpolation wasn't happening in the statement :), corrected it now 👍 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r330373167 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ## @@ -2012,6 +2012,26 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton { } } + test("SPARK-28084 check for case insensitive property of partition column name in load command") { +withTempDir { dir => + val path = dir.toURI.toString.stripSuffix("/") + val dirPath = dir.getAbsoluteFile + Files.append("1", new File(dirPath, s"part-r-11"), StandardCharsets.UTF_8) + withTable("part_table") { +withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { + sql( +""" + |CREATE TABLE part_table (c STRING) + |PARTITIONED BY (d STRING) +""".stripMargin) + sql(s"LOAD DATA LOCAL INPATH '$path/part-r-11' " + +s"INTO TABLE part_table PARTITION(D ='1')") Review comment: yeah, I hadn't noticed that the interpolation wasn't happening in the statement :), corrected it now 👍
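The bug flagged in this thread comes down to Scala string interpolation: without the `s` prefix, `$path` in the LOAD statement stays literal text instead of being substituted. A minimal standalone illustration (the object name and sample path are invented for this sketch):

```scala
object InterpolationDemo {
  val path = "/tmp/data"
  // Without the `s` prefix, `$path` is NOT substituted and stays literal text.
  val literal = "LOAD DATA LOCAL INPATH '$path/part-r-11'"
  // With the `s` prefix, `$path` is replaced by the variable's value.
  val interpolated = s"LOAD DATA LOCAL INPATH '$path/part-r-11'"
}
```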
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r330192653 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ## @@ -2012,6 +2012,26 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton { } } + test("SPARK-28084 check for case insensitive property of partition column name in load command") { +withTempDir { dir => + val path = dir.toURI.toString.stripSuffix("/") + val dirPath = dir.getAbsoluteFile + Files.append("1", new File(dirPath, s"part-r-11"), StandardCharsets.UTF_8) Review comment: updated. thanks
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r330192302 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ## @@ -2012,6 +2012,26 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton { } } + test("SPARK-28084 check for case insensitive property of partition column name in load command") { +withTempDir { dir => + val path = dir.toURI.toString.stripSuffix("/") + val dirPath = dir.getAbsoluteFile + Files.append("1", new File(dirPath, s"part-r-11"), StandardCharsets.UTF_8) + withTable("part_table") { +withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { + sql( +""" + |CREATE TABLE part_table (c STRING) + |PARTITIONED BY (d STRING) +""".stripMargin) + sql(s"LOAD DATA LOCAL INPATH '$path/part-r-11' " + +s"INTO TABLE part_table PARTITION(D ='1')") Review comment: updated. thanks
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r330192203 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -278,6 +278,13 @@ case class LoadDataCommand( val catalog = sparkSession.sessionState.catalog val targetTable = catalog.getTableMetadata(table) val tableIdentwithDB = targetTable.identifier.quotedString +val normalizedSpec = partition.map { spec => + PartitioningUtils.normalizePartitionSpec( +spec, +targetTable.partitionColumnNames, +targetTable.identifier.quotedString, Review comment: updated. thanks
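The diff above routes the user-supplied partition spec through `PartitioningUtils.normalizePartitionSpec`. As a hedged illustration of what resolver-based normalization does (this is a simplified sketch, not Spark's actual implementation; the object name, simplified types, and error message are assumptions), each user-supplied key is mapped onto the canonical column name from the table metadata:

```scala
object NormalizeSpecSketch {
  // A resolver decides whether two column names refer to the same column.
  type Resolver = (String, String) => Boolean
  val caseInsensitiveResolution: Resolver = _ equalsIgnoreCase _

  // Rewrite each key of the user-supplied spec to the canonical
  // partition column name from the table metadata, or fail if no
  // partition column matches under the resolver.
  def normalizePartitionSpec(
      spec: Map[String, String],
      partColNames: Seq[String],
      tableName: String,
      resolver: Resolver): Map[String, String] = {
    spec.map { case (key, value) =>
      val normalizedKey = partColNames.find(resolver(_, key)).getOrElse {
        throw new IllegalArgumentException(
          s"$key is not a valid partition column in table $tableName")
      }
      normalizedKey -> value
    }
  }
}
```

Under case-insensitive resolution, a spec like `Map("D" -> "1")` against a table partitioned by `d` normalizes to `Map("d" -> "1")`, which is exactly what lets `PARTITION(D ='1')` in the test above succeed.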
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r313005772 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,44 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) Review comment: @maropu @dongjoon-hyun There is a small problem: DDLParserSuite basically verifies parsing and plan resolution, and nowhere can I find a test case with an execution flow. In my case the resolution happens during LOAD command execution, so I am not sure we should use this suite to verify this use case; please suggest. Thanks
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r312742455 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,44 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) Review comment: > oh, really? > > https://github.com/apache/spark/blob/a3bbc371cbdbcb469a181de8f58f83e193d4f341/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala#L1505 @maropu sorry for the misinterpretation, I moved the test case to DDLParserSuite.scala; please let me know if any further clarification is needed. Thanks for all the valuable inputs.
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r306629137 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,44 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) Review comment: Do you want me to move this test case to DDLParserSuite.scala?
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r305624506 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,44 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) + } +} + } + + test("SPARK-28084 case insensitive property of partition column name in insert command") { +// check for case insensitive property of partition column name in insert command. +withTable("part_table") { + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +sql("CREATE TABLE part_table (price int, qty int) partitioned by (year int, month int)") +sql("INSERT INTO part_table PARTITION(YEar = 2015, month = 1) SELECT 1, 1") +checkAnswer( + sql("SELECT * FROM part_table"), + Row(1, 1, 2015, 1)) + } +} + } + + test("SPARK-28084 case insensitive property of partition column name " + +"in insert command - dynamic partition") { Review comment: Handled the comments; let me know of any further suggestions/clarifications. Thanks for the review.
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r305624456 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,44 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) + } +} + } + + test("SPARK-28084 case insensitive property of partition column name in insert command") { +// check for case insensitive property of partition column name in insert command. +withTable("part_table") { + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +sql("CREATE TABLE part_table (price int, qty int) partitioned by (year int, month int)") +sql("INSERT INTO part_table PARTITION(YEar = 2015, month = 1) SELECT 1, 1") +checkAnswer( + sql("SELECT * FROM part_table"), + Row(1, 1, 2015, 1)) + } +} + } + + test("SPARK-28084 case insensitive property of partition column name " + +"in insert command - dynamic partition") { Review comment: you are right, thanks for the suggestion :)
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r305624432 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,44 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) + } +} + } + + test("SPARK-28084 case insensitive property of partition column name in insert command") { +// check for case insensitive property of partition column name in insert command. +withTable("part_table") { + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +sql("CREATE TABLE part_table (price int, qty int) partitioned by (year int, month int)") +sql("INSERT INTO part_table PARTITION(YEar = 2015, month = 1) SELECT 1, 1") +checkAnswer( + sql("SELECT * FROM part_table"), + Row(1, 1, 2015, 1)) + } +} + } + + test("SPARK-28084 case insensitive property of partition column name " + +"in insert command - dynamic partition") { +// check for case insensitive property of partition column name in insert command. +withTable("part_table") { + val modeConfKey = "hive.exec.dynamic.partition.mode" + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false", modeConfKey -> "nonstrict") { Review comment: fixed
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r305624421 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,44 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) Review comment: > Is this LOAD issue specific to Hive? If not, could you move this test to `DDLParserSuite.scala`? Yes, I think LOAD commands are specific to Hive tables.
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r305624372 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,44 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) Review comment: nit: you don't need collect(). Fixed
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r305484319 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,29 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) + } +} + } + + test("case insensitive property of partition column name in insert command") { +// check for case insensitive property of partition column name in insert command. +withTable("part_table") { + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +sql("CREATE TABLE part_table (price int, qty int) partitioned by (year int, month int)") +sql("INSERT INTO part_table PARTITION(YEar = 2015, month = 1) SELECT 1, 1") Review comment: Added; please let me know whether the added test case is sufficient. Thanks for your valuable suggestions and time.
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r305483970 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ## @@ -290,6 +291,29 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto checkAnswer( sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), sql("SELECT * FROM non_part_table").collect()) + + // check for case insensitive property of partition column name in load command. + withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") { +withInputFile { f => + sql(s"""$loadQuery INPATH "${f.toURI}" INTO TABLE part_table PARTITION(C="1", D="2")""") +} +checkAnswer( + sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"), + sql("SELECT * FROM non_part_table").collect()) + } +} + } + + test("case insensitive property of partition column name in insert command") { Review comment: done
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r304197975 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -296,8 +296,10 @@ case class LoadDataCommand( s"do not match number of partitioned columns in table " + s"(${targetTable.partitionColumnNames.size})") } + val resolver = sparkSession.sessionState.conf.resolver partition.get.keys.foreach { colName => -if (!targetTable.partitionColumnNames.contains(colName)) { +if (!targetTable + .partitionColumnNames.exists(fieldName => resolver(fieldName, colName))) { Review comment: done. thanks
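The diff above replaces exact `contains` matching with an `exists(fieldName => resolver(fieldName, colName))` check, so the validation honors the session's case-sensitivity setting. A minimal standalone sketch of that validation step (simplified types, invented object name, and a hypothetical error message; not Spark's real code):

```scala
object PartitionNameCheck {
  type Resolver = (String, String) => Boolean

  // Validate every user-supplied partition column name against the
  // table's partition columns using the resolver, rather than exact
  // string equality. Throws on the first unmatched name.
  def validate(
      specColNames: Iterable[String],
      tablePartColNames: Seq[String],
      resolver: Resolver): Unit = {
    specColNames.foreach { colName =>
      if (!tablePartColNames.exists(resolver(_, colName))) {
        throw new IllegalArgumentException(
          s"$colName is not a partition column of the table")
      }
    }
  }
}
```

With a case-insensitive resolver, `PARTITION(D ='1')` against a table partitioned by `d` now passes validation instead of failing the old `contains` check.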
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#discussion_r304029971 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ## @@ -879,7 +879,11 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat // columns. Here we Lowercase the column names before passing the partition spec to Hive // client, to satisfy Hive. // scalastyle:off caselocale - orderedPartitionSpec.put(colName.toLowerCase, partition(colName)) + partition.keys.foreach { partColName => +if (partColName.toLowerCase == colName.toLowerCase()) { Review comment: Very valid point, but the problem here is that the partition name string in TablePartitionSpec won't match the column name coming from the table metadata, so when we try to get the partition value from TablePartitionSpec using the table metadata column name we get nothing. This is why I am trying to get the partition value based on the column name specified in TablePartitionSpec.
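The lookup problem described in this comment can be illustrated with a hypothetical sketch (the object and method names are invented): the metadata column name (e.g. `year`) cannot directly index a spec map keyed with a differently-cased name (e.g. `YEar`), so the value is recovered by a case-insensitive scan over the spec's keys, lower-casing the output key to satisfy Hive as the surrounding code comment requires:

```scala
object OrderedSpecSketch {
  // For each metadata partition column, find its value in the
  // user-supplied spec by case-insensitive key comparison, emitting
  // the lower-cased column name for the Hive client.
  def orderSpec(
      partColNames: Seq[String],
      spec: Map[String, String]): Seq[(String, String)] = {
    partColNames.flatMap { colName =>
      spec.collectFirst {
        case (k, v) if k.equalsIgnoreCase(colName) =>
          colName.toLowerCase -> v
      }
    }
  }
}
```

For example, a spec of `Map("YEar" -> "2015", "month" -> "1")` against metadata columns `Seq("year", "month")` yields the values in metadata order under lower-cased keys, whereas a plain `spec(colName)` lookup with `"year"` would throw.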
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
URL: https://github.com/apache/spark/pull/24903#discussion_r304031127

## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ##

@@ -879,7 +879,11 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
       // columns. Here we Lowercase the column names before passing the partition spec to Hive
       // client, to satisfy Hive.
       // scalastyle:off caselocale
-      orderedPartitionSpec.put(colName.toLowerCase, partition(colName))
+      partition.keys.foreach { partColName =>
+        if (partColName.toLowerCase == colName.toLowerCase()) {

Review comment:
> I left a few comments, @sujith71955 . The UT is the most important comment. Could you update your PR, please?

I updated my PR based on the above comments. Thanks, everyone, for your valuable time and great suggestions.
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
URL: https://github.com/apache/spark/pull/24903#discussion_r304030670

## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ##

@@ -290,6 +291,13 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto
     checkAnswer(
       sql("SELECT employeeID, employeeName FROM part_table WHERE c = '2' AND d = '1'"),
       sql("SELECT * FROM non_part_table").collect())
+
+    // check for case insensitive of partition columns while querying.
+    withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+      checkAnswer(
+        sql("SELECT employeeID, employeeName FROM part_table WHERE C = '2' AND D = '1'"),
+        sql("SELECT * FROM non_part_table").collect())
+    }

Review comment: Sorry, that was a mistake :) I missed the LOAD DATA command case, where the partition column name can be specified in a different case from the one used at table creation. Thanks for pointing that out.
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
URL: https://github.com/apache/spark/pull/24903#discussion_r304022398

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ##

@@ -29,7 +29,7 @@
 import org.apache.hadoop.fs.{FileContext, FsConstants, Path}
 import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
 import org.apache.spark.sql.catalyst.TableIdentifier
-import org.apache.spark.sql.catalyst.analysis.{NoSuchPartitionException, UnresolvedAttribute, UnresolvedRelation}
+import org.apache.spark.sql.catalyst.analysis.{NoSuchPartitionException, Resolver, UnresolvedAttribute, UnresolvedRelation}

Review comment: Fixed.
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
URL: https://github.com/apache/spark/pull/24903#discussion_r304015757

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ##

@@ -296,8 +296,10 @@ case class LoadDataCommand(
           s"do not match number of partitioned columns in table " +
           s"(${targetTable.partitionColumnNames.size})")
       }
+      val resolverFunc = sparkSession.sessionState.conf.resolver
       partition.get.keys.foreach { colName =>
-        if (!targetTable.partitionColumnNames.contains(colName)) {
+        if (!targetTable
+          .partitionColumnNames.exists(fieldName => resolverFunc(fieldName, colName))) {

Review comment: Fixed.
[GitHub] [spark] sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
sujith71955 commented on a change in pull request #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
URL: https://github.com/apache/spark/pull/24903#discussion_r304015169

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ##

@@ -296,8 +296,10 @@ case class LoadDataCommand(
           s"do not match number of partitioned columns in table " +
           s"(${targetTable.partitionColumnNames.size})")
       }
+      val resolverFunc = sparkSession.sessionState.conf.resolver

Review comment: Fixed.