[ https://issues.apache.org/jira/browse/SPARK-27020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784047#comment-16784047 ]
Truong Duc Kien commented on SPARK-27020:
-----------------------------------------

Hi, here are the commands to reproduce the issue on my cluster. All commands are executed using spark-sql without any additional parameters.

{code:sql}
create database test_spark;
use test_spark;
create external table test_insert(a int)
  partitioned by (part_a string, part_b string)
  stored as parquet
  location '/apps/spark/warehouse/test_spark.db/test_insert';
{code}

{code:sql}
// OK
> insert into table test_insert partition(part_a='a', part_b='b') values(1);
{code}

{code:sql}
// OK
> insert into table test_insert partition(part_a, part_b) values(2, 'a' , 'b');
..
19/03/05 11:17:29 INFO Hive: New loading path = hdfs://datalake/apps/spark/warehouse/test_spark.db/test_insert/.hive-staging_hive_2019-03-05_11-17-29_547_8053153849357088752-1/-ext-10000/part_a=a/part_b=b with partSpec {part_a=a, part_b=b}
19/03/05 11:17:30 INFO Hive: Loaded 1 partitions
Time taken: 0.71 seconds
...
{code}

{code:sql}
// Not OK
> insert into table test_insert partition(part_a='a', part_b) values (3, 'b');
...
19/03/05 11:19:21 WARN warehouse: Cannot create partition spec from hdfs://datalake/; missing keys [part_a]
19/03/05 11:19:21 WARN FileOperations: Ignoring invalid DP directory hdfs://datalake/apps/spark/warehouse/test_spark.db/test_insert/.hive-staging_hive_2019-03-05_11-19-21_365_800377896579975615-1/-ext-10000/part_b=b
19/03/05 11:19:21 INFO Hive: Loaded 0 partitions
Time taken: 0.466 seconds
...
{code}

> Unable to insert data with partial dynamic partition with Spark & Hive 3
> ------------------------------------------------------------------------
>
> Key: SPARK-27020
> URL: https://issues.apache.org/jira/browse/SPARK-27020
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.2
> Environment: Hortonworks HDP 3.1.0
> Spark 2.3.2
> Hive 3
> Reporter: Truong Duc Kien
> Priority: Major
>
> When inserting data with dynamic partitions, the operation fails if
> not all partitions are dynamic. For example, the query
> {code:sql}
> insert overwrite table t1 partition(part_a='a', part_b) select * from t2
> {code}
> fails with the errors
> {code:xml}
> Cannot create partition spec from hdfs://xxxx/ ; missing keys [part_a]
> Ignoring invalid DP directory <path to staging directory>
> {code}
> On the other hand, if I remove the static value of part_a to make the insert
> fully dynamic, the following query succeeds.
> {code:sql}
> insert overwrite table t1 partition(part_a, part_b) select * from t2
> {code}