[ 
https://issues.apache.org/jira/browse/SPARK-27020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784047#comment-16784047
 ] 

Truong Duc Kien commented on SPARK-27020:
-----------------------------------------

Hi, here are the commands to reproduce the issue on my cluster. All commands were 
executed with spark-sql and no additional parameters.
{code:sql}
create database test_spark;
use test_spark;
create external table test_insert(a int) partitioned by (part_a string, part_b 
string) stored as parquet location 
'/apps/spark/warehouse/test_spark.db/test_insert';
{code}
{code:sql}
-- OK
> insert into table test_insert partition(part_a='a', part_b='b') values(1); 
{code}
{code:sql}
-- OK
> insert into table test_insert partition(part_a, part_b) values(2, 'a' , 'b'); 
..
19/03/05 11:17:29 INFO Hive: New loading path = 
hdfs://datalake/apps/spark/warehouse/test_spark.db/test_insert/.hive-staging_hive_2019-03-05_11-17-29_547_8053153849357088752-1/-ext-10000/part_a=a/part_b=b
 with partSpec {part_a=a, part_b=b}
19/03/05 11:17:30 INFO Hive: Loaded 1 partitions
Time taken: 0.71 seconds
...
{code}
{code:sql}
-- Not OK
> insert into table test_insert partition(part_a='a', part_b) values (3, 'b'); 
...
19/03/05 11:19:21 WARN warehouse: Cannot create partition spec from 
hdfs://datalake/; missing keys [part_a]
19/03/05 11:19:21 WARN FileOperations: Ignoring invalid DP directory 
hdfs://datalake/apps/spark/warehouse/test_spark.db/test_insert/.hive-staging_hive_2019-03-05_11-19-21_365_800377896579975615-1/-ext-10000/part_b=b
19/03/05 11:19:21 INFO Hive: Loaded 0 partitions
Time taken: 0.466 seconds
...
{code}
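Given that the fully dynamic form above loads its partition correctly, one possible workaround (a sketch, not verified beyond the repro above) is to drop the static part_a='a' from the partition spec and supply 'a' as an ordinary value instead:
{code:sql}
-- failing mixed static/dynamic form:
--   insert into table test_insert partition(part_a='a', part_b) values (3, 'b');
-- workaround sketch: make both partition columns dynamic and pass the
-- static value 'a' in the data, matching the fully dynamic form that works:
insert into table test_insert partition(part_a, part_b) values (3, 'a', 'b');
{code}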

> Unable to insert data with partial dynamic partition with Spark & Hive 3
> ------------------------------------------------------------------------
>
>                 Key: SPARK-27020
>                 URL: https://issues.apache.org/jira/browse/SPARK-27020
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.2
>             Environment: Hortonworks HDP 3.1.0
> Spark 2.3.2
> Hive 3
>            Reporter: Truong Duc Kien
>            Priority: Major
>
> When inserting data with dynamic partitioning, the operation fails if some, but 
> not all, of the partition columns are dynamic. For example:
> The query
> {code:sql}
> insert overwrite table t1 partition (part_a='a', part_b) select * from t2
> {code}
> fails with the errors
> {code}
> Cannot create partition spec from hdfs://xxxx/ ; missing keys [part_a]
> Ignoring invalid DP directory <path to staging directory>
> {code}
> On the other hand, if I remove the static value of part_a and make the insert 
> fully dynamic, the following query succeeds.
> {code:sql}
> insert overwrite table t1 partition (part_a, part_b) select * from t2
> {code}
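> The same rewrite can serve as a workaround for the failing query above: move the 
> static value into the select list so that every partition column is dynamic. The 
> columns of t2 are not given here, so the projection below is only illustrative; 
> dynamic partition columns must come last in the select list, in partition order:
> {code:sql}
> -- hypothetical: assumes t2 has a data column a and a column part_b;
> -- 'a' is the value that was previously static in the partition spec
> insert overwrite table t1 partition (part_a, part_b) select a, 'a', part_b from t2
> {code}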



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
