Jared Jia created HIVE-29218:
--------------------------------

             Summary: LOAD OVERWRITE PARTITION on multi-level partitioned 
external Iceberg table may unexpectedly delete other partitions
                 Key: HIVE-29218
                 URL: https://issues.apache.org/jira/browse/HIVE-29218
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2, Iceberg integration
    Affects Versions: 4.0.0-beta-1
            Reporter: Jared Jia
            Assignee: Jared Jia
         Attachments: pcol1=x_pcol2=y.parquet, pcol1=x_pcol2=z.parquet

When using {{appendFile}} as the implementation of {{LOAD}}, there is an 
issue with handling {{LOAD OVERWRITE}} on multi-level partitioned Iceberg 
tables. The overwrite logic may incorrectly delete partitions that should not 
be affected.

*Steps to Reproduce*

1. Create an Iceberg table partitioned by two columns, {{pcol1}} and 
{{pcol2}}.
{code:java}
create external table ice_parquet_multi_partitioned (
    strcol string,
    intcol integer
) partitioned by (pcol1 string, pcol2 string)
stored by iceberg; {code}
2. Load data into partition {{pcol1=x/pcol2=y}}.
{code:java}
LOAD DATA LOCAL INPATH '/path/to/pcol1=x_pcol2=y.parquet' INTO TABLE 
ice_parquet_multi_partitioned PARTITION (pcol1='x', pcol2='y');{code}
3. Run a {{LOAD OVERWRITE}} into another partition, e.g., 
{{pcol1=x/pcol2=z}}.
{code:java}
LOAD DATA LOCAL INPATH '/path/to/pcol1=x_pcol2=z.parquet' OVERWRITE INTO TABLE
ice_parquet_multi_partitioned PARTITION (pcol1='x', pcol2='z');{code}
*Expected Behavior*
Only the target partition ({{pcol1=x/pcol2=z}}) should be overwritten. 
Existing partitions ({{pcol1=x/pcol2=y}}) should remain intact.

*Actual Behavior*
The existing partition {{pcol1=x/pcol2=y}} is unexpectedly deleted when 
overwriting another partition with the same {{pcol1}} value but different 
{{pcol2}}.
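
One way to check which partitions survive step 3 is to count rows per partition before and after the overwrite. This query is only a sketch and was not part of the original reproduction; it assumes the table name used in the steps above.
{code:sql}
-- Count rows per (pcol1, pcol2) partition.
-- If the bug is present, the pcol1=x/pcol2=y group disappears
-- after the LOAD OVERWRITE into pcol1=x/pcol2=z.
select pcol1, pcol2, count(*) as row_count
from ice_parquet_multi_partitioned
group by pcol1, pcol2;
{code}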



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
