[ 
https://issues.apache.org/jira/browse/HIVE-26319?focusedWorklogId=780788&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780788
 ]

ASF GitHub Bot logged work on HIVE-26319:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Jun/22 11:46
            Start Date: 13/Jun/22 11:46
    Worklog Time Spent: 10m 
      Work Description: kasakrisz opened a new pull request, #3362:
URL: https://github.com/apache/hive/pull/3362

   ### What changes were proposed in this pull request?
   Rewrite update statements on Iceberg tables into a multi-insert statement, 
as is already done for native ACID tables.
   
   When generating the rewritten statement:
   * Get the virtual columns from the table's storage handler for non-native 
ACID tables.
   * Include the old values in the select clause of the delete branch of the 
multi-insert statement.
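
To illustrate the shape of such a rewrite, here is a minimal Python sketch. It is not the Hive implementation (which is Java) and the exact text Hive generates is not shown in this description. The function name `rewrite_update`, its parameters, and the virtual column names in the usage note are assumptions made for illustration only.

```python
def rewrite_update(table, columns, assignments, virtual_columns, where=None):
    """Build a schematic multi-insert statement for an UPDATE.

    table           -- table name (hypothetical input)
    columns         -- all table columns, in order
    assignments     -- dict of column -> new value expression (the SET clause)
    virtual_columns -- columns supplied by the storage handler (assumed names)
    where           -- optional filter shared by both branches
    """
    where_clause = f" WHERE {where}" if where else ""
    # Insert branch: updated columns take the new expression, others are copied.
    new_values = ", ".join(assignments.get(col, col) for col in columns)
    # Delete branch: virtual columns identify the row, followed by the old values.
    old_values = ", ".join(list(virtual_columns) + list(columns))
    return "\n".join([
        f"FROM {table}",
        f"INSERT INTO {table} SELECT {new_values}{where_clause}",
        f"INSERT INTO {table} SELECT {old_values}{where_clause}",
    ])
```

For example, `rewrite_update("tbl", ["a", "b"], {"b": "b + 1"}, ["FILE__PATH", "ROW__POSITION"], "a = 1")` yields one `FROM tbl` clause followed by two insert branches, where the second branch selects the virtual columns plus the old values of `a` and `b`.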
   
   When executing the multi insert:
   * Two Iceberg writers are used, which produce a data delta file and a delete 
delta file. The results of these writers should be merged into one 
`FilesForCommit` if both writers run in the same task.
   * For more complex statements (e.g. partitioned and/or bucketed), more 
than one Tez task produces commit info, so this patch enables storing all of 
them.
   * Every `FileSinkOperator` creates its own jobConf instance because the 
Iceberg write operation is stored in it and differs between the two instances.
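
The merging and per-task bookkeeping described above can be sketched as follows. This is a simplified Python model, not the actual Java `FilesForCommit` class: the fields `data_files` and `delete_files`, the `merge` method, and the `collect_commit_info` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class FilesForCommit:
    """Simplified stand-in for the commit info produced by a writer."""
    data_files: List[str] = field(default_factory=list)
    delete_files: List[str] = field(default_factory=list)

    def merge(self, other: "FilesForCommit") -> "FilesForCommit":
        # Combine the output of two writers without mutating either input.
        return FilesForCommit(
            self.data_files + other.data_files,
            self.delete_files + other.delete_files,
        )


def collect_commit_info(
    results: Iterable[Tuple[str, FilesForCommit]]
) -> Dict[str, FilesForCommit]:
    """Group writer results by task id, merging writers that share a task."""
    per_task: Dict[str, FilesForCommit] = {}
    for task_id, files in results:
        if task_id in per_task:
            # Data writer and delete writer ran in the same task.
            per_task[task_id] = per_task[task_id].merge(files)
        else:
            # A different task: keep its commit info as a separate entry.
            per_task[task_id] = files
    return per_task
```

Under these assumptions, a data writer and a delete writer reporting under the same task id end up as a single merged entry, while results from other tasks are stored alongside it rather than overwriting it.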
   
   
   ### Why are the changes needed?
   See #2855
   This is also preparation for the Iceberg MERGE implementation.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   ```
   mvn test -Dtest.output.overwrite -DskipSparkTests 
-Dtest=TestIcebergLlapLocalCliDriver -Dqfile=update_iceberg_partitioned_orc2.q 
-pl itests/qtest-iceberg -Piceberg -Pitests
   ```




Issue Time Tracking
-------------------

            Worklog Id:     (was: 780788)
    Remaining Estimate: 0h
            Time Spent: 10m

> Iceberg integration: Perform update split early
> -----------------------------------------------
>
>                 Key: HIVE-26319
>                 URL: https://issues.apache.org/jira/browse/HIVE-26319
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Extend update split early to Iceberg tables, as was done in HIVE-21160 for 
> native ACID tables.



