Peter Varga created HIVE-23956:
----------------------------------
Summary: Delete delta directory file information should be pushed
to execution side
Key: HIVE-23956
URL: https://issues.apache.org/jira/browse/HIVE-23956
Project: Hive
Issue Type: Improvement
Reporter: Peter Varga
Assignee: Peter Varga
Since HIVE-23840, the LLAP cache is used to retrieve the tail of the ORC bucket
files in the delete deltas. To use the cache, however, the fileId must be
determined, so one more FileSystem call is issued for each bucket.
This fileId is already available during compilation, in the AcidState
calculation. We should serialize it into the OrcSplit and remove the
unnecessary FS calls.
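
A minimal sketch of the idea, assuming a hypothetical DeleteDeltaFileInfo holder (this is not Hive's actual OrcSplit code): the fileId resolved at compile time is written next to the bucket id and read back on the execution side, so no extra FileSystem lookup is needed to build the cache key.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the per-bucket delete delta information carried
// in a split. Class and field names are assumptions, not Hive's API.
final class DeleteDeltaFileInfo {
    final int bucketId;
    final long fileId; // resolved once, during compilation

    DeleteDeltaFileInfo(int bucketId, long fileId) {
        this.bucketId = bucketId;
        this.fileId = fileId;
    }

    // Compile side: serialize the already known fileId together with the bucket id.
    static byte[] toBytes(DeleteDeltaFileInfo info) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(info.bucketId);
            out.writeLong(info.fileId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Execution side: recover the fileId without touching the file system.
    static DeleteDeltaFileInfo fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int bucketId = in.readInt();
            long fileId = in.readLong();
            return new DeleteDeltaFileInfo(bucketId, fileId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The fields are read in the same order they are written, which is the only contract this sketch relies on.
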
Furthermore, instead of sending the SyntheticFileId directly, we should pass the
attemptId in place of the standard path hash. This way both the path and the
SyntheticFileId can be calculated, and it will keep working even if move-free
delete operations are introduced.
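
As a rough illustration of the second point, the sketch below rebuilds both the bucket file path and a synthetic id from the bucket id, attemptId, length and modification time. Everything here is an assumption made for the example: the DeleteDeltaBucketRef class is hypothetical, the bucket_NNNNN_attemptId file name pattern is assumed, and String.hashCode stands in for whatever path hash the real SyntheticFileId uses.

```java
import java.util.Locale;

// Hypothetical illustration only; names and the file name pattern are
// assumptions, not Hive's API.
final class DeleteDeltaBucketRef {
    final int bucketId;
    final int attemptId;
    final long length;
    final long modTime;

    DeleteDeltaBucketRef(int bucketId, int attemptId, long length, long modTime) {
        this.bucketId = bucketId;
        this.attemptId = attemptId;
        this.length = length;
        this.modTime = modTime;
    }

    // Rebuild the bucket file path from the delete delta directory and the attemptId.
    String path(String deleteDeltaDir) {
        return deleteDeltaDir + "/"
            + String.format(Locale.ROOT, "bucket_%05d_%d", bucketId, attemptId);
    }

    // Derive the (pathHash, modTime, length) triple from the rebuilt path.
    // String.hashCode is a placeholder for the real path hash.
    long[] syntheticId(String deleteDeltaDir) {
        return new long[] { path(deleteDeltaDir).hashCode(), modTime, length };
    }
}
```

Because the path hash is derived rather than transmitted, the receiver ends up with both the path and the id from the same small set of fields.
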
--
This message was sent by Atlassian Jira
(v8.3.4#803005)