[
https://issues.apache.org/jira/browse/HIVE-20533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mahesh kumar behera updated HIVE-20533:
---
Description: Adding a notification for an add partition or insert operation
also adds the list of files added by the operation. For replication load, this
file listing is done on the target side, which takes 2-3 seconds because S3 is
slow. This can be improved by using the file list from the event directory; the
same list can be used to populate the notification table. (was: In replication load, both add
partition and insert operations are handled through import. Import creates
three major tasks: copy, add partition and move. Copy copies the data from the
source location to the staging directory. Add partition (which runs in
parallel with copy) then creates the partition in the metastore. It is a no-op
for insert, because by the time this DDL task is executed for an insert the
partition is already present. The third task, move, moves the files from the
staging directory to the actual location and then, in the case of insert, adds
the insert event to the notification table. It also does this for the add
partition operation, which is redundant because the event for add partition
has already been written by the DDL task. With the optimization to copy
directly to the actual table location in S3, the move task can be avoided for
add partition replay, and replay of insert need not create the add
partition (DDL) task.)
> Adding notification is taking time in S3 replication
>
>
> Key: HIVE-20533
> URL: https://issues.apache.org/jira/browse/HIVE-20533
> Project: Hive
> Issue Type: Sub-task
> Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Adding a notification for an add partition or insert operation also adds
> the list of files added by the operation. For replication load, this file
> listing is done on the target side, which takes 2-3 seconds because S3 is
> slow. This can be improved by using the file list from the event directory;
> the same list can be used to populate the notification table.
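
The idea in the description can be illustrated with a minimal sketch: instead
of listing the target directory on S3, read the file list that was already
written into the event directory. This is not the actual Hive patch. The class
name EventFileList, the method readFileList, the file name "_files" and the
one-path-per-line layout are all assumptions made for illustration, and a local
java.nio path stands in for the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class EventFileList {
    // Assumed name of the per-event file list (hypothetical for this sketch).
    static final String FILE_LIST_NAME = "_files";

    /**
     * Reads the list of data files recorded in the event directory.
     * One small file is read, so no directory listing is needed on the
     * (slow) target store. Blank lines are skipped and each entry is trimmed.
     */
    static List<String> readFileList(Path eventDir) throws IOException {
        Path listFile = eventDir.resolve(FILE_LIST_NAME);
        List<String> files = new ArrayList<>();
        for (String line : Files.readAllLines(listFile, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                files.add(trimmed);
            }
        }
        return files;
    }
}
```

The returned list could then be handed to whatever code populates the
notification table, so the same data serves both the load and the event.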