[jira] [Updated] (HIVE-20533) Adding notification is taking time in S3 replication

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ https://issues.apache.org/jira/browse/HIVE-20533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis updated HIVE-20533:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and, if the fix has already been committed to a specific version, 
set the version accordingly and mark the ticket as RESOLVED.

According to the JIRA guidelines 
(https://cwiki.apache.org/confluence/display/Hive/HowToContribute), 
the fixVersion should be set only when the issue is resolved/closed.

> Adding notification is taking time in S3 replication
> 
>
> Key: HIVE-20533
> URL: https://issues.apache.org/jira/browse/HIVE-20533
> Project: Hive
> Issue Type: Sub-task
> Components: repl
> Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>
> For add-partition and insert operations, the notification event includes the 
> list of files added by the operation. During replication load, this file 
> listing is done on the target side, which takes 2-3 seconds because S3 is 
> slow. This can be improved by taking the file list from the event directory 
> and using the same list to populate the notification table.
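
The improvement described above can be illustrated with a minimal sketch. This is not Hive code: the manifest name `_files`, the function names, and the dictionary standing in for a notification-table row are all assumptions made for illustration. It only contrasts the two ways of obtaining the file list: scanning the target directory, versus reading a list already recorded in the event directory.

```python
import os

# Hypothetical manifest name, assumed here for illustration only.
MANIFEST_NAME = "_files"


def list_added_files_by_scan(target_dir):
    """Slow path: list the target directory to discover the added files.

    On an object store such as S3, this listing is the expensive step.
    """
    return sorted(
        os.path.join(target_dir, name)
        for name in os.listdir(target_dir)
        if os.path.isfile(os.path.join(target_dir, name))
    )


def list_added_files_from_event(event_dir):
    """Fast path: read the file list already recorded in the event directory."""
    manifest = os.path.join(event_dir, MANIFEST_NAME)
    with open(manifest, encoding="utf-8") as fh:
        # One file location per line; blank lines are ignored.
        return [line.strip() for line in fh if line.strip()]


def build_notification(operation, files):
    """Build a plain dict standing in for a notification-table row."""
    return {"operation": operation, "files": list(files)}
```

With the second function, the notification can be populated without any listing call against the target store, under the assumption that the event directory already carries the complete list of files for the operation.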



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-20533) Adding notification is taking time in S3 replication

2018-09-10 Thread mahesh kumar behera (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mahesh kumar behera updated HIVE-20533:
---
Description: For add-partition and insert operations, the notification event 
includes the list of files added by the operation. During replication load, 
this file listing is done on the target side, which takes 2-3 seconds because 
S3 is slow. This can be improved by taking the file list from the event 
directory and using the same list to populate the notification table.  (was: 
In replication load, both add partition and insert operations are handled 
through import. Import creates three major tasks: copy, add partition, and 
move. Copy copies the data from the source location to the staging directory. 
Add partition (which runs in parallel with copy) creates the partition in the 
metastore; it is a no-op for insert, because by the time this DDL task is 
executed for an insert the partition is already present. The third task, move, 
moves the files from the staging directory to the actual location and then, in 
the case of insert, adds the insert event to the notification table. It does 
this for the add partition operation as well, which is redundant because the 
event for add partition has already been written by the DDL task. With the 
optimization to copy directly to the actual table location in S3, the move 
task can be avoided for add partition replay, and replay of insert need not 
create the add partition (DDL) task.)

> Adding notification is taking time in S3 replication
> 
>
> Key: HIVE-20533
> URL: https://issues.apache.org/jira/browse/HIVE-20533
> Project: Hive
> Issue Type: Sub-task
> Components: repl
> Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> For add-partition and insert operations, the notification event includes the 
> list of files added by the operation. During replication load, this file 
> listing is done on the target side, which takes 2-3 seconds because S3 is 
> slow. This can be improved by taking the file list from the event directory 
> and using the same list to populate the notification table.


