[ https://issues.apache.org/jira/browse/HIVE-20517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
mahesh kumar behera updated HIVE-20517:
---------------------------------------
    Status: Open  (was: Patch Available)

> Creation of staging directory and Move operation is taking time in S3
> ---------------------------------------------------------------------
>
>                 Key: HIVE-20517
>                 URL: https://issues.apache.org/jira/browse/HIVE-20517
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20517.01.patch, HIVE-20517.02.patch, HIVE-20517.03.patch
>
>
> Operations like insert and add partition create a staging directory, generate
> the files there, and then move them to the actual location. In the
> replication flow, the files are first copied to the staging directory and
> then moved (renamed) to the actual table location. On S3, a move is not an
> atomic operation: internally it is a copy followed by a delete, so it cannot
> guarantee the required consistency. It is therefore better to copy the files
> directly to the actual location. This avoids both the staging directory
> creation (which takes 1-2 seconds in S3) and the move (which takes time
> proportional to the file size).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
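
The cost difference between the two flows described in the issue can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyObjectStore, stagedWrite, directWrite, and the ".hive-staging/" key layout are hypothetical names invented for illustration. They are not Hive, Hadoop FileSystem, or S3 APIs, and the patch itself may work differently. The model assumes only what the issue states, namely that a rename is a copy followed by a delete.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory object store (hypothetical; not the S3 or Hadoop FileSystem API). */
final class ToyObjectStore {
    private final Map<String, byte[]> objects = new HashMap<>();
    private long bytesWritten = 0; // total bytes the store has written so far

    /** Upload: stores a copy of the data under the given key. */
    void put(String key, byte[] data) {
        objects.put(key, data.clone());
        bytesWritten += data.length;
    }

    /** Rename modeled as copy + delete, as the issue describes for S3. */
    void rename(String src, String dst) {
        byte[] data = objects.get(src);
        if (data == null) {
            throw new IllegalArgumentException("no such key: " + src);
        }
        put(dst, data);      // copy: cost is proportional to the object size
        objects.remove(src); // delete: a separate step, so the pair is not atomic
    }

    boolean exists(String key) {
        return objects.containsKey(key);
    }

    long bytesWritten() {
        return bytesWritten;
    }

    /** Existing flow: write to a staging key, then rename to the final key. */
    static long stagedWrite(ToyObjectStore store, String finalKey, byte[] data) {
        String stagingKey = ".hive-staging/" + finalKey; // assumed layout, for illustration
        store.put(stagingKey, data);
        store.rename(stagingKey, finalKey);
        return store.bytesWritten();
    }

    /** Proposed flow: write straight to the final key. */
    static long directWrite(ToyObjectStore store, String finalKey, byte[] data) {
        store.put(finalKey, data);
        return store.bytesWritten();
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024];
        System.out.println("staged: " + stagedWrite(new ToyObjectStore(), "t/part-0", data));
        System.out.println("direct: " + directWrite(new ToyObjectStore(), "t/part-0", data));
    }
}
```

In this model the staged flow writes every byte twice, while the direct flow writes it once. The model does not capture the fixed 1-2 second cost of creating the staging directory, nor the window between the copy and the delete during which both keys exist.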