Sergio Peña created HIVE-11940:
----------------------------------
Summary: "INSERT OVERWRITE" query is very slow because it creates
one "distcp" per file to copy data from staging directory to target directory
Key: HIVE-11940
URL: https://issues.apache.org/jira/browse/HIVE-11940
Project: Hive
Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Sergio Peña
Assignee: Sergio Peña
When hive.exec.stagingdir is set to ".hive-staging", the staging directory is
created under the target directory. When an "INSERT OVERWRITE" query runs, Hive
then takes all files under the staging directory and copies them ONE BY ONE to
the target directory.
When hive.exec.stagingdir is set to "/tmp/hive", Hive simply does a RENAME
operation, which is instant.
This happens with files that are not encrypted.
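
To illustrate why the two paths differ so much in cost, here is a minimal
sketch on a local filesystem. It is not Hive's code and does not use HDFS or
distcp; the helper names (move_by_copy, move_by_rename, make_staging) and the
"part-NNNNN" file names are hypothetical, chosen only for this example. It
counts how many filesystem operations each approach needs for the same staged
output.

```python
import os
import shutil
import tempfile


def move_by_copy(staging_dir, target_dir):
    """Copy each staged file individually; return the number of copy operations."""
    os.makedirs(target_dir, exist_ok=True)
    operations = 0
    for name in sorted(os.listdir(staging_dir)):
        shutil.copy2(os.path.join(staging_dir, name),
                     os.path.join(target_dir, name))
        operations += 1
    shutil.rmtree(staging_dir)  # clean up the staging directory afterwards
    return operations


def move_by_rename(staging_dir, target_dir):
    """Rename the whole staging directory to the target path in one operation."""
    os.rename(staging_dir, target_dir)
    return 1


def make_staging(root, name, file_count):
    """Create a staging directory holding `file_count` small files (test data)."""
    staging = os.path.join(root, name)
    os.makedirs(staging)
    for i in range(file_count):
        with open(os.path.join(staging, f"part-{i:05d}"), "w") as handle:
            handle.write("row\n")
    return staging


root = tempfile.mkdtemp()
target_a = os.path.join(root, "target_a")
target_b = os.path.join(root, "target_b")

copy_ops = move_by_copy(make_staging(root, "staging_a", 50), target_a)
rename_ops = move_by_rename(make_staging(root, "staging_b", 50), target_b)

print(copy_ops, rename_ops)  # prints "50 1"
```

The copy path grows with the number of files, while the rename path stays at a
single operation. In the case reported here each per-file copy is a separate
"distcp", so the per-file overhead is far larger than in this local sketch.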