sankarh commented on code in PR #3909:
URL: https://github.com/apache/hive/pull/3909#discussion_r1062211047
##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##########
@@ -5263,23 +5263,38 @@ public Void call() throws HiveException {
private static void moveAcidFilesForDelta(String deltaFileType, FileSystem
fs,
Path dst, Set<Path>
createdDeltaDirs,
- List<Path> newFiles, FileStatus
deltaStat) throws HiveException {
-
+ List<Path> newFiles, FileStatus
deltaStat, HiveConf conf) throws HiveException {
+ String configuredOwner = HiveConf.getVar(conf,
ConfVars.HIVE_LOAD_DATA_OWNER);
Path deltaPath = deltaStat.getPath();
// Create the delta directory. Don't worry if it already exists,
// as that likely means another task got to it first. Then move each of
the buckets.
// it would be more efficient to try to move the delta with it's buckets
but that is
// harder to make race condition proof.
Path deltaDest = new Path(dst, deltaPath.getName());
try {
+ FileSystem destFs = deltaDest.getFileSystem(conf);
if (!createdDeltaDirs.contains(deltaDest)) {
try {
- if(fs.mkdirs(deltaDest)) {
+ // Check if the src and dest filesystems are same or not
+ // if the src and dest filesystems are different, then we need to do
copy, instead of rename
+ if (needToCopy(conf, deltaStat.getPath(), deltaDest, fs, destFs,
configuredOwner, true)) {
Review Comment:
Can you add tests that cover these 2 newly added flows?
##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##########
@@ -5263,23 +5263,38 @@ public Void call() throws HiveException {
private static void moveAcidFilesForDelta(String deltaFileType, FileSystem
fs,
Path dst, Set<Path>
createdDeltaDirs,
- List<Path> newFiles, FileStatus
deltaStat) throws HiveException {
-
+ List<Path> newFiles, FileStatus
deltaStat, HiveConf conf) throws HiveException {
+ String configuredOwner = HiveConf.getVar(conf,
ConfVars.HIVE_LOAD_DATA_OWNER);
Path deltaPath = deltaStat.getPath();
// Create the delta directory. Don't worry if it already exists,
// as that likely means another task got to it first. Then move each of
the buckets.
// it would be more efficient to try to move the delta with it's buckets
but that is
// harder to make race condition proof.
Path deltaDest = new Path(dst, deltaPath.getName());
try {
+ FileSystem destFs = deltaDest.getFileSystem(conf);
if (!createdDeltaDirs.contains(deltaDest)) {
try {
- if(fs.mkdirs(deltaDest)) {
+ // Check if the src and dest filesystems are same or not
+ // if the src and dest filesystems are different, then we need to do
copy, instead of rename
+ if (needToCopy(conf, deltaStat.getPath(), deltaDest, fs, destFs,
configuredOwner, true)) {
+ //copy if across file system or encryption zones.
+ LOG.debug("Copying source " + deltaStat.getPath() + " to " +
deltaDest + " because HDFS encryption zones are different.");
Review Comment:
The log msg can also highlight, the copy is across different file systems or
encryption zones.
##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##########
@@ -5294,11 +5309,24 @@ private static void moveAcidFilesForDelta(String
deltaFileType, FileSystem fs,
for (FileStatus bucketStat : bucketStats) {
Path bucketSrc = bucketStat.getPath();
Path bucketDest = new Path(deltaDest, bucketSrc.getName());
+ FileSystem bucketDestFs = bucketDest.getFileSystem(conf);
LOG.info("Moving bucket " + bucketSrc.toUri().toString() + " to " +
bucketDest.toUri().toString());
try {
- fs.rename(bucketSrc, bucketDest);
- if (newFiles != null) {
+ // Check if the src and dest filesystems are same or not
+ // if the src and dest filesystems are different, then we need to
do copy, instead of rename
+ if (needToCopy(conf, bucketSrc, bucketDest, fs, bucketDestFs,
configuredOwner, true)) {
+ //copy if across file system or encryption zones.
+ LOG.debug("Copying source " + bucketSrc + " to " + bucketDest +
" because HDFS encryption zones are different.");
+ FileUtils.copy(fs, bucketSrc, bucketDestFs, bucketDest,
+ true, // delete source
+ true, // overwrite destination
+ conf);
+ } else {
+ // do the rename
+ bucketDestFs.rename(bucketSrc, bucketDest);
+ }
+ if (newFiles != null) {
newFiles.add(bucketDest);
Review Comment:
Need to add extra tab space for the body of "if" path.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]