[GitHub] [hadoop] ayushtkn commented on a change in pull request #2088: HDFS-15427. Merged ListStatus with Fallback target filesystem and InternalDirViewFS.

2020-06-23 Thread GitBox


ayushtkn commented on a change in pull request #2088:
URL: https://github.com/apache/hadoop/pull/2088#discussion_r443999571



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java
##
@@ -1258,63 +1261,72 @@ public FileStatus getFileStatus(Path f) throws 
IOException {
 FileStatus status =
 ((ChRootedFileSystem)link.getTargetFileSystem())
 .getMyFs().getFileStatus(new Path(linkedPath));
-result[i++] = new FileStatus(status.getLen(), status.isDirectory(),
-status.getReplication(), status.getBlockSize(),
-status.getModificationTime(), status.getAccessTime(),
-status.getPermission(), status.getOwner(), status.getGroup(),
-null, path);
+linkStatuses.add(
+new FileStatus(status.getLen(), status.isDirectory(),
+status.getReplication(), status.getBlockSize(),
+status.getModificationTime(), status.getAccessTime(),
+status.getPermission(), status.getOwner(),
+status.getGroup(), null, path));
   } catch (FileNotFoundException ex) {
 LOG.warn("Cannot get one of the children's(" + path
 + ")  target path(" + link.getTargetFileSystem().getUri()
 + ") file status.", ex);
 throw ex;
   }
 } else {
-  result[i++] =
+  internalDirStatuses.add(
   new FileStatus(0, true, 0, 0, creationTime, creationTime,
   PERMISSION_555, ugi.getShortUserName(),
-  ugi.getPrimaryGroupName(), path);
+  ugi.getPrimaryGroupName(), path));
 }
   }
+  FileStatus[] internalDirStatusesMergedWithFallBack = internalDirStatuses
+  .toArray(new FileStatus[internalDirStatuses.size()]);
   if (fallbackStatuses.length > 0) {
-return consolidateFileStatuses(fallbackStatuses, result);
-  } else {
-return result;
+internalDirStatusesMergedWithFallBack =
+merge(fallbackStatuses, internalDirStatusesMergedWithFallBack);
   }
+  // Links will always have precedence than internalDir or fallback paths.
+  return merge(linkStatuses.toArray(new FileStatus[linkStatuses.size()]),
+  internalDirStatusesMergedWithFallBack);
 }
 
-private FileStatus[] consolidateFileStatuses(FileStatus[] fallbackStatuses,
-FileStatus[] mountPointStatuses) {
+private FileStatus[] merge(FileStatus[] toStatuses,
+FileStatus[] fromStatuses) {
   ArrayList result = new ArrayList<>();
   Set pathSet = new HashSet<>();
-  for (FileStatus status : mountPointStatuses) {
+  for (FileStatus status : toStatuses) {
 result.add(status);
 pathSet.add(status.getPath().getName());
   }
-  for (FileStatus status : fallbackStatuses) {
+  for (FileStatus status : fromStatuses) {
 if (!pathSet.contains(status.getPath().getName())) {
   result.add(status);
 }
   }
-  return result.toArray(new FileStatus[0]);
+  return result.toArray(new FileStatus[result.size()]);
 }
 
 private FileStatus[] listStatusForFallbackLink() throws IOException {
-  if (theInternalDir.isRoot() &&
-  theInternalDir.getFallbackLink() != null) {
-FileSystem linkedFs =
-theInternalDir.getFallbackLink().getTargetFileSystem();
-// Fallback link is only applicable for root
-FileStatus[] statuses = linkedFs.listStatus(new Path("/"));
-for (FileStatus status : statuses) {
-  // Fix the path back to viewfs scheme
-  status.setPath(
-  new Path(myUri.toString(), status.getPath().getName()));
+  if (this.fsState.getRootFallbackLink() != null) {
+FileSystem linkedFallbackFs =
+this.fsState.getRootFallbackLink().getTargetFileSystem();
+Path p = Path.getPathWithoutSchemeAndAuthority(
+new Path(theInternalDir.fullPath));
+if (theInternalDir.isRoot() || linkedFallbackFs.exists(p)) {
+  // Fallback link is only applicable for root

Review comment:
   This comment line can be removed now? Now it isn't just root?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] ayushtkn commented on a change in pull request #2088: HDFS-15427. Merged ListStatus with Fallback target filesystem and InternalDirViewFS.

2020-06-23 Thread GitBox


ayushtkn commented on a change in pull request #2088:
URL: https://github.com/apache/hadoop/pull/2088#discussion_r443999571



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java
##
@@ -1258,63 +1261,72 @@ public FileStatus getFileStatus(Path f) throws 
IOException {
 FileStatus status =
 ((ChRootedFileSystem)link.getTargetFileSystem())
 .getMyFs().getFileStatus(new Path(linkedPath));
-result[i++] = new FileStatus(status.getLen(), status.isDirectory(),
-status.getReplication(), status.getBlockSize(),
-status.getModificationTime(), status.getAccessTime(),
-status.getPermission(), status.getOwner(), status.getGroup(),
-null, path);
+linkStatuses.add(
+new FileStatus(status.getLen(), status.isDirectory(),
+status.getReplication(), status.getBlockSize(),
+status.getModificationTime(), status.getAccessTime(),
+status.getPermission(), status.getOwner(),
+status.getGroup(), null, path));
   } catch (FileNotFoundException ex) {
 LOG.warn("Cannot get one of the children's(" + path
 + ")  target path(" + link.getTargetFileSystem().getUri()
 + ") file status.", ex);
 throw ex;
   }
 } else {
-  result[i++] =
+  internalDirStatuses.add(
   new FileStatus(0, true, 0, 0, creationTime, creationTime,
   PERMISSION_555, ugi.getShortUserName(),
-  ugi.getPrimaryGroupName(), path);
+  ugi.getPrimaryGroupName(), path));
 }
   }
+  FileStatus[] internalDirStatusesMergedWithFallBack = internalDirStatuses
+  .toArray(new FileStatus[internalDirStatuses.size()]);
   if (fallbackStatuses.length > 0) {
-return consolidateFileStatuses(fallbackStatuses, result);
-  } else {
-return result;
+internalDirStatusesMergedWithFallBack =
+merge(fallbackStatuses, internalDirStatusesMergedWithFallBack);
   }
+  // Links will always have precedence than internalDir or fallback paths.
+  return merge(linkStatuses.toArray(new FileStatus[linkStatuses.size()]),
+  internalDirStatusesMergedWithFallBack);
 }
 
-private FileStatus[] consolidateFileStatuses(FileStatus[] fallbackStatuses,
-FileStatus[] mountPointStatuses) {
+private FileStatus[] merge(FileStatus[] toStatuses,
+FileStatus[] fromStatuses) {
   ArrayList result = new ArrayList<>();
   Set pathSet = new HashSet<>();
-  for (FileStatus status : mountPointStatuses) {
+  for (FileStatus status : toStatuses) {
 result.add(status);
 pathSet.add(status.getPath().getName());
   }
-  for (FileStatus status : fallbackStatuses) {
+  for (FileStatus status : fromStatuses) {
 if (!pathSet.contains(status.getPath().getName())) {
   result.add(status);
 }
   }
-  return result.toArray(new FileStatus[0]);
+  return result.toArray(new FileStatus[result.size()]);
 }
 
 private FileStatus[] listStatusForFallbackLink() throws IOException {
-  if (theInternalDir.isRoot() &&
-  theInternalDir.getFallbackLink() != null) {
-FileSystem linkedFs =
-theInternalDir.getFallbackLink().getTargetFileSystem();
-// Fallback link is only applicable for root
-FileStatus[] statuses = linkedFs.listStatus(new Path("/"));
-for (FileStatus status : statuses) {
-  // Fix the path back to viewfs scheme
-  status.setPath(
-  new Path(myUri.toString(), status.getPath().getName()));
+  if (this.fsState.getRootFallbackLink() != null) {
+FileSystem linkedFallbackFs =
+this.fsState.getRootFallbackLink().getTargetFileSystem();
+Path p = Path.getPathWithoutSchemeAndAuthority(
+new Path(theInternalDir.fullPath));
+if (theInternalDir.isRoot() || linkedFallbackFs.exists(p)) {
+  // Fallback link is only applicable for root

Review comment:
   This comment line can be removed now? No it isn't just root

##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemLinkFallback.java
##
@@ -359,4 +361,249 @@ public void 
testListingWithFallbackLinkWithSameMountDirectories()
   assertTrue(vfs.getFileStatus(childDir).isDirectory());
 }
   }
+
+  /**
+   * Tests ListStatus on non-link parent with fallback configured.
+   * 
=Example.==
+   * = Fallback path tree === Mount Path Tree 
==
+   * 
===
+   * * 

[GitHub] [hadoop] ayushtkn commented on a change in pull request #2088: HDFS-15427. Merged ListStatus with Fallback target filesystem and InternalDirViewFS.

2020-06-22 Thread GitBox


ayushtkn commented on a change in pull request #2088:
URL: https://github.com/apache/hadoop/pull/2088#discussion_r443743633



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java
##
@@ -1258,40 +1260,52 @@ public FileStatus getFileStatus(Path f) throws 
IOException {
 FileStatus status =
 ((ChRootedFileSystem)link.getTargetFileSystem())
 .getMyFs().getFileStatus(new Path(linkedPath));
-result[i++] = new FileStatus(status.getLen(), status.isDirectory(),
-status.getReplication(), status.getBlockSize(),
-status.getModificationTime(), status.getAccessTime(),
-status.getPermission(), status.getOwner(), status.getGroup(),
-null, path);
+linkStatuses.add(
+new FileStatus(status.getLen(), status.isDirectory(),
+status.getReplication(), status.getBlockSize(),
+status.getModificationTime(), status.getAccessTime(),
+status.getPermission(), status.getOwner(),
+status.getGroup(), null, path));
   } catch (FileNotFoundException ex) {
 LOG.warn("Cannot get one of the children's(" + path
 + ")  target path(" + link.getTargetFileSystem().getUri()
 + ") file status.", ex);
 throw ex;
   }
 } else {
-  result[i++] =
+  internalDirStatuses.add(
   new FileStatus(0, true, 0, 0, creationTime, creationTime,
   PERMISSION_555, ugi.getShortUserName(),
-  ugi.getPrimaryGroupName(), path);
+  ugi.getPrimaryGroupName(), path));
 }
   }
+  FileStatus[] internalDirStatusesMergedWithFallBack = internalDirStatuses
+  .toArray(new FileStatus[internalDirStatuses.size()]);
   if (fallbackStatuses.length > 0) {
-return consolidateFileStatuses(fallbackStatuses, result);
+internalDirStatusesMergedWithFallBack =
+merge(fallbackStatuses, internalDirStatusesMergedWithFallBack);
+  }
+
+  // we don't use target file status as we show the mount link as symlink.
+  if (showMountLinksAsSymlinks) {
+return merge(internalDirStatusesMergedWithFallBack,
+linkStatuses.toArray(new FileStatus[linkStatuses.size()]));

Review comment:
   >Currently it will not work because we will say /user/hive is readOnly. 
But the plan is to support and create /user/hive/warehouse2 in fallback fs. So, 
it make sense to get the permission/others from fallback dir and represent. 
Does this make sense? 
   
   In most cases it makes sense, For one case I still have a doubt.
   But say taking the same tree,  /user1/hive/warehouse/partition-0 is mounted. 
and the same partion-0 is available through fallback as well.
   if I perform any operation say mkdir(/user1/hive/warehouse/partition-0/file) 
this will go to mount point resolved path not to partion-0 of the fallback 
path. The above line while listing will show permissions of partion-0 from fall 
back link, when listing /user1/hive/warehouse, Correct? But actually the mount 
entry will always be used. The question is which case partion-0 of the fallback 
will be used above the partion-0 from the mount entry. Even when partion-0 has 
target path available.
   
   if say in the fallback partion-0 isn't a directory, it is a file, we will 
show it as a file on listing /user1/hive/warehouse/, but mkdir on 
(/user1/hive/warehouse/partition-0/file) will be success, Wouldn't it be a 
inconsistent behavior ? 
   
   I modified your test for my case : 
   ```
@Test
 public void testLSOnLinkParentWithFallbackLinkWithSameMountDirectoryTree()
 throws Exception {
   Configuration conf = new Configuration();
   conf.setBoolean(Constants.CONFIG_VIEWFS_MOUNT_LINKS_AS_SYMLINKS, true);
   ConfigUtil.addLink(conf, "/user1/hive/warehouse/partition-0",
   new Path(targetTestRoot.toString()).toUri());
   // Creating multiple directories path under the fallback directory.
   // "/user1/hive/warehouse/partition-0" directory already exists as
   // configured mount point.
   Path file1 = new Path(targetTestRoot,
   "fallbackDir/user1/hive/warehouse/partition-0"); // partition-0 in
   // fallback is a file  ---> Changed this...
   Path dir2 = new Path(targetTestRoot, 
"fallbackDir/user1/hive/warehouse1");
   fsTarget.create(file1).close();
   fsTarget.mkdirs(dir2);
   URI viewFsUri = new URI(FsConstants.VIEWFS_SCHEME,
   Constants.CONFIG_VIEWFS_DEFAULT_MOUNT_TABLE, "/", null, null);
   
   ConfigUtil
   .addLinkFallback(conf, new Path(targetTestRoot, 
"fallbackDir").toUri());
   
   try (FileSystem vfs = FileSystem.get(viewFsUri, conf)) {
 for (FileStatus stat : vfs.listStatus(
 new 

[GitHub] [hadoop] ayushtkn commented on a change in pull request #2088: HDFS-15427. Merged ListStatus with Fallback target filesystem and InternalDirViewFS.

2020-06-22 Thread GitBox


ayushtkn commented on a change in pull request #2088:
URL: https://github.com/apache/hadoop/pull/2088#discussion_r443743633



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java
##
@@ -1258,40 +1260,52 @@ public FileStatus getFileStatus(Path f) throws 
IOException {
 FileStatus status =
 ((ChRootedFileSystem)link.getTargetFileSystem())
 .getMyFs().getFileStatus(new Path(linkedPath));
-result[i++] = new FileStatus(status.getLen(), status.isDirectory(),
-status.getReplication(), status.getBlockSize(),
-status.getModificationTime(), status.getAccessTime(),
-status.getPermission(), status.getOwner(), status.getGroup(),
-null, path);
+linkStatuses.add(
+new FileStatus(status.getLen(), status.isDirectory(),
+status.getReplication(), status.getBlockSize(),
+status.getModificationTime(), status.getAccessTime(),
+status.getPermission(), status.getOwner(),
+status.getGroup(), null, path));
   } catch (FileNotFoundException ex) {
 LOG.warn("Cannot get one of the children's(" + path
 + ")  target path(" + link.getTargetFileSystem().getUri()
 + ") file status.", ex);
 throw ex;
   }
 } else {
-  result[i++] =
+  internalDirStatuses.add(
   new FileStatus(0, true, 0, 0, creationTime, creationTime,
   PERMISSION_555, ugi.getShortUserName(),
-  ugi.getPrimaryGroupName(), path);
+  ugi.getPrimaryGroupName(), path));
 }
   }
+  FileStatus[] internalDirStatusesMergedWithFallBack = internalDirStatuses
+  .toArray(new FileStatus[internalDirStatuses.size()]);
   if (fallbackStatuses.length > 0) {
-return consolidateFileStatuses(fallbackStatuses, result);
+internalDirStatusesMergedWithFallBack =
+merge(fallbackStatuses, internalDirStatusesMergedWithFallBack);
+  }
+
+  // we don't use target file status as we show the mount link as symlink.
+  if (showMountLinksAsSymlinks) {
+return merge(internalDirStatusesMergedWithFallBack,
+linkStatuses.toArray(new FileStatus[linkStatuses.size()]));

Review comment:
   >Currently it will not work because we will say /user/hive is readOnly. 
But the plan is to support and create /user/hive/warehouse2 in fallback fs. So, 
it make sense to get the permission/others from fallback dir and represent. 
Does this make sense? 
   
   In most cases it work, For one case I still have a doubt.
   But say taking the same tree,  /user1/hive/warehouse/partition-0 is mounted. 
and the same partion-0 is available through fallback as well.
   if I perform any operation say mkdir(/user1/hive/warehouse/partition-0/file) 
this will go to mount point resolved path not to partion-0 of the fallback 
path. The above line while listing will show permissions of partion-0 from fall 
back link, when listing /user1/hive/warehouse, Correct? But actually the mount 
entry will always be used. The question is which case partion-0 of the fallback 
will be used above the partion-0 from the mount entry. Even when partion-0 has 
target path available.
   
   if say in the fallback partion-0 isn't a directory, it is a file, we will 
show it as a file on listing /user1/hive/warehouse/, but mkdir on 
(/user1/hive/warehouse/partition-0/file) will be success, Wouldn't it be a 
inconsistent behavior ? 
   
   I modified your test for my case : 
   ```
@Test
 public void testLSOnLinkParentWithFallbackLinkWithSameMountDirectoryTree()
 throws Exception {
   Configuration conf = new Configuration();
   conf.setBoolean(Constants.CONFIG_VIEWFS_MOUNT_LINKS_AS_SYMLINKS, true);
   ConfigUtil.addLink(conf, "/user1/hive/warehouse/partition-0",
   new Path(targetTestRoot.toString()).toUri());
   // Creating multiple directories path under the fallback directory.
   // "/user1/hive/warehouse/partition-0" directory already exists as
   // configured mount point.
   Path file1 = new Path(targetTestRoot,
   "fallbackDir/user1/hive/warehouse/partition-0"); // partition-0 in
   // fallback is a file  ---> Changed this...
   Path dir2 = new Path(targetTestRoot, 
"fallbackDir/user1/hive/warehouse1");
   fsTarget.create(file1).close();
   fsTarget.mkdirs(dir2);
   URI viewFsUri = new URI(FsConstants.VIEWFS_SCHEME,
   Constants.CONFIG_VIEWFS_DEFAULT_MOUNT_TABLE, "/", null, null);
   
   ConfigUtil
   .addLinkFallback(conf, new Path(targetTestRoot, 
"fallbackDir").toUri());
   
   try (FileSystem vfs = FileSystem.get(viewFsUri, conf)) {
 for (FileStatus stat : vfs.listStatus(
 new 

[GitHub] [hadoop] ayushtkn commented on a change in pull request #2088: HDFS-15427. Merged ListStatus with Fallback target filesystem and InternalDirViewFS.

2020-06-22 Thread GitBox


ayushtkn commented on a change in pull request #2088:
URL: https://github.com/apache/hadoop/pull/2088#discussion_r443489757



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java
##
@@ -1258,40 +1260,52 @@ public FileStatus getFileStatus(Path f) throws 
IOException {
 FileStatus status =
 ((ChRootedFileSystem)link.getTargetFileSystem())
 .getMyFs().getFileStatus(new Path(linkedPath));
-result[i++] = new FileStatus(status.getLen(), status.isDirectory(),
-status.getReplication(), status.getBlockSize(),
-status.getModificationTime(), status.getAccessTime(),
-status.getPermission(), status.getOwner(), status.getGroup(),
-null, path);
+linkStatuses.add(
+new FileStatus(status.getLen(), status.isDirectory(),
+status.getReplication(), status.getBlockSize(),
+status.getModificationTime(), status.getAccessTime(),
+status.getPermission(), status.getOwner(),
+status.getGroup(), null, path));
   } catch (FileNotFoundException ex) {
 LOG.warn("Cannot get one of the children's(" + path
 + ")  target path(" + link.getTargetFileSystem().getUri()
 + ") file status.", ex);
 throw ex;
   }
 } else {
-  result[i++] =
+  internalDirStatuses.add(
   new FileStatus(0, true, 0, 0, creationTime, creationTime,
   PERMISSION_555, ugi.getShortUserName(),
-  ugi.getPrimaryGroupName(), path);
+  ugi.getPrimaryGroupName(), path));
 }
   }
+  FileStatus[] internalDirStatusesMergedWithFallBack = internalDirStatuses
+  .toArray(new FileStatus[internalDirStatuses.size()]);
   if (fallbackStatuses.length > 0) {
-return consolidateFileStatuses(fallbackStatuses, result);
+internalDirStatusesMergedWithFallBack =
+merge(fallbackStatuses, internalDirStatusesMergedWithFallBack);
+  }
+
+  // we don't use target file status as we show the mount link as symlink.
+  if (showMountLinksAsSymlinks) {
+return merge(internalDirStatusesMergedWithFallBack,
+linkStatuses.toArray(new FileStatus[linkStatuses.size()]));

Review comment:
   Thanx @umamaheswararao for initiating this, The idea seems good. This 
fallback mechanism is there in RBF too, through default namespace 
implementation, shall be great to bring it to ViewFs as well. 
   I got the overall idea, Just a doubt regarding the above line.
   Why in case `showMountLinkAsSymlinks` is true, we need to give preference to 
the fallback paths?
   Taking example from the structure in 
`testLSOnLinkParentWithFallbackLinkWithSameMountDirectoryTree`
   if we do ls on `/user1/hive/warehouse` the `linkStatuses` will have one 
partition0 which is mounted and the `internalDirStatusesMergedWithFallBack` 
will have another partition0 from the fallback. Shouldn't we give preference to 
the one from the mount entry. Since if I perform an operation on 
`user1/hive/warehouse/partiton0 the operation will be performed on the mount 
link resolved path rather than the Fallback partion0?
   
   Here collision is possible only between `linkStatus` and `fallbackStatus`, 
`linkStatus` and `internalDirStatuses` can not collide since ViewFs isn't 
allowing to mount /user1/hive -> (some path) as well as /user1/hive/warehouse 
-> (some path) which is allowed in RBF. So, In case if the exact mount entry is 
present any call will go to mount entry resolved path and the fallback dir will 
never be used, so in listing also it could have been overwritten?
   Can you help with logic here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org