JohnZZGithub commented on pull request #2185: URL: https://github.com/apache/hadoop/pull/2185#issuecomment-687687926
Thanks for the comments!

> > I guess most callers of getMountPoints want to traverse all the file systems to do some operation, e.g. setVerifyChecksum(). We didn't see issues on our internal YARN + HDFS and YARN + GCS clusters. The usage patterns include, but are not limited to, MR, Spark, Presto, and Vertica loading. But it's possible that some users might rely on these APIs.
>
> YarnClient seems to be collecting tokens from all DelegationTokenIssuers, in DelegationTokenIssuer#collectDelegationTokens:
>
> ```
> // Now collect the tokens from the children.
> final DelegationTokenIssuer[] ancillary =
>     issuer.getAdditionalTokenIssuers();
> if (ancillary != null) {
>   for (DelegationTokenIssuer subIssuer : ancillary) {
>     collectDelegationTokens(subIssuer, renewer, credentials, tokens);
>   }
> }
> ```
>
> If you look here, issuer is the current fs, and it's trying to get the additional token issuers. The default implementation of getAdditionalTokenIssuers in FileSystem.java simply gets all the child file systems:
>
> ```
> @InterfaceAudience.Private
> @Override
> public DelegationTokenIssuer[] getAdditionalTokenIssuers()
>     throws IOException {
>   return getChildFileSystems();
> }
> ```
>
> This will get all the child file systems available. Currently the implementation of getChildFileSystems in ViewFileSystem is like below:
>
> ```
> @Override
> public FileSystem[] getChildFileSystems() {
>   List<InodeTree.MountPoint<FileSystem>> mountPoints =
>       fsState.getMountPoints();
>   Set<FileSystem> children = new HashSet<FileSystem>();
>   for (InodeTree.MountPoint<FileSystem> mountPoint : mountPoints) {
>     FileSystem targetFs = mountPoint.target.targetFileSystem;
>     children.addAll(Arrays.asList(targetFs.getChildFileSystems()));
>   }
>
>   if (fsState.isRootInternalDir() && fsState.getRootFallbackLink() != null) {
>     children.addAll(Arrays.asList(
>         fsState.getRootFallbackLink().targetFileSystem
>             .getChildFileSystems()));
>   }
>   return children.toArray(new FileSystem[]{});
> }
> ```
>
> It iterates over the available mount points, getting all the targetFileSystems. In the case of regex-based mount points, we will not have any child file systems available via the getChildFileSystems call. We also implemented ViewDistributedFileSystem to provide HDFS-specific API compatibility; there we also used getChildFileSystems for some APIs.
>
> > Returning a MountPoint with a special FileSystem for regex mount points: we could cache the initialized fileSystem under the regex mount point and perform the operation. For filesystems that might appear in the future, we could cache the past calls from callers and try to apply them, or just not support it.
>
> I am thinking: how about adding the resolved mount points from the regex-based links to the MountPoints list, so that when a user calls getMountPoints it simply returns whatever mount points have been initialized so far? How many unique mount points could there be in total with regex-based links in practice (resolved mappings)? We should document that, with regex-based mount points, getMountPoints will return only the currently resolved mount points.

That totally makes sense. I agree it won't work well with delegation tokens right now. Some context: the regex mount point feature was built more than two years ago, before the delegation token support was introduced (HADOOP-14556), and the version we are running also predates HADOOP-14556. I agree we need to do more work to support delegation tokens. +1 on documenting it.
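To make the "return only resolved mount points" idea concrete, here is a minimal sketch of how the resolved regex targets could be tracked and surfaced; the class and method names (ResolvedRegexMounts, record, snapshot) are hypothetical, not part of this patch:

```
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.fs.FileSystem;

// Hypothetical helper: remembers every target FileSystem a regex mount
// point has resolved so far, so ViewFileSystem#getChildFileSystems (and
// token collection built on top of it) can see them.
class ResolvedRegexMounts {
  private final Map<URI, FileSystem> resolved = new ConcurrentHashMap<>();

  // Called from the regex link resolution path after a target fs is created.
  void record(URI targetUri, FileSystem targetFs) {
    resolved.putIfAbsent(targetUri, targetFs);
  }

  // Snapshot of the filesystems resolved so far; note this is only what
  // has been touched, not everything the regex could ever match.
  List<FileSystem> snapshot() {
    return new ArrayList<>(resolved.values());
  }
}
```

With something like this, getChildFileSystems could append snapshot() to the children it already collects, and the documented caveat still holds: callers only see what has been resolved so far.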
Internally, we use a mixed model of regex mount points and normal mount points; we mainly use regex mount points for GCS buckets. The difference from HDFS is that we could use a limited number of user namespaces to represent /user, but that's hard to do for cloud storage because we have many more buckets. We could make the pros and cons of regex mount points clear to users.

> > We did see an issue with addDelegationTokens in a secure Hadoop cluster, but the problem we met was that not all normal mount points are secure, so the API caused a problem when it tried to initialize all the children's file systems. We took a workaround by making it path-based. As for getDelegationTokens, I guess the problem is similar; we didn't see issues only because it isn't used. Could we make it path-based too?
>
> Certainly we can make it URI-path-based. However, users need to make use of it, and it could be a long-term improvement, because users would not immediately switch to the new APIs we introduce now; it will take a longer time for upstream projects to change.
>
> > Could we make the inner cache a thread-safe structure and track all the opened file systems under regex mount points?
>
> Let's target solving this problem first. Yes, I think maintaining the initialized filesystems in InnerCache could help close file systems correctly. Let's make it thread-safe and add the opened filesystems there.

Thanks, let me fix that first. Do you think it makes sense to fix the close problem first and address the other issues in separate PRs with subtasks?
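For reference, here is a minimal sketch of the kind of thread-safe tracking discussed above; the class and method names (RegexInnerCache, getOrCreate, closeAll) are illustrative, not the actual change:

```
import java.io.IOException;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Illustrative sketch only: a thread-safe cache that remembers every
// FileSystem opened under regex mount points so close() can release them.
class RegexInnerCache {
  private final Map<URI, FileSystem> cache = new ConcurrentHashMap<>();

  FileSystem getOrCreate(URI uri, Configuration conf) throws IOException {
    FileSystem fs = cache.get(uri);
    if (fs != null) {
      return fs;
    }
    FileSystem created = FileSystem.newInstance(uri, conf);
    FileSystem prev = cache.putIfAbsent(uri, created);
    if (prev != null) {
      // Another thread won the race; close our instance and reuse theirs.
      created.close();
      return prev;
    }
    return created;
  }

  // What ViewFileSystem#close() would delegate to: release every fs opened.
  void closeAll() throws IOException {
    for (FileSystem fs : cache.values()) {
      fs.close();
    }
    cache.clear();
  }
}
```

ConcurrentHashMap#putIfAbsent keeps the cache consistent without a global lock, which is the thread-safety property the comment above asks for.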