[ https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366761#comment-16366761 ]
Elek, Marton commented on HDFS-13108:
-------------------------------------

Hi [~ste...@apache.org],

Thank you very much for your review. I uploaded a patch with the improved version (you can also check the diff between the two patches here: https://github.com/elek/hadoop/commit/75f034a811cb78dba07f2be49356601e30febe85)

OzoneFileSystem:

L95/L304: fixed.

L433: It could not be null, as it is called with the children of a parent path, and that is checked in pathToKey (Objects.requireNonNull). I also added an additional check to the unit test to verify that it works with the root path.

TestOzoneFileInterfaces:

L44, L98, L127, L141: Thanks for the feedback, I fixed them.

L43: I tried to find a written description of the right import order (on the Hadoop wiki and with Google) but couldn't find any reference (please send me an RTFM link if there is one). I improved the import order according to existing classes. The only rule I found is to group the java/hadoop/other imports together (the order of the groups differs even between Namenode.java and Datanode.java). Please let me know what the main rule is and I would be happy to improve it further.

L150: I appreciate your policy; I fixed all the assertTrue calls in TestOzoneFileInterfaces by adding messages to the assertions.

{quote}
Where is the documentation of the URI?
{quote}

It's part of HDFS-12664, which is blocked by HDFS-12734.

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> -------------------------------------------------------------------
>
>                 Key: HDFS-13108
>                 URL: https://issues.apache.org/jira/browse/HDFS-13108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Major
>        Attachments: HDFS-13108-HDFS-7240.001.patch, HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch, HDFS-13108-HDFS-7240.004.patch, HDFS-13108-HDFS-7240.005.patch
>
> A. Current state
>
> 1.
> The datanode host / bucket / volume should be defined in the defaultFS (e.g. o3://datanode:9864/test/bucket1)
> 2. The root of the file system points to the bucket (e.g. 'dfs -ls /' lists all the keys from bucket1)
>
> It works very well, but there are some limitations.
>
> B. Problem one
>
> The current code doesn't support fully qualified locations. For example, 'dfs -ls o3://datanode:9864/test/bucket1/dir1' is not working.
>
> C. Problem two
>
> I tried to fix the previous problem, but it's not trivial. The biggest problem is that there is a Path.makeQualified call which can transform an unqualified url into a qualified url. This is part of Path.java, so it's common to all the Hadoop file systems.
>
> In the current implementation it qualifies a url by keeping the schema (e.g. o3://) and authority (e.g. datanode:9864) from the defaultFS and using the relative path as the end of the qualified url. For example, makeQualified(defaultUri=o3://datanode:9864/test/bucket1, path=dir1/file) will return o3://datanode:9864/dir1/file, which is obviously wrong (the correct result would be o3://datanode:9864/TEST/BUCKET1/dir1/file). I tried a workaround using a custom makeQualified in the Ozone code, and it worked from the command line, but it couldn't work with Spark, which uses the Hadoop API and the original makeQualified path.
>
> D. Solution
>
> We should support makeQualified calls, so we can use any path in the defaultFS.
>
> I propose to use a simplified schema such as o3://bucket.volume/
>
> This is similar to the s3a format, where the pattern is s3a://bucket.region/
>
> We don't need to set the hostname of the datanode (or of the KSM, in case of service discovery), but it would be configurable with additional hadoop configuration values such as fs.o3.bucket.bucketname.volumename.address=http://datanode:9864 (this is how s3a works today, as far as I know).
>
> We also need to define restrictions for the volume names (in our case they should not include dots any more).
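To illustrate the proposed schema, here is a minimal, hypothetical sketch (not code from the patch) of how an o3://bucket.volume/ authority could be split. It assumes the bucket precedes the volume and that the split happens at the last dot, which is consistent with the restriction above that volume names may no longer contain dots; the class and method names are invented for this example.

```java
import java.net.URI;

// Hypothetical sketch of the proposed o3://bucket.volume/ schema: the
// authority is split at the last dot, so the volume name (the part after
// the last dot) must not contain dots, while the bucket name still may.
public class O3UriSketch {

    /** Returns {bucket, volume} parsed from an o3:// location. */
    public static String[] bucketAndVolume(String location) {
        URI uri = URI.create(location);
        String authority = uri.getAuthority();
        if (authority == null || authority.lastIndexOf('.') < 0) {
            throw new IllegalArgumentException(
                "Expected o3://bucket.volume/..., got: " + location);
        }
        int dot = authority.lastIndexOf('.');
        return new String[] {
            authority.substring(0, dot),   // bucket, e.g. "bucket1"
            authority.substring(dot + 1)   // volume, e.g. "test"
        };
    }

    public static void main(String[] args) {
        String[] parts = bucketAndVolume("o3://bucket1.test/dir1/file");
        System.out.println(parts[0] + " / " + parts[1]);  // bucket1 / test
    }
}
```

With an authority of this shape, makeQualified can keep the defaultFS scheme and authority unchanged, because the bucket and volume are no longer part of the path.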
> ps: some spark output
>
> 2018-02-03 18:43:04 WARN Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
> 2018-02-03 18:43:05 INFO Client:54 - Uploading resource file:/tmp/spark-03119be0-9c3d-440c-8e9f-48c692412ab5/__spark_libs__2440448967844904444.zip -> o3://datanode:9864/user/hadoop/.sparkStaging/application_1517611085375_0001/__spark_libs__2440448967844904444.zip
>
> My default fs was o3://datanode:9864/test/bucket1, but spark qualified the name of the home directory.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)