[ https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366761#comment-16366761 ]
Elek, Marton commented on HDFS-13108:
-------------------------------------

Hi [~ste...@apache.org],

Thank you very much for your review. I uploaded a patch with the improved version (you can also check the diff between the two patches here: https://github.com/elek/hadoop/commit/75f034a811cb78dba07f2be49356601e30febe85)

OzoneFileSystem:

L95/L304: fixed.

L433: It could not be null, as it is called with the children of a parent path, and that is checked in pathToKey (Objects.requireNonNull). I also added an additional check to the unit test to verify that it works with the root path.

TestOzoneFileInterfaces:

L44, L98, L127, L141: Thanks for the feedback, I fixed them.

L43: I tried to find a written description of the right import order (on the Hadoop wiki and with Google) but couldn't find any reference (please send me an RTFM link if there is one). I improved the import order according to existing classes. The only rule I found is to group the java/hadoop/other imports together (the order of the groups differs even between Namenode.java and Datanode.java). Please let me know what the main rule is and I would be happy to improve it further.

L150: I appreciate your policy; I fixed all the assertTrue calls in TestOzoneFileInterfaces by adding messages to the assertions.

{quote}
Where is the documentation of the URI?
{quote}

It's part of HDFS-12664, which is blocked by HDFS-12734.

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> -------------------------------------------------------------------
>
>                 Key: HDFS-13108
>                 URL: https://issues.apache.org/jira/browse/HDFS-13108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Major
>        Attachments: HDFS-13108-HDFS-7240.001.patch, HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch, HDFS-13108-HDFS-7240.004.patch, HDFS-13108-HDFS-7240.005.patch
>
> A. Current state
>
> 1.
> The datanode host / bucket / volume should be defined in the defaultFS (e.g. o3://datanode:9864/test/bucket1)
> 2. The root of the file system points to the bucket (e.g. 'dfs -ls /' lists all the keys from bucket1)
>
> It works very well, but there are some limitations.
>
> B. Problem one
>
> The current code doesn't support fully qualified locations. For example, 'dfs -ls o3://datanode:9864/test/bucket1/dir1' is not working.
>
> C. Problem two
>
> I tried to fix the previous problem, but it's not trivial. The biggest problem is that there is a Path.makeQualified call which can transform an unqualified url into a qualified url. This is part of Path.java, so it's common to all the Hadoop file systems.
>
> In the current implementation it qualifies a url by keeping the schema (e.g. o3://) and authority (e.g. datanode:9864) from the defaultFS and using the relative path as the end of the qualified url. For example, makeQualified(defaultUri=o3://datanode:9864/test/bucket1, path=dir1/file) will return o3://datanode:9864/dir1/file, which is obviously wrong (the correct result would be o3://datanode:9864/TEST/BUCKET1/dir1/file). I tried a workaround using a custom makeQualified in the Ozone code, and it worked from the command line, but it couldn't work with Spark, which uses the Hadoop API and the original makeQualified path.
>
> D. Solution
>
> We should support makeQualified calls, so we can use any path in the defaultFS.
>
> I propose to use a simplified schema such as o3://bucket.volume/
>
> This is similar to the s3a format, where the pattern is s3a://bucket.region/
>
> We don't need to set the hostname of the datanode (or of the KSM, in case of service discovery), but it would be configurable with additional hadoop configuration values such as fs.o3.bucket.bucketname.volumename.address=http://datanode:9864 (this is how s3a works today, as far as I know).
>
> We also need to define restrictions for the volume names (in our case they should not include dots any more).
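To illustrate the proposed schema, here is a minimal, hypothetical sketch (not code from the patch) of how an o3://bucket.volume/ authority could be split. It assumes the bucket precedes the volume and that the split happens at the last dot, which is consistent with the restriction above that volume names may no longer contain dots; the class and method names are invented for this example.

```java
import java.net.URI;

// Hypothetical sketch of the proposed o3://bucket.volume/ schema: the
// authority is split at the last dot, so the volume name (the part after
// the last dot) must not contain dots, while the bucket name still may.
public class O3UriSketch {

    /** Returns {bucket, volume} parsed from an o3:// location. */
    public static String[] bucketAndVolume(String location) {
        URI uri = URI.create(location);
        String authority = uri.getAuthority();
        if (authority == null || authority.lastIndexOf('.') < 0) {
            throw new IllegalArgumentException(
                "Expected o3://bucket.volume/..., got: " + location);
        }
        int dot = authority.lastIndexOf('.');
        return new String[] {
            authority.substring(0, dot),   // bucket, e.g. "bucket1"
            authority.substring(dot + 1)   // volume, e.g. "test"
        };
    }

    public static void main(String[] args) {
        String[] parts = bucketAndVolume("o3://bucket1.test/dir1/file");
        System.out.println(parts[0] + " / " + parts[1]);  // bucket1 / test
    }
}
```

With an authority of this shape, makeQualified can keep the defaultFS scheme and authority unchanged, because the bucket and volume are no longer part of the path.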
> ps: some spark output
>
> 2018-02-03 18:43:04 WARN Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
> 2018-02-03 18:43:05 INFO Client:54 - Uploading resource file:/tmp/spark-03119be0-9c3d-440c-8e9f-48c692412ab5/__spark_libs__2440448967844904444.zip -> o3://datanode:9864/user/hadoop/.sparkStaging/application_1517611085375_0001/__spark_libs__2440448967844904444.zip
>
> My default fs was o3://datanode:9864/test/bucket1, but spark qualified the name of the home directory.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)