[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-03-13 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-13108:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7240
   Status: Resolved  (was: Patch Available)

[~elek] Thanks for the contribution. I have committed this to the feature 
branch.

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-13108-HDFS-7240.001.patch, 
> HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch, 
> HDFS-13108-HDFS-7240.005.patch, HDFS-13108-HDFS-7240.006.patch, 
> HDFS-13108-HDFS-7240.007.patch
>
>
> A. Current state
>  
> 1. The datanode host, bucket, and volume must be defined in the defaultFS 
> (e.g. o3://datanode:9864/test/bucket1).
> 2. The root of the file system points to the bucket (e.g. 'dfs -ls /' lists 
> all the keys in bucket1).
> This works very well, but there are some limitations.
> B. Problem one
> The current code doesn't support fully qualified locations. For example, 
> 'dfs -ls o3://datanode:9864/test/bucket1/dir1' does not work.
> C. Problem two
> I tried to fix the previous problem, but it's not trivial. The biggest 
> problem is the Path.makeQualified call, which transforms an unqualified 
> URL into a qualified one. It is part of Path.java, so it is common to all 
> the Hadoop file systems.
> The current implementation qualifies a URL by keeping the scheme 
> (e.g. o3://) and authority (e.g. datanode:9864) from the defaultFS and 
> using the relative path as the end of the qualified URL. For example, 
> makeQualified(defaultUri=o3://datanode:9864/test/bucket1, path=dir1/file) 
> returns o3://datanode:9864/dir1/file, which is obviously wrong (the correct 
> result would be o3://datanode:9864/TEST/BUCKET1/dir1/file). I tried a 
> workaround using a custom makeQualified in the Ozone code. It worked from 
> the command line but not with Spark, which uses the Hadoop API and the 
> original makeQualified.
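
To make the failure mode in Problem two concrete, here is a minimal sketch in plain Java. It is a simplified model of the behavior described above, not Hadoop's actual Path code: the class QualifySketch and the method qualify are hypothetical names, and the sketch ignores the working directory, as the example in the description does.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Simplified model of the qualification described above; NOT Hadoop's Path class. */
final class QualifySketch {

    /**
     * Builds a qualified URI from the scheme and authority of defaultUri
     * plus the relative path. The path part of defaultUri is never consulted,
     * which is the limitation the issue describes.
     */
    static URI qualify(URI defaultUri, String relativePath) {
        try {
            return new URI(defaultUri.getScheme(), defaultUri.getAuthority(),
                    "/" + relativePath, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Old style: volume and bucket live in the path and are lost.
        URI oldStyle = URI.create("o3://datanode:9864/test/bucket1");
        System.out.println(qualify(oldStyle, "dir1/file"));

        // Proposed style: bucket and volume live in the authority and survive.
        URI newStyle = URI.create("o3://bucket1.test/");
        System.out.println(qualify(newStyle, "dir1/file"));
    }
}
```

With the bucket and volume moved into the authority, the same qualification step keeps them, which is why a scheme of that shape needs no change to the shared Path code.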
> D. Solution
> We should support makeQualified calls, so that we can use any path in the 
> defaultFS.
>  
> I propose a simplified scheme: o3://bucket.volume/ 
> This is similar to the s3a format, where the pattern is s3a://bucket.region/ 
> We don't need to set the hostname of the datanode (or the KSM in case of 
> service discovery) in the URL, but it would be configurable with additional 
> Hadoop configuration values such as 
> fs.o3.bucket.bucketname.volumename.address=http://datanode:9864 
> (this is how s3a works today, as far as I know).
> We also need to define restrictions on volume names (in our case a volume 
> name must no longer contain a dot).
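
As a rough illustration of the proposal, the sketch below splits an o3://bucket.volume/ authority and derives the per-bucket address key. Everything here is an assumption for illustration: O3AuthoritySketch, bucketAndVolume and addressKey are hypothetical names, the key pattern simply follows the example in the description, and the committed patch may do this differently.

```java
import java.net.URI;

/** Hypothetical helper for the proposed o3://bucket.volume/ form. */
final class O3AuthoritySketch {

    /** Returns {bucket, volume}; rejects authorities that are not exactly two dot-separated names. */
    static String[] bucketAndVolume(URI uri) {
        String authority = uri.getAuthority();
        if (authority == null) {
            throw new IllegalArgumentException("missing authority: " + uri);
        }
        // A single dot is the separator, so neither name may contain a dot itself.
        String[] parts = authority.split("\\.");
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("expected o3://bucket.volume/, got: " + uri);
        }
        return parts;
    }

    /** Builds the configuration key named in the proposal (pattern assumed from the description). */
    static String addressKey(String bucket, String volume) {
        return "fs.o3.bucket." + bucket + "." + volume + ".address";
    }

    public static void main(String[] args) {
        String[] bv = bucketAndVolume(URI.create("o3://bucket1.test/dir1/file"));
        System.out.println("bucket=" + bv[0] + " volume=" + bv[1]);
        System.out.println(addressKey(bv[0], bv[1]));
    }
}
```

The strict two-part check is also why the names need the dot restriction mentioned above: with a dot inside a name the split would be ambiguous.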
> PS: some Spark output:
> 2018-02-03 18:43:04 WARN  Client:66 - Neither spark.yarn.jars nor 
> spark.yarn.archive is set, falling back to uploading libraries under 
> SPARK_HOME.
> 2018-02-03 18:43:05 INFO  Client:54 - Uploading resource 
> file:/tmp/spark-03119be0-9c3d-440c-8e9f-48c692412ab5/__spark_libs__244044896784490.zip
>  -> 
> o3://datanode:9864/user/hadoop/.sparkStaging/application_1517611085375_0001/__spark_libs__244044896784490.zip
> My defaultFS was o3://datanode:9864/test/bucket1, but Spark qualified the 
> home directory as o3://datanode:9864/user/hadoop, dropping /test/bucket1.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-20 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Attachment: HDFS-13108-HDFS-7240.007.patch

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch, 
> HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch, 
> HDFS-13108-HDFS-7240.005.patch, HDFS-13108-HDFS-7240.006.patch, 
> HDFS-13108-HDFS-7240.007.patch
>
>



[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-20 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Attachment: HDFS-13108-HDFS-7240.006.patch

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch, 
> HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch, 
> HDFS-13108-HDFS-7240.005.patch, HDFS-13108-HDFS-7240.006.patch
>
>



[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-16 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Attachment: (was: HDFS-13108-HDFS-7240.004.patch)

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch, 
> HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch, 
> HDFS-13108-HDFS-7240.005.patch
>
>



[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-16 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Attachment: HDFS-13108-HDFS-7240.005.patch

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch, 
> HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch, 
> HDFS-13108-HDFS-7240.004.patch, HDFS-13108-HDFS-7240.005.patch
>
>



[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-16 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Attachment: HDFS-13108-HDFS-7240.004.patch

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch, 
> HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch, 
> HDFS-13108-HDFS-7240.004.patch
>
>



[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-13 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Attachment: HDFS-13108-HDFS-7240.003.patch

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch, 
> HDFS-13108-HDFS-7240.002.patch, HDFS-13108-HDFS-7240.003.patch
>
>



[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-12 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Attachment: HDFS-13108-HDFS-7240.002.patch

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch, 
> HDFS-13108-HDFS-7240.002.patch
>
>



[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-11 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Status: Patch Available  (was: Open)

I uploaded the patch. The patch itself is very small, but I also modified the 
unit test to run the same tests with and without the defaultFS setting.

Contract tests are also passing, and I successfully submitted a simple Spark 
word count using the o3 file system.

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch
>
>
> A. Current state
>  
> 1. The datanode host / bucket /volume should be defined in the defaultFS (eg. 
>  o3://datanode:9864/test/bucket1)
> 2. The root file system points to the bucket (eg. 'dfs -ls /' lists all the 
> keys from the bucket1)
> It works very well, but there are some limitations.
> B. Problem one 
> The current code doesn't support fully qualified locations. For example 'dfs 
> -ls o3://datanode:9864/test/bucket1/dir1' is not working.
> C.) Problem two
> I tried to fix the previous problem, but it's not trivial. The biggest 
> problem is that there is a Path.makeQualified call which could transform 
> unqualified url to qualified url. This is part of the Path.java so it's 
> common for all the Hadoop file systems.
> In the current implementations it qualifies an url with keeping the schema 
> (eg. o3:// ) and authority (eg: datanode: 9864) from the defaultfs and use 
> the relative path as the end of the qualified url. For example:
> makeQualfied(defaultUri=o3://datanode:9864/test/bucket1, path=dir1/file) will 
> return o3://datanode:9864/dir1/file which is obviously wrong (the good would 
> be o3://datanode:9864/TEST/BUCKET1/dir1/file). I tried to do a workaround 
> with using a custom makeQualified in the Ozone code and it worked from 
> command line but couldn't work with Spark which use the Hadoop api and the 
> original makeQualified path.
> D.) Solution
> We should support makeQualified calls, so we can use any path in the 
> defaultFS.
>  
> I propose a simplified schema: o3://bucket.volume/ 
> This is similar to the s3a format, where the pattern is s3a://bucket.region/ 
> We would not need to set the hostname of the datanode (or of the KSM in case 
> of service discovery) in the URL, but it would be configurable with 
> additional Hadoop configuration values such as 
> fs.o3.bucket.bucketname.volumename.address=http://datanode:9864 
> (this is how s3a works today, as far as I know).
> We also need to define restrictions for volume names (in our case they 
> must no longer contain a dot).
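
As a rough illustration of the proposal, the sketch below splits the authority of an o3://bucket.volume/ URL into its two parts and derives the per-bucket address key in the pattern quoted above. It reflects the proposal text only, not the committed patch; the class O3Address and its methods are hypothetical, and splitting at the last dot is an assumption that follows from the rule that volume names contain no dot.

```java
import java.net.URI;

final class O3Address {
    final String bucket;
    final String volume;

    O3Address(String bucket, String volume) {
        this.bucket = bucket;
        this.volume = volume;
    }

    /** Parses the authority of an o3://bucket.volume/ URI. */
    static O3Address parse(URI uri) {
        if (!"o3".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Not an o3 URI: " + uri);
        }
        String authority = uri.getAuthority();
        if (authority == null) {
            throw new IllegalArgumentException("Missing bucket.volume in " + uri);
        }
        // Volume names must not contain a dot, so the last dot is the separator.
        int dot = authority.lastIndexOf('.');
        if (dot <= 0 || dot == authority.length() - 1) {
            throw new IllegalArgumentException("Expected bucket.volume, got " + authority);
        }
        return new O3Address(authority.substring(0, dot), authority.substring(dot + 1));
    }

    /** Configuration key in the pattern quoted in the proposal. */
    String addressKey() {
        return "fs.o3.bucket." + bucket + "." + volume + ".address";
    }

    public static void main(String[] args) {
        O3Address a = parse(URI.create("o3://bucket1.test/dir1/file"));
        // prints: bucket1 test fs.o3.bucket.bucket1.test.address
        System.out.println(a.bucket + " " + a.volume + " " + a.addressKey());
    }
}
```

With this layout the path component (/dir1/file) is relative to the bucket, so qualifying a relative path against the defaultFS no longer drops the volume or bucket.
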
> PS: some Spark output
> 2018-02-03 18:43:04 WARN  Client:66 - Neither spark.yarn.jars nor 
> spark.yarn.archive is set, falling back to uploading libraries under 
> SPARK_HOME.
> 2018-02-03 18:43:05 INFO  Client:54 - Uploading resource 
> file:/tmp/spark-03119be0-9c3d-440c-8e9f-48c692412ab5/__spark_libs__244044896784490.zip
>  -> 
> o3://datanode:9864/user/hadoop/.sparkStaging/application_1517611085375_0001/__spark_libs__244044896784490.zip
> My default fs was o3://datanode:9864/test/bucket1, but Spark qualified the 
> name of the home directory without the /test/bucket1 prefix.
>  






[jira] [Updated] (HDFS-13108) Ozone: OzoneFileSystem: Simplified url schema for Ozone File System

2018-02-11 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-13108:

Attachment: HDFS-13108-HDFS-7240.001.patch

> Ozone: OzoneFileSystem: Simplified url schema for Ozone File System
> ---
>
> Key: HDFS-13108
> URL: https://issues.apache.org/jira/browse/HDFS-13108
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HDFS-13108-HDFS-7240.001.patch
