[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840561#comment-17840561 ]

ASF GitHub Bot commented on HADOOP-19085:
-----------------------------------------

hadoop-yetus commented on PR #6602:
URL: https://github.com/apache/hadoop/pull/6602#issuecomment-2075764416

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| -1 :x: | patch | 1m 04s | | https://github.com/apache/hadoop/pull/6602 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help. |

| Subsystem | Report/Notes |
|----------:|:-------------|
| GITHUB PR | https://github.com/apache/hadoop/pull/6602 |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6602/1/console |
| versions | git=2.44.0.windows.1 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> Compatibility Benchmark over HCFS Implementations
> -------------------------------------------------
>
>                 Key: HADOOP-19085
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19085
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, test
>    Affects Versions: 3.4.0
>            Reporter: Han Liu
>            Assignee: Han Liu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
>         Attachments: HADOOP-19085.001.patch, HDFS Compatibility Benchmark
> Design.pdf
>
>
> {*}Background:{*} Hadoop-Compatible File System (HCFS) is a core concept
> in the big data storage ecosystem. It provides unified interfaces and
> generally clear semantics, and has become the de facto standard for
> industry storage systems to follow and conform to. Hadoop already includes
> a series of HCFS implementations, such as S3AFileSystem for Amazon's S3
> Object Store, WASB for Microsoft's Azure Blob Storage, and the OSS
> connector for Alibaba Cloud Object Storage, and storage service providers
> maintain more of their own.
> {*}Problems:{*} However, as indicated by introduction.md, there is no
> formal suite for assessing the compatibility of a file system across all
> such HCFS implementations. Whether the functionality is complete and meets
> the core compatibility expectations therefore relies mainly on the service
> provider's own report. Meanwhile, Hadoop itself keeps evolving, and new
> features are continuously contributed to the HCFS interfaces for existing
> implementations to follow and update. Hadoop therefore also needs a tool
> to quickly assess whether these features are supported by a specific HCFS
> implementation.
> Besides, the familiar hadoop command line tool, or hdfs shell, is used to
> interact directly with an HCFS storage system; most commands correspond to
> specific HCFS interfaces and work well. Still, some cases are complicated
> and may not work, such as the expunge command. We also need a way to check
> such commands for an HCFS.
> {*}Proposal:{*} Accordingly, we propose to define a formal HCFS
> compatibility benchmark and provide a corresponding tool to assess the
> compatibility of an HCFS storage system. The benchmark and tool should
> consider both HCFS interfaces and hdfs shell commands. Different scenarios
> require different kinds of compatibility, so we could define different
> suites in the benchmark.
> {*}Benefits:{*} We intend the benchmark and tool to be useful for both
> storage providers and storage users. End users can use it to evaluate the
> compatibility level and determine whether the storage system in question
> is suitable for the required scenarios. Storage providers can use it to
> quickly generate an objective and reliable report about the core functions
> of the storage service. For instance, if the HCFS scores 100% on a suite
> named 'tpcds', this demonstrates that all functions needed by a tpcds
> program are implemented. It is also a guide indicating how storage service
> abilities can map to HCFS interfaces, such as storage class on S3.
> Any thoughts? Comments and feedback are most welcome. Thanks in advance.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
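The proposal in the issue description groups benchmark cases into suites and scores a file system per suite. A minimal sketch of that idea follows. Everything in it is hypothetical and for illustration only: the `Suite` type, the `score` method, the suite name "scan", and the case names are assumptions, not the API or case list of the attached patch.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SuiteSketch {
    /** A named group of case identifiers that one scenario requires (hypothetical type). */
    static final class Suite {
        final String name;
        final Set<String> cases;

        Suite(String name, List<String> cases) {
            this.name = name;
            this.cases = new LinkedHashSet<>(cases);
        }

        /** Fraction of this suite's cases found in the set of passed cases. */
        double score(Set<String> passed) {
            if (cases.isEmpty()) {
                return 0.0; // an empty suite has nothing to demonstrate
            }
            long ok = cases.stream().filter(passed::contains).count();
            return (double) ok / cases.size();
        }
    }

    public static void main(String[] args) {
        // Hypothetical outcome of running every case against one file system.
        Set<String> passed = Set.of("mkdirs", "create", "open", "rename");

        // Two hypothetical suites: one covering everything, one for a narrow scenario.
        Suite all = new Suite("ALL",
            List.of("mkdirs", "create", "open", "rename", "append", "truncate"));
        Suite scan = new Suite("scan", List.of("open", "create"));

        System.out.println(all.name + ": " + Math.round(100 * all.score(passed)) + "%");
        System.out.println(scan.name + ": " + Math.round(100 * scan.score(passed)) + "%");
    }
}
```

Under these assumptions a file system can score 100% on a narrow suite while scoring lower on the full one, which is the distinction the 'tpcds' example in the description relies on.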
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839704#comment-17839704 ]

Steve Loughran commented on HADOOP-19085:
-----------------------------------------

that's really interesting. abfs has full filesystem semantics; s3 doesn't and we always trade off correctness for performance.

* can you attach the results?
* regarding other connectors, gcs is the obvious one
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839403#comment-17839403 ]

Han Liu commented on HADOOP-19085:
----------------------------------

Sorry for the late reply.

I ran the benchmark tool against S3A and ABFS. The results show that S3A passed 140 cases out of 222, and ABFS passed 155 cases. I am using Hadoop 3.2.1 as the baseline.

The result for S3A:
{quote}Hadoop Compatibility Report for ALL:
63.06%, PASSED 140 OVER 222
URI: (suite: org.apache.hadoop.fs.compat.suites.HdfsCompatSuiteForAll)
Hadoop Version as Baseline: 3.2.1
{quote}
The result for ABFS (Azure Data Lake Storage Gen2):
{quote}Hadoop Compatibility Report for ALL:
69.82%, PASSED 155 OVER 222
URI: (suite: org.apache.hadoop.fs.compat.suites.HdfsCompatSuiteForAll)
Hadoop Version as Baseline: 3.2.1
{quote}
I am planning to run the benchmark against more file systems. Any suggestions? [~ste...@apache.org]
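The percentages in the two reports for S3A and ABFS are consistent with a plain passed/total ratio rounded to two decimals. The sketch below checks that arithmetic; the class and method names are hypothetical, and the rounding rule is an assumption that happens to match the reported figures, not a statement about how the tool computes them.

```java
import java.util.Locale;

public class CompatScore {
    /** Pass rate as a percentage with two decimals, e.g. 140 of 222 cases. */
    static String passRate(int passed, int total) {
        if (total <= 0 || passed < 0 || passed > total) {
            throw new IllegalArgumentException("invalid counts: " + passed + "/" + total);
        }
        // Locale.ROOT keeps the decimal separator a '.' on every machine.
        return String.format(Locale.ROOT, "%.2f%%", 100.0 * passed / total);
    }

    public static void main(String[] args) {
        System.out.println("S3A:  " + passRate(140, 222)); // prints "S3A:  63.06%"
        System.out.println("ABFS: " + passRate(155, 222)); // prints "ABFS: 69.82%"
    }
}
```

The gap between the two connectors is 15 cases out of 222, a little under 7 percentage points.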
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828279#comment-17828279 ]

Steve Loughran commented on HADOOP-19085:
-----------------------------------------

One thing I'd like to say is: what are the compatibility reports so far?
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828278#comment-17828278 ]

Steve Loughran commented on HADOOP-19085:
-----------------------------------------

bq. The patch was just committed

I see that. This JIRA should be closed as fixed, and the new work split out as top-level issues or moved under a new uber-JIRA covering the work.
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827829#comment-17827829 ]

Han Liu commented on HADOOP-19085:
----------------------------------

Many thanks for the code review and the final commit of the code! [~drankye]

I created a new issue for further improvements of the benchmark. Improvements may include suggestions that have not been completed yet, such as PathCapabilities in the final report, Windows support, etc. We can continue the discussion and future work there. [~ste...@apache.org] [~drankye]

Link: https://issues.apache.org/jira/browse/HADOOP-19113
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827774#comment-17827774 ]

Kai Zheng commented on HADOOP-19085:
------------------------------------

For the valuable suggestions mentioned in the discussion, I suggest we follow up in separate issues, [~han.liu].
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827773#comment-17827773 ]

Kai Zheng commented on HADOOP-19085:
------------------------------------

The patch was just committed. Thanks [~han.liu] for the contribution, and [~ste...@apache.org] for the guidance!
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825136#comment-17825136 ]

Kai Zheng commented on HADOOP-19085:

Thanks [~han.liu] for the update. The latest code LGTM, +1. Would anyone else take a look?

> Compatibility Benchmark over HCFS Implementations
> -------------------------------------------------
>
>                 Key: HADOOP-19085
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19085
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Han Liu
>            Assignee: Han Liu
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDFS Compatibility Benchmark Design.pdf
>
>
> {*}Background:{*} Hadoop-Compatible File System (HCFS) is a core concept in
> the big data storage ecosystem, providing unified interfaces and generally
> clear semantics, and has become the de facto standard for industry storage
> systems to follow and conform to. There have been a series of HCFS
> implementations in Hadoop, such as S3AFileSystem for Amazon's S3 Object
> Store, WASB for Microsoft's Azure Blob Storage, and the OSS connector for
> Alibaba Cloud Object Storage, and more from storage service providers on
> their own.
> {*}Problems:{*} However, as indicated by introduction.md, there is no formal
> suite for assessing the compatibility of a file system across all such HCFS
> implementations. Thus, whether the functionality is well accomplished and
> meets the core compatibility expectations mainly relies on the service
> provider's own report. Meanwhile, Hadoop is also evolving, and new features
> are continuously being contributed to the HCFS interfaces for existing
> implementations to follow and update, in which case Hadoop also needs a tool
> to quickly assess whether these features are supported by a specific HCFS
> implementation. Besides, the well-known hadoop command line tool, or hdfs
> shell, is used to interact directly with an HCFS storage system, where most
> commands correspond to specific HCFS interfaces and work well. Still, there
> are cases that are complicated and may not work, such as the expunge command.
> To check such commands for an HCFS, we also need an approach to figure them
> out.
> {*}Proposal:{*} Accordingly, we propose to define a formal HCFS compatibility
> benchmark and provide a corresponding tool to do the compatibility assessment
> for an HCFS storage system. The benchmark and tool should consider both HCFS
> interfaces and hdfs shell commands. Different scenarios require different
> kinds of compatibility. To account for this, we could define different
> suites in the benchmark.
> *Benefits:* We intend the benchmark and tool to be useful for both storage
> providers and storage users. For end users, it can be used to evaluate the
> compatibility level and determine whether the storage system in question is
> suitable for the required scenarios. For storage providers, it helps to
> quickly generate an objective and reliable report about the core functions of
> the storage service. For instance, if the HCFS scores 100% on a suite named
> 'tpcds', this demonstrates that all functions needed by a tpcds program are
> supported. It is also a guide indicating how storage service abilities can
> map to HCFS interfaces, such as storage class on S3.
> Any thoughts? Comments and feedback are most welcome. Thanks in advance.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
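The suite-scoring idea from the Benefits paragraph of the issue description (a suite such as 'tpcds' reaching 100% when every function it needs is supported) can be illustrated with a minimal sketch. This is not the tool from PR #6602: the `score_suite` helper, the case names, and the stand-in checks are all hypothetical, and a real case would exercise an actual HCFS implementation rather than a stub.

```python
from typing import Callable, Dict


def score_suite(cases: Dict[str, Callable[[], None]]) -> float:
    """Run every case and return the percentage of cases that passed.

    A case passes if it returns without raising an exception.
    An empty suite scores 0.0 (an assumption made for this sketch).
    """
    if not cases:
        return 0.0
    passed = 0
    for name, case in cases.items():
        try:
            case()
            passed += 1
        except Exception as exc:  # a failing case must not abort the suite
            print(f"FAILED {name}: {exc}")
    return 100.0 * passed / len(cases)


# Hypothetical stand-ins for checks against a real file system.
def check_mkdir() -> None:
    pass  # pretend the operation is supported


def check_storage_policy() -> None:
    raise NotImplementedError("storage policy not supported")


# Hypothetical suite: the operations one workload is assumed to need.
example_suite = {"mkdir": check_mkdir, "storagePolicy": check_storage_policy}

if __name__ == "__main__":
    print(f"{score_suite(example_suite):.0f}%")
```

Under these assumptions, a storage system that supports only one of the two operations in `example_suite` scores 50%, while a suite whose cases all pass scores 100%.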
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825034#comment-17825034 ]

ASF GitHub Bot commented on HADOOP-19085:

hadoop-yetus commented on PR #6602:
URL: https://github.com/apache/hadoop/pull/6602#issuecomment-1987194360

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 17m 37s | | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 1s | | xmllint was not available. |
| +0 :ok: | shelldocs | 0m 1s | | Shelldocs was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 13 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 44m 0s | | trunk passed |
| +1 :green_heart: | compile | 1m 26s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 14s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 52s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 17s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 3s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 47s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 4m 42s | | trunk passed |
| +1 :green_heart: | shadedclient | 36m 28s | | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 37s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 46s | | the patch passed |
| +1 :green_heart: | compile | 1m 21s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 21s | | the patch passed |
| +1 :green_heart: | compile | 1m 7s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 7s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 40s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 33s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 1s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 14s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 2m 4s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 5m 36s | | the patch passed |
| +1 :green_heart: | shadedclient | 36m 30s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ |
| +1 :green_heart: | unit | 4m 49s | | hadoop-compat-bench in the patch passed. |
| +1 :green_heart: | unit | 67m 19s | | hadoop-tools in the patch passed. |
| +1 :green_heart: | asflicense | 0m 44s | | The patch does not generate ASF License warnings. |
| | | 235m 10s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6602/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6602 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint shellcheck shelldocs spotbugs checkstyle markdownlint |
| uname | Linux 4ceab242 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 65ed74cd26bee1e20a368d3b4e68a116d66a7b8c |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825006#comment-17825006 ]

Han Liu commented on HADOOP-19085:
----------------------------------

Thanks [~drankye] for the very careful code review. I found a wrong message in HdfsCompatTool.DESCRIPTION with the command
{quote}grep "hdfs compatibility"{quote}
and have fixed it. Any comments are welcome!
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824611#comment-17824611 ]

Kai Zheng commented on HADOOP-19085:

[~han.liu] Good work! Just one minor point: would you grep the PR code for "hdfs compatibility" and refine the matches a bit? Thanks!
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824114#comment-17824114 ]

ASF GitHub Bot commented on HADOOP-19085:

hadoop-yetus commented on PR #6602:
URL: https://github.com/apache/hadoop/pull/6602#issuecomment-1981500882

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 19m 3s | | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 13 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 49m 24s | | trunk passed |
| +1 :green_heart: | compile | 1m 34s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 10s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 49s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 17s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 58s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 40s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 4m 47s | | trunk passed |
| +1 :green_heart: | shadedclient | 41m 34s | | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 36s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 44s | | the patch passed |
| +1 :green_heart: | compile | 1m 24s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 24s | | the patch passed |
| +1 :green_heart: | compile | 1m 3s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 3s | | the patch passed |
| +1 :green_heart: | blanks | 0m 1s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 43s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 31s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 0s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 11s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 54s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 5m 33s | | the patch passed |
| +1 :green_heart: | shadedclient | 41m 38s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ |
| +1 :green_heart: | unit | 4m 17s | | hadoop-compat-bench in the patch passed. |
| +1 :green_heart: | unit | 65m 13s | | hadoop-tools in the patch passed. |
| +1 :green_heart: | asflicense | 0m 40s | | The patch does not generate ASF License warnings. |
| | | 249m 28s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6602/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6602 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint shellcheck shelldocs spotbugs checkstyle markdownlint |
| uname | Linux 5cd5f9c2412e 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / e6cb2584686dc4f506ce8b14552f04b2cbbd2d51 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824107#comment-17824107 ]

ASF GitHub Bot commented on HADOOP-19085:

hadoop-yetus commented on PR #6602:
URL: https://github.com/apache/hadoop/pull/6602#issuecomment-1981464918

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 18m 6s | | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +0 :ok: | shelldocs | 0m 1s | | Shelldocs was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 1s | | The patch appears to include 13 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 43m 43s | | trunk passed |
| +1 :green_heart: | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 13s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 49s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 16s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 4s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 47s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 4m 45s | | trunk passed |
| +1 :green_heart: | shadedclient | 36m 29s | | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 36s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 46s | | the patch passed |
| +1 :green_heart: | compile | 1m 20s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 20s | | the patch passed |
| +1 :green_heart: | compile | 1m 7s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 7s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 41s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 34s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 1s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 15s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 2m 3s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 5m 33s | | the patch passed |
| +1 :green_heart: | shadedclient | 36m 51s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ |
| +1 :green_heart: | unit | 4m 38s | | hadoop-compat-bench in the patch passed. |
| +1 :green_heart: | unit | 67m 19s | | hadoop-tools in the patch passed. |
| +1 :green_heart: | asflicense | 0m 44s | | The patch does not generate ASF License warnings. |
| | | 235m 58s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6602/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6602 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint shellcheck shelldocs spotbugs checkstyle markdownlint |
| uname | Linux 43aa7709eb3e 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / e6cb2584686dc4f506ce8b14552f04b2cbbd2d51 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824022#comment-17824022 ]

Han Liu commented on HADOOP-19085:
----------------------------------

Thanks a lot for detailed comments from [~drankye] All suggestions are carefully considered. A new version of code has been finished, triggering a new round of check. Any thoughts are welcome! [~ste...@apache.org] [~drankye]
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823076#comment-17823076 ]

Kai Zheng commented on HADOOP-19085:

Hi [~han.liu] The work looks great overall. Some comments follow and would you help check? Thanks!

1. About the document HdfsCompatBenchIssue.md, maybe we could reshape it as the user doc like we have for other hadoop-tools modules.
2. For most classes in the package org/apache/hadoop/fs/compat, we could have a sub package like 'common' for them, so we could leave the main package much cleaner.
3. The tool (HdfsCompatibility.java) could be 'HdfsCompatTool'?
4. HdfsCompatFileSystemImpl could be 'HdfsCompatBasics' and moved to stay along with the function cases together. Actually we could put all the HCFS API cases together directly under the 'cases' package.
5. We could rename 'modification.t' to 'write.t', and have another file like 'directory.t' for dir related operations.
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822396#comment-17822396 ]

ASF GitHub Bot commented on HADOOP-19085:
-----------------------------------------

hadoop-yetus commented on PR #6602:
URL: https://github.com/apache/hadoop/pull/6602#issuecomment-1972583892

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 56s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +0 :ok: | xmllint | 0m 1s | | xmllint was not available. |
| +0 :ok: | shelldocs | 0m 1s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 13 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 44m 15s | | trunk passed |
| +1 :green_heart: | compile | 1m 27s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | compile | 1m 12s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 50s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 21s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 2s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 46s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 4m 44s | | trunk passed |
| +1 :green_heart: | shadedclient | 36m 35s | | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ | |
| +0 :ok: | mvndep | 0m 35s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 49s | | the patch passed |
| +1 :green_heart: | compile | 1m 19s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 1m 19s | | the patch passed |
| +1 :green_heart: | compile | 1m 7s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 7s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 41s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 36s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 1s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 15s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 2m 3s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 5m 39s | | the patch passed |
| +1 :green_heart: | shadedclient | 36m 43s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| +1 :green_heart: | unit | 4m 37s | | hadoop-compat-bench in the patch passed. |
| +1 :green_heart: | unit | 66m 38s | | hadoop-tools in the patch passed. |
| +1 :green_heart: | asflicense | 0m 43s | | The patch does not generate ASF License warnings. |
| | | 218m 6s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6602/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6602 |
| Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient xmllint shellcheck shelldocs spotbugs checkstyle |
| uname | Linux 9ea78207b427 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 149cc2315826443e061a4f5bf758e52a56c33b09 |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6602/1/
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822352#comment-17822352 ]

Han Liu commented on HADOOP-19085:
----------------------------------

{quote}Since the Jira has been converted to a Hadoop one, would you help with a new PR to follow that?{quote}

Done. [~drankye] The new PR is "HADOOP-19085 Add hadoop-compat-bench" #6602 ( [https://github.com/apache/hadoop/pull/6602] ). All commits are squashed into one.

> Compatibility Benchmark over HCFS Implementations
> -------------------------------------------------
>
> Key: HADOOP-19085
> URL: https://issues.apache.org/jira/browse/HADOOP-19085
> Project: Hadoop Common
> Issue Type: New Feature
> Reporter: Han Liu
> Assignee: Han Liu
> Priority: Major
> Labels: pull-request-available
> Attachments: HDFS Compatibility Benchmark Design.pdf
>
> {*}Background:{*} Hadoop-Compatible File System (HCFS) is a core concept in
> the big data storage ecosystem, providing unified interfaces and generally clear
> semantics, and has become the de facto standard for industry storage systems
> to follow and conform with. There have been a series of HCFS implementations
> in Hadoop, such as S3AFileSystem for Amazon's S3 Object Store, WASB for
> Microsoft's Azure Blob Storage and the OSS connector for Alibaba Cloud Object
> Storage, and more from storage service providers on their own.
>
> {*}Problems:{*} However, as indicated by introduction.md, there is no formal
> suite to do compatibility assessment of a file system for all such HCFS
> implementations. Thus, whether the functionality is well accomplished and
> meets the core compatibility expectations mainly relies on the service provider's
> own report. Meanwhile, Hadoop is also evolving, and new features are
> continuously contributed to HCFS interfaces for existing implementations to
> follow and update, in which case Hadoop also needs a tool to quickly assess
> whether these features are supported by a specific HCFS implementation.
> Besides, the hadoop command line tool or hdfs shell is used to directly
> interact with an HCFS storage system, where most commands correspond to
> specific HCFS interfaces and work well. Still, there are cases that are
> complicated and may not work, like the expunge command. To check such commands
> for an HCFS, we also need an approach to figure them out.
>
> {*}Proposal:{*} Accordingly, we propose to define a formal HCFS compatibility
> benchmark and provide a corresponding tool to do the compatibility assessment
> for an HCFS storage system. The benchmark and tool should consider both HCFS
> interfaces and hdfs shell commands. Different scenarios require different
> kinds of compatibility. With that in mind, we could define different
> suites in the benchmark.
>
> *Benefits:* We intend the benchmark and tool to be useful for both storage
> providers and storage users. For end users, it can be used to evaluate the
> compatibility level and determine if the storage system in question is
> suitable for the required scenarios. For storage providers, it helps to
> quickly generate an objective and reliable report about core functions of
> the storage service. For instance, if the HCFS scores 100% on a suite named
> 'tpcds', this demonstrates that all functions needed by a tpcds program are
> well supported. It is also a guide indicating how storage service
> abilities can map to HCFS interfaces, such as storage class on S3.
>
> Any thoughts? Comments and feedback are most welcome. Thanks in advance.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
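The suite idea in the description (an HCFS "scoring 100%" on a suite such as 'tpcds') can be pictured with a minimal sketch. This is not the hadoop-compat-bench implementation: the class `SuiteScore`, the method `score`, and the probe names are hypothetical, and each probe is reduced to a pass/fail supplier instead of a real `FileSystem` call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: a suite is a named set of pass/fail probes. */
class SuiteScore {

    /** Runs every probe and returns the percentage that passed (0 for an empty suite). */
    static double score(Map<String, BooleanSupplier> suite) {
        if (suite.isEmpty()) {
            return 0.0;
        }
        int passed = 0;
        for (Map.Entry<String, BooleanSupplier> probe : suite.entrySet()) {
            boolean ok;
            try {
                ok = probe.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                // An unsupported or failing call counts as "not passed".
                ok = false;
            }
            if (ok) {
                passed++;
            }
        }
        return 100.0 * passed / suite.size();
    }

    public static void main(String[] args) {
        // Assumed probe names, standing in for the interfaces a tpcds job would need.
        Map<String, BooleanSupplier> tpcds = new LinkedHashMap<>();
        tpcds.put("open", () -> true);
        tpcds.put("create", () -> true);
        tpcds.put("listStatus", () -> true);
        tpcds.put("setStoragePolicy", () -> { throw new UnsupportedOperationException(); });
        System.out.println(score(tpcds)); // 3 of 4 probes pass, so this prints 75.0
    }
}
```

Treating a thrown exception as a failed probe keeps one unsupported interface from aborting the whole run, so a report can still be produced for the remaining cases.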
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822345#comment-17822345 ]

ASF GitHub Bot commented on HADOOP-19085:
-----------------------------------------

HanFreedom opened a new pull request, #6602:
URL: https://github.com/apache/hadoop/pull/6602

### Description of PR

A new hadoop-compat-bench module introducing a quick HCFS compatibility assessment tool to Hadoop for FileSystem implementations, as described and discussed in HADOOP-19085.

### How was this patch tested?

This is a new, standalone module tested by its own unit tests.

### For code changes:

- [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822342#comment-17822342 ]

ASF GitHub Bot commented on HADOOP-19085:
-----------------------------------------

HanFreedom closed pull request #6535: HDFS-17316 Add hadoop-compat-bench
URL: https://github.com/apache/hadoop/pull/6535
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821983#comment-17821983 ]

ASF GitHub Bot commented on HADOOP-19085:
-----------------------------------------

hadoop-yetus commented on PR #6535:
URL: https://github.com/apache/hadoop/pull/6535#issuecomment-1970586476

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 22s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 13 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 32m 30s | | trunk passed |
| +1 :green_heart: | compile | 0m 50s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | compile | 0m 42s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 29s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 41s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 11s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 2m 54s | | trunk passed |
| +1 :green_heart: | shadedclient | 21m 23s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 21m 35s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ | |
| +0 :ok: | mvndep | 0m 26s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 57s | | the patch passed |
| +1 :green_heart: | compile | 0m 48s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 0m 48s | | the patch passed |
| +1 :green_heart: | compile | 0m 39s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 0m 39s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 20s | [/results-checkstyle-hadoop-tools.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6535/4/artifact/out/results-checkstyle-hadoop-tools.txt) | hadoop-tools: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 :green_heart: | mvnsite | 0m 49s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 0s | | No new issues. |
| +1 :green_heart: | javadoc | 1m 18s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 29s | | the patch passed |
| +1 :green_heart: | shadedclient | 21m 34s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| +1 :green_heart: | unit | 3m 10s | | hadoop-compat-bench in the patch passed. |
| +1 :green_heart: | unit | 58m 13s | | hadoop-tools in the patch passed. |
| +1 :green_heart: | asflicense | 0m 29s | | The patch does not generate ASF License warnings. |
| | | 153m 52s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6535/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6535 |
| Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient xmllint shellcheck shelldocs spotbugs checkstyle |
| uname | Linux a1ca93685f37 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821727#comment-17821727 ]

ASF GitHub Bot commented on HADOOP-19085:
-----------------------------------------

hadoop-yetus commented on PR #6535:
URL: https://github.com/apache/hadoop/pull/6535#issuecomment-1969265732

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 21s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 13 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 32m 14s | | trunk passed |
| +1 :green_heart: | compile | 0m 49s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | compile | 0m 41s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 0m 31s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 45s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 13s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 9s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 2m 52s | | trunk passed |
| +1 :green_heart: | shadedclient | 21m 19s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 21m 32s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ | |
| +0 :ok: | mvndep | 0m 25s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 0m 58s | | the patch passed |
| +1 :green_heart: | compile | 0m 44s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 0m 44s | | the patch passed |
| +1 :green_heart: | compile | 0m 35s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 0m 35s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6535/3/artifact/out/results-checkstyle-hadoop-tools.txt) | hadoop-tools: The patch generated 49 new + 0 unchanged - 0 fixed = 49 total (was 0) |
| +1 :green_heart: | mvnsite | 0m 55s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 0s | | No new issues. |
| +1 :green_heart: | javadoc | 1m 14s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 1m 21s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 19s | | the patch passed |
| +1 :green_heart: | shadedclient | 21m 30s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| +1 :green_heart: | unit | 3m 15s | | hadoop-compat-bench in the patch passed. |
| +1 :green_heart: | unit | 58m 15s | | hadoop-tools in the patch passed. |
| +1 :green_heart: | asflicense | 0m 27s | | The patch does not generate ASF License warnings. |
| | | 153m 35s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6535/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6535 |
| Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient xmllint shellcheck shelldocs spotbugs checkstyle |
| uname | Linux 1ed24bc9ea32 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| gi
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821664#comment-17821664 ]

Han Liu commented on HADOOP-19085:
----------------------------------

Thanks for the comments from [~drankye].

{quote}Let's do in favor of the package "org.apache.hadoop.fs.compat" instead of "org.apache.hadoop.compat" to wrap the new codes since the compat tool focuses on the fs aspect.{quote}

Good suggestion. Done. I also moved the new module hadoop-compat-bench from the top-level hadoop/ directory to hadoop/hadoop-tools/. A new check process has been triggered.
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821490#comment-17821490 ]

ASF GitHub Bot commented on HADOOP-19085:
-----------------------------------------

hadoop-yetus commented on PR #6535:
URL: https://github.com/apache/hadoop/pull/6535#issuecomment-1968100873

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 19s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +0 :ok: | xmllint | 0m 1s | | xmllint was not available. |
| +0 :ok: | shelldocs | 0m 1s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 13 new or modified test files. |
| | | | _ trunk Compile Tests _ | |
| +1 :green_heart: | mvninstall | 32m 27s | | trunk passed |
| +1 :green_heart: | compile | 8m 34s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | compile | 7m 53s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 59s | | trunk passed |
| +1 :green_heart: | mvnsite | 12m 13s | | trunk passed |
| +1 :green_heart: | javadoc | 4m 39s | | trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 4m 59s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| -1 :x: | spotbugs | 17m 35s | [/branch-spotbugs-root-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6535/2/artifact/out/branch-spotbugs-root-warnings.html) | root in trunk has 5 extant spotbugs warnings. |
| +1 :green_heart: | shadedclient | 36m 8s | | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ | |
| +0 :ok: | mvndep | 0m 23s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 17m 11s | | the patch passed |
| +1 :green_heart: | compile | 8m 27s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 8m 27s | | the patch passed |
| +1 :green_heart: | compile | 7m 56s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 7m 56s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 57s | | the patch passed |
| +1 :green_heart: | mvnsite | 7m 58s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 0s | | No new issues. |
| +1 :green_heart: | javadoc | 4m 32s | | the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 5m 1s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 18m 12s | | the patch passed |
| +1 :green_heart: | shadedclient | 19m 13s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| -1 :x: | unit | 631m 23s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6535/2/artifact/out/patch-unit-root.txt) | root in the patch failed. |
| +1 :green_heart: | asflicense | 0m 54s | | The patch does not generate ASF License warnings. |
| | | 836m 44s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.tools.TestDFSAdmin |
| | hadoop.hdfs.server.datanode.TestLargeBlockReport |
| | hadoop.hdfs.protocol.TestBlockListAsLongs |
| | hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6535/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6535 |
| Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient xmllint shellcheck shelldocs spotbugs checkstyle |
| uname | Linux c3a31dc9ba16 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GN
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821332#comment-17821332 ]

Han Liu commented on HADOOP-19085:
----------------------------------

The code has been updated to a new version, and all issues listed in https://github.com/apache/hadoop/pull/6535#issuecomment-1933237018 should have been resolved. A new check process has been triggered but has not completed yet.

Grateful for the comments from [~ste...@apache.org].

{quote}
 * I think it should be a hadoop one for more than hdfs{quote}

Good suggestion. The issue has already been changed to a hadoop-common issue.

{quote}
 * hdfs/webhdfs work well as unit tests for the functionality
 * but can/should also target other stores, with s3a and abfs connectors key ones for me.{quote}

Yes. The cases introduced by the benchmark are designed as probes of the HCFS APIs: they check whether each API is implemented and whether its basic functions work, but they are not suited to taking on the responsibility of quality assurance. Thus, there should be some differences between the benchmark cases and the hdfs/webhdfs unit tests. On the other hand, some of the hdfs/webhdfs unit tests are complex. Targeting them at stores other than HDFS is an essential job that would benefit the Hadoop ecosystem. Maybe we could create a new topic to discuss this change.

{quote}one thing with the contract tests is we need the ability to declare when a store doesn't quite meet expectations. s3a fs lets you create files under files if you try hard; some operations raise different exceptions, permissions may be different. so a design which allows for downgrading is critical{quote}

Good idea. The contract tests are directly related to the core functionality of the most important FileSystem APIs. We can open a new issue to discuss further how expectation violations should be displayed.
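One way to picture the "downgrading" requested in the quoted comment is a probe result with three outcomes, where a store can declare known deviations up front. The sketch below is only an illustration under that assumption. It is neither the hadoop-compat-bench design nor the existing contract-test mechanism, and the names `DowngradableProbe`, `Outcome`, `run`, and the case name `createFileUnderFile` are all hypothetical.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.Callable;

/** Hypothetical sketch: a failed probe is downgraded if the store declared the deviation. */
class DowngradableProbe {

    enum Outcome { PASSED, DOWNGRADED, FAILED }

    /**
     * Runs one probe. A failure is reported as DOWNGRADED rather than FAILED
     * when the case name is listed in the store's declared deviations.
     */
    static Outcome run(String caseName, Callable<Boolean> probe, Set<String> declaredDeviations) {
        boolean ok;
        try {
            ok = Boolean.TRUE.equals(probe.call());
        } catch (Exception e) {
            // Any exception, including an unexpected exception type, counts as a failure.
            ok = false;
        }
        if (ok) {
            return Outcome.PASSED;
        }
        return declaredDeviations.contains(caseName) ? Outcome.DOWNGRADED : Outcome.FAILED;
    }

    public static void main(String[] args) {
        // Assumed declaration: this store is known to allow files under files.
        Set<String> deviations = Collections.singleton("createFileUnderFile");
        Callable<Boolean> failing = () -> { throw new java.io.IOException("unexpected exception type"); };

        System.out.println(run("createFileUnderFile", failing, deviations)); // DOWNGRADED
        System.out.println(run("rename", failing, deviations));              // FAILED
        System.out.println(run("mkdirs", () -> true, deviations));           // PASSED
    }
}
```

Keeping DOWNGRADED distinct from both PASSED and FAILED lets a report show declared deviations without hiding them or counting them as regressions.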
> Compatibility Benchmark over HCFS Implementations
> -
>
> Key: HADOOP-19085
> URL: https://issues.apache.org/jira/browse/HADOOP-19085
> Project: Hadoop Common
> Issue Type: New Feature
> Reporter: Han Liu
> Assignee: Han Liu
> Priority: Major
> Labels: pull-request-available
> Attachments: HDFS Compatibility Benchmark Design.pdf
>
>
> {*}Background:{*} Hadoop-Compatible File System (HCFS) is a core concept in
> the big data storage ecosystem, providing unified interfaces and generally clear
> semantics, and has become the de facto standard for industry storage systems
> to follow and conform to. There have been a series of HCFS implementations
> in Hadoop, such as S3AFileSystem for Amazon's S3 Object Store, WASB for
> Microsoft's Azure Blob Storage and the OSS connector for Alibaba Cloud Object
> Storage, and more from storage service providers on their own.
> {*}Problems:{*} However, as indicated by introduction.md, there is no formal
> suite to do compatibility assessment of a file system for all such HCFS
> implementations. Thus, whether the functionality is well accomplished and
> meets the core compatibility expectations mainly relies on the service provider's
> own report. Meanwhile, Hadoop is also developing, and new features are
> continuously being contributed to the HCFS interfaces for existing implementations to
> follow and update, in which case Hadoop also needs a tool to quickly assess
> whether these features are supported by a specific HCFS implementation.
> Besides, the well-known hadoop command line tool or hdfs shell is used to directly
> interact with an HCFS storage system, where most commands correspond to
> specific HCFS interfaces and work well. Still, there are cases that are
> complicated and may not work, like the expunge command. To check such commands
> for an HCFS, we also need an approach to figure them out.
> {*}Proposal:{*} Accordingly, we propose to define a formal HCFS compatibility
> benchmark and provide a corresponding tool to do the compatibility assessment
> for an HCFS storage system. The benchmark and tool should consider both HCFS
> interfaces and hdfs shell commands. Different scenarios require different
> kinds of compatibility. To account for this, we could define different
> suites in the benchmark.
> *Benefits:* We intend the benchmark and tool to be useful for both storage
> providers and storage users. For end users, it can be used to evaluate the
> compatibility level and determine if the storage system in question is
> suitable for the required scenarios. For storage providers, it helps to
> quickly generate an objective and reliable report about core functions of
> the storage service. For instance, if the HCFS got a 100% on a suite named
> 'tpcds', it is demonstrated that all functions needed by a tpcds program have
> been well achieved. It is also a guide indicating how storage service
> abilities can map to
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819869#comment-17819869 ]

Kai Zheng commented on HADOOP-19085:

Just had a quick glance at the code. [~han.liu] Let's go with the package "org.apache.hadoop.fs.compat" instead of "org.apache.hadoop.compat" for the new code, since the compat tool focuses on the fs aspect. "Hadoop" means lots of things. Thanks.

> Compatibility Benchmark over HCFS Implementations
> -
>
> Key: HADOOP-19085
> URL: https://issues.apache.org/jira/browse/HADOOP-19085
> Project: Hadoop Common
> Issue Type: New Feature
> Reporter: Han Liu
> Assignee: Han Liu
> Priority: Major
> Labels: pull-request-available
> Attachments: HDFS Compatibility Benchmark Design.pdf
>
>
> {*}Background:{*} Hadoop-Compatible File System (HCFS) is a core concept in
> the big data storage ecosystem, providing unified interfaces and generally clear
> semantics, and has become the de facto standard for industry storage systems
> to follow and conform to. There have been a series of HCFS implementations
> in Hadoop, such as S3AFileSystem for Amazon's S3 Object Store, WASB for
> Microsoft's Azure Blob Storage and the OSS connector for Alibaba Cloud Object
> Storage, and more from storage service providers on their own.
> {*}Problems:{*} However, as indicated by introduction.md, there is no formal
> suite to do compatibility assessment of a file system for all such HCFS
> implementations. Thus, whether the functionality is well accomplished and
> meets the core compatibility expectations mainly relies on the service provider's
> own report. Meanwhile, Hadoop is also developing, and new features are
> continuously being contributed to the HCFS interfaces for existing implementations to
> follow and update, in which case Hadoop also needs a tool to quickly assess
> whether these features are supported by a specific HCFS implementation.
> Besides, the well-known hadoop command line tool or hdfs shell is used to directly
> interact with an HCFS storage system, where most commands correspond to
> specific HCFS interfaces and work well. Still, there are cases that are
> complicated and may not work, like the expunge command. To check such commands
> for an HCFS, we also need an approach to figure them out.
> {*}Proposal:{*} Accordingly, we propose to define a formal HCFS compatibility
> benchmark and provide a corresponding tool to do the compatibility assessment
> for an HCFS storage system. The benchmark and tool should consider both HCFS
> interfaces and hdfs shell commands. Different scenarios require different
> kinds of compatibility. To account for this, we could define different
> suites in the benchmark.
> *Benefits:* We intend the benchmark and tool to be useful for both storage
> providers and storage users. For end users, it can be used to evaluate the
> compatibility level and determine if the storage system in question is
> suitable for the required scenarios. For storage providers, it helps to
> quickly generate an objective and reliable report about core functions of
> the storage service. For instance, if the HCFS got a 100% on a suite named
> 'tpcds', it is demonstrated that all functions needed by a tpcds program have
> been well achieved. It is also a guide indicating how storage service
> abilities can map to HCFS interfaces, such as storage class on S3.
> Any thoughts? Comments and feedback are most welcome. Thanks in advance.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819866#comment-17819866 ] Kai Zheng commented on HADOOP-19085: Got it. Thanks Steve for the refresh, :).
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819620#comment-17819620 ] Steve Loughran commented on HADOOP-19085: - [~drankye] done. you have to go to project settings and assign the user to the group "contributors 1".
[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations
[ https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819454#comment-17819454 ] Kai Zheng commented on HADOOP-19085: Hi [~han.liu], since you're working on this, would you take it? [~ste...@apache.org] could you help assign, thanks! I failed because I can't see his name.