I noticed there are two similar threads. The other one is https://lists.apache.org/thread/ro9t6xh9lprcvm0x8ndyshhk91qbx3po in hdfs-dev. Shall we stick to this thread for further discussions?
Jie Yang

On 2026/06/08 08:02:48 Cheng Pan wrote:
> Hi Dongdong Yang,
>
> I have some concerns about your action on PR review comments - none of the
> inline comments get responses, even after I explicitly ask you to do that.
>
> Most of the inline comments are change requests. According to your commit
> list, you tend to use AI to make changes to address the comment, AI tends to
> cater to users in most cases, but are all the change requests correct or
> necessary? I’d like to see some tech debate/discussion there.
>
> A few of the inline comments are questions and discussion, they were
> completely ignored.
>
> Thanks,
> Cheng Pan
>
> > On Jun 8, 2026, at 15:06, Yang,Dongdong(ACG CCN) <[email protected]> wrote:
> >
> > Hi everyone,
> >
> > Thank you Shilun for driving this discussion, and thank you Xiaoqiao for
> > the review and support.
> >
> > Current Status:
> > The PR (#8347) is actively being reviewed. We have addressed feedback
> > from LuciferYang and pan3793, including code quality improvements,
> > documentation fixes, and GHA workflow integration. CI is passing.
> >
> > Known Limitations:
> > - No append support
> > - No hflush/hsync (calls degrade to no-op; data is persisted on close)
> > - No concat or truncate
> > - No symbolic links or extended attributes
> >
> > Follow-up Plan:
> > 1. Ensure CI remains green after rebase on latest trunk
> > 2. Once community consensus is reached, merge it
> > 3. Continue iterating on improvements in subsequent PRs, such as known
> >    limitations
> > 4. Upon discovering insufficient performance, we run benchmarks (such as
> >    TPCDS and NNBench) on HDFS and connector, and optimize them
> >
> > We welcome any questions, concerns, or suggestions from the dev team.
> >
> > Best regards,
> > Dongdong Yang.
> >
> > On 2026/05/26 06:40:48 Xiaoqiao He wrote:
> >> Thanks Shilun for driving this progress.
> >> +1 from my side,
> >> a. From the PR (https://github.com/apache/hadoop/pull/8347), the code has
> >>    been ready now.
> >> b. Both of the contributors are PMC members or committers from mature
> >>    community of apache.
> >> I would like to hear more sound from the dev team about the following
> >> plan. Good Luck!
> >>
> >> Best Regards,
> >> - He Xiaoqiao
> >>
> >> On Fri, May 22, 2026 at 9:33 PM slfan1989 <[email protected]> wrote:
> >>
> >>> Hi Hadoop community,
> >>>
> >>> I would like to start a discussion about adding Baidu Cloud BOS
> >>> (Baidu Object Storage) as a native Hadoop-compatible filesystem connector.
> >>>
> >>> JIRA: https://issues.apache.org/jira/browse/HDFS-11161
> >>> PR: https://github.com/apache/hadoop/pull/8347
> >>> CI Status: +1 overall, all checks passed.
> >>>
> >>> I have had some offline discussions with LuciferYang and the contributors
> >>> working on this connector. Based on those discussions, I am helping bring
> >>> this proposal to the Hadoop community for broader review and feedback.
> >>>
> >>> The goal is to integrate BOS support as a native Hadoop filesystem module,
> >>> similar to the existing hadoop-aws (S3A), hadoop-aliyun, and hadoop-cos
> >>> connectors.
> >>>
> >>> 1. Background
> >>>
> >>> Baidu Cloud is one of the major cloud service providers in China. BOS
> >>> (Baidu Object Storage) is Baidu's core object storage service and is
> >>> widely used for big data analytics, machine learning, and data lake
> >>> workloads.
> >>>
> >>> A native Hadoop connector would allow Hadoop ecosystem projects, including
> >>> MapReduce, Spark, Hive, Flink, and others, to access BOS storage directly
> >>> through the bos:// scheme.
> >>>
> >>> According to the contributors, this connector has been running in
> >>> production at Baidu for around 8 years, serving both BOS users and Baidu
> >>> MapReduce (BMR) workloads.
> >>>
> >>> 2. Implementation
> >>>
> >>> The proposed module is placed under:
> >>>
> >>> hadoop-cloud-storage-project/hadoop-bos
> >>>
> >>> This follows the structure of the existing cloud storage connectors.
> >>>
> >>> The implementation includes:
> >>>
> >>> - A full Hadoop FileSystem implementation with the bos:// URI scheme
> >>> - Pluggable credentials provider support
> >>> - Contract tests covering standard filesystem operations
> >>> - Dependency shading or exclusion to avoid classpath conflicts, with
> >>>   shaded dependencies placed under org.apache.hadoop.fs.bos.shaded.*
> >>>
> >>> 3. Long-term Maintenance
> >>>
> >>> The following contributors have expressed commitment to maintaining this
> >>> module:
> >>>
> >>> - yangdong2398, BOS R&D
> >>> - LuciferYang, Apache Spark PMC
> >>> - jackylee-ch, Apache Gluten PMC
> >>> - houzhizhen, Apache HugeGraph committer
> >>> - summaryzb, Apache Uniffle committer
> >>>
> >>> They have committed to:
> >>>
> >>> - Responding to issues and PRs within one week
> >>> - Keeping dependencies up to date
> >>> - Adapting the connector to future Hadoop API changes
> >>>
> >>> 4. Why Consider Integrating This into Hadoop
> >>>
> >>> This proposal follows a similar rationale to hadoop-aws (S3A),
> >>> hadoop-aliyun, and hadoop-cos:
> >>>
> >>> - Users can rely on a single, consistent Hadoop distribution without
> >>>   managing separate connector JARs and version compatibility manually
> >>> - A connector maintained within the Hadoop community is easier for users
> >>>   to trust and review
> >>> - Shared CI helps ensure ongoing compatibility with Hadoop trunk
> >>>
> >>> I would like to invite feedback from the community on whether this
> >>> connector is appropriate to include in Hadoop, and what additional work,
> >>> review, or requirements would be needed before it can be accepted.
> >>>
> >>> The contributors are copied / expected to participate in this discussion
> >>> and can provide more details about the implementation, production usage,
> >>> and maintenance plan.
> >>>
> >>> Best regards,
> >>> Shilun Fan.
