Thanks Xuanwo for your work. I believe it is valuable for enlarging the Hadoop
ecosystem.

I am also concerned that it will involve more hard work on releases and
version matching, especially for those who are not familiar with C or Rust.
Moreover, I am not aware of the difference between `accept hdfs-sys as part of
hadoop project` and `one separate project`.

I think one smooth solution, if it is accepted, is to follow
hadoop-thirdparty[1], which is a Hadoop sub-project but split into a separate
repo, release line, etc.

cc @Ayush Saxena <ayush...@gmail.com> @Wei-Chiu Chuang <weic...@apache.org>
@Iñigo Goiri <elgo...@gmail.com> @Shilun Fan <slfan1...@foxmail.com> and other
folks, what do you think? Thanks.

Best Regards,
- He Xiaoqiao

[1] https://github.com/apache/hadoop-thirdparty

On Wed, Dec 20, 2023 at 6:17 PM Xuanwo <xua...@apache.org> wrote:

> I'm fine to start work under a new repo, and I'm willing to help maintain
> this repo. The repo could be named hadoop-libhdfs-rust or just
> libhdfs-rust.
>
> I'm a PPMC member of other ASF projects, so I know how to do releases and how
> to make sure the licenses fit the requirements. I'm willing to become the
> RM until we find more committers for this sub-project.
>
> I'm currently looking for committers willing to help me review PRs and
> validate my releases. Is there anyone interested in sponsoring me?
>
> On Tue, Jul 18, 2023, at 12:45, Xuanwo wrote:
> > > What is libdirent? How is it relevant in this context?
> >
> > Since version 3.3, libhdfs depends on the dirent.h API. However, MSVC
> does not provide this header, which causes issues when building libhdfs on
> Windows platforms. To solve this problem, hdfs-sys uses libdirent, an MSVC
> port of the dirent.h API for Windows.
> >
> > Fortunately, hdfs has already done similar work in
> [native/libhdfspp/lib/x-platform]. If libhdfs-rust is accepted, we can
> migrate to use hdfs's own implementation instead.
> >
> > > How tightly coupled is it to a specific Hadoop version?
> >
> > Thanks to hdfs's stable API, there is no breakage between different
> hadoop versions (only additions). So the version matrix will look like:
> >
> > - libhdfs-rust (feature flag: v2_2) can access hadoop v2.2 ~ v3.3
> > ...
> > - libhdfs-rust (feature flag: v2_10) can access hadoop v2.10 ~ v3.3
> > ...
> > - libhdfs-rust (feature flag: v3_3) can access hadoop v3.3
> >
> > > The concern I have as a release manager is that it makes my life
> harder to ensure the quality of a language binding that I am not familiar
> with.
> >
> > Most of the code in libhdfs-rust is generated by [rust-bindgen], a tool
> developed by the Rust Team to automatically generate Rust FFI bindings for
> C (and some C++) libraries. Other parts are related to building and
> linking, similar to a Makefile, such as finding libjvm and libhdfs.
> >
> > In general, the task that libhdfs-rust performs is simple: it provides
> an API to Rust and links it with libhdfs.so, which I believe is easy to
> test.
> >
> > [libdirent]: https://github.com/tronkko/dirent
> > [native/libhdfspp/lib/x-platform]:
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform/dirent.h
> > [rust-bindgen]: https://github.com/rust-lang/rust-bindgen
> >
> >
> > On Tue, Jul 18, 2023, at 00:14, Wei-Chiu Chuang wrote:
> >> Inline
> >>
> >> On Sat, Jul 15, 2023 at 5:04 AM Ayush Saxena <ayush...@gmail.com>
> wrote:
> >>> Forwarding from dev@hadoop to relevant ML
> >>>
> >>> Original mail:
> https://lists.apache.org/thread/r5rcmc7lwwvkysj0320myxltsyokp9kq
> >>>
> >>> -Ayush
> >>>
> >>> On 2023/07/15 09:18:42 Xuanwo wrote:
> >>> > Hello, everyone.
> >>> >
> >>> > I'm the maintainer of [hdfs-sys], a binding to the HDFS native C API for
> Rust. Would it be a good idea to accept hdfs-sys as a part of the
> hadoop project?
> >>> >
> >>> > Users of hdfs-sys for now:
> >>> >
> >>> > - [OpenDAL]: An Apache Incubator project that allows users to easily
> and efficiently retrieve data from various storage services in a unified
> way.
> >>> > - [Databend]: A modern cloud data warehouse focusing on reducing
> cost and complexity for your massive-scale analytics needs. (via OpenDAL)
> >>> > - [RisingWave]: The distributed streaming database: SQL stream
> processing with Postgres-like experience. (via OpenDAL)
> >>> > - [LakeSoul]: an end-to-end, realtime and cloud native Lakehouse
> framework
> >>> >
> >>> > License information for hdfs-sys:
> >>> >
> >>> > - hdfs-sys itself is licensed under Apache-2.0
> >>> > - hdfs-sys only depends on the following libs: cc@1.0.73, glob@0.3.1,
> hdfs-sys@0.3.0, java-locator@0.1.5, lazy_static@1.4.0, all of which are dual
> licensed under Apache-2.0 and MIT.
> >>> >
> >>> > Work needed if accepted:
> >>> >
> >>> > - Replace libdirent with the same dirent API implemented in HDFS
> project.
> >>> > - Remove all bundled hdfs C code.
> >> What is libdirent? How is it relevant in this context?
> >>
> >> How tightly coupled is it to a specific Hadoop version? I am wondering
> if it's possible to host it in a separate Hadoop repo, if it's accepted.
> The concern I have as a release manager is that it makes my life harder to
> ensure the quality of a language binding that I am not familiar with.
> >>> >
> >>> > [hdfs-sys]: https://github.com/Xuanwo/hdfs-sys
> >>> > [OpenDAL]: https://github.com/apache/incubator-opendal
> >>> > [Databend]: https://github.com/datafuselabs/databend
> >>> > [RisingWave]: https://github.com/risingwavelabs/risingwave
> >>> > [LakeSoul]: https://github.com/lakesoul-io/LakeSoul
> >>> >
> >>> > Xuanwo
> >>> >
> >>> > ---------------------------------------------------------------------
> >>> > To unsubscribe, e-mail: dev-unsubscr...@hadoop.apache.org
> >>> > For additional commands, e-mail: dev-h...@hadoop.apache.org
> >>> >
> >>> >
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
> > Xuanwo
> >
>
> Xuanwo
>
