> What is libdirent? How is it relevant in this context? 

Since version 3.3, libhdfs depends on the dirent.h API. However, MSVC does not 
provide this header, which breaks libhdfs builds on Windows. To solve this 
problem, hdfs-sys uses libdirent, an MSVC port of the dirent.h API for 
Windows.

Fortunately, hdfs has already done similar work in 
[native/libhdfspp/lib/x-platform]. If libhdfs-rust is accepted, we can migrate 
to hdfs's own implementation instead.

> How tightly coupled is it to a specific Hadoop version?

Thanks to hdfs's stable API, there is no breakage between different Hadoop 
versions (only additions). So the version matrix looks like this:

- libhdfs-rust (feature flag: v2_2) can access Hadoop v2.2 ~ v3.3
...
- libhdfs-rust (feature flag: v2_10) can access Hadoop v2.10 ~ v3.3
...
- libhdfs-rust (feature flag: v3_3) can access Hadoop v3.3
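
To make the rule behind this matrix concrete, here is a minimal sketch. It is not code from hdfs-sys: the `Version` type and the `min_version` and `can_access` helpers are hypothetical names used only for illustration, and it assumes (as stated above) that the libhdfs API only ever grows. It does not model an upper bound, since v3.3 is simply the newest version in the matrix.

```rust
/// Hadoop version as (major, minor). Illustrative type, not part of hdfs-sys.
type Version = (u32, u32);

/// Hypothetical helper: parse a feature flag shaped like "v2_10" into the
/// minimum Hadoop version it targets, here (2, 10).
fn min_version(feature: &str) -> Option<Version> {
    let rest = feature.strip_prefix('v')?;
    let (major, minor) = rest.split_once('_')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// Bindings built with `feature` can talk to any cluster whose version is at
/// or above the feature's version, assuming the API is additive only.
fn can_access(feature: &str, cluster: Version) -> bool {
    match min_version(feature) {
        // Tuples compare lexicographically: major first, then minor.
        Some(min) => cluster >= min,
        None => false,
    }
}
```

For example, `can_access("v2_10", (3, 3))` is true, while `can_access("v3_3", (2, 10))` is false because the v3_3 bindings may reference symbols that an older libhdfs does not export.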

> The concern I have as a release manager is that it makes my life harder to 
> ensure the quality of a language binding that I am not familiar with.

Most of the code in libhdfs-rust is generated by [rust-bindgen], a tool 
developed by the Rust Team to automatically generate Rust FFI bindings for C 
(and some C++) libraries. The rest deals with building and linking, much 
like a Makefile: for example, finding libjvm and libhdfs.
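
As a rough idea of what the build-and-link part amounts to, here is a simplified sketch using only the standard library. It is not the actual hdfs-sys build script: `jvm_search_dirs` and `link_directives` are hypothetical helpers, and the two JAVA_HOME layouts are common Linux ones rather than an exhaustive list. The `cargo:rustc-link-search` and `cargo:rustc-link-lib` lines are the standard directives a Cargo build script prints to stdout.

```rust
use std::path::{Path, PathBuf};

/// Candidate directories that may contain libjvm under a given JAVA_HOME.
/// Assumption: only two common Linux layouts are covered here.
fn jvm_search_dirs(java_home: &Path) -> Vec<PathBuf> {
    vec![
        // Layout used by JDK 9 and later.
        java_home.join("lib").join("server"),
        // Layout used by JDK 8 on x86_64 Linux.
        java_home.join("jre").join("lib").join("amd64").join("server"),
    ]
}

/// Lines a build script would print so that Cargo links the crate against
/// libjvm and libhdfs. `hdfs_lib_dir` is wherever libhdfs was found.
fn link_directives(java_home: &Path, hdfs_lib_dir: &Path) -> Vec<String> {
    let mut out = Vec::new();
    for dir in jvm_search_dirs(java_home) {
        out.push(format!("cargo:rustc-link-search=native={}", dir.display()));
    }
    out.push(format!(
        "cargo:rustc-link-search=native={}",
        hdfs_lib_dir.display()
    ));
    out.push("cargo:rustc-link-lib=jvm".to_string());
    out.push("cargo:rustc-link-lib=hdfs".to_string());
    out
}
```

A build script would print each returned line with `println!`; everything else in the crate is the bindgen output.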

In general, the task that libhdfs-rust performs is simple: it provides an API 
to Rust and links it with libhdfs.so, which I believe is easy to test.

[libdirent]: https://github.com/tronkko/dirent
[native/libhdfspp/lib/x-platform]: 
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform/dirent.h
[rust-bindgen]: https://github.com/rust-lang/rust-bindgen


On Tue, Jul 18, 2023, at 00:14, Wei-Chiu Chuang wrote:
> Inline
> 
> On Sat, Jul 15, 2023 at 5:04 AM Ayush Saxena <ayush...@gmail.com> wrote:
>> Forwarding from dev@hadoop to relevant ML
>> 
>> Original mail: 
>> https://lists.apache.org/thread/r5rcmc7lwwvkysj0320myxltsyokp9kq
>> 
>> -Ayush
>> 
>> On 2023/07/15 09:18:42 Xuanwo wrote:
>> > Hello, everyone.
>> >
>> > I'm the maintainer of [hdfs-sys], a binding to the HDFS native C API for Rust. 
>> > Would it be a good idea to accept hdfs-sys as part of the Hadoop 
>> > project?
>> >
>> > Users of hdfs-sys for now:
>> >
>> > - [OpenDAL]: An Apache Incubator project that allows users to easily and 
>> > efficiently retrieve data from various storage services in a unified way.
>> > - [Databend]: A modern cloud data warehouse focusing on reducing cost and 
>> > complexity for your massive-scale analytics needs. (via OpenDAL)
>> > - [RisingWave]: The distributed streaming database: SQL stream processing 
>> > with Postgres-like experience. (via OpenDAL)
>> > - [LakeSoul]: an end-to-end, realtime and cloud native Lakehouse framework
>> >
>> > License information for hdfs-sys:
>> >
>> > - hdfs-sys itself is licensed under Apache-2.0
>> > - hdfs-sys only depends on the following libs: cc@1.0.73, glob@0.3.1, 
>> > hdfs-sys@0.3.0, java-locator@0.1.5, lazy_static@1.4.0; they are all dual 
>> > licensed under Apache-2.0 and MIT. 
>> >
>> > Work needed if accepted:
>> >
>> > - Replace libdirent with the same dirent API implemented in HDFS project.
>> > - Remove all bundled hdfs C code.
> What is libdirent? How is it relevant in this context? 
> 
> How tightly coupled is it to a specific Hadoop version? I am wondering if 
> it's possible to host it in a separate Hadoop repo, if it's accepted. The 
> concern I have as a release manager is that it makes my life harder to ensure 
> the quality of a language binding that I am not familiar with.
>> >
>> > [hdfs-sys]: https://github.com/Xuanwo/hdfs-sys
>> > [OpenDAL]: https://github.com/apache/incubator-opendal
>> > [Databend]: https://github.com/datafuselabs/databend
>> > [RisingWave]: https://github.com/risingwavelabs/risingwave
>> > [LakeSoul]: https://github.com/lakesoul-io/LakeSoul
>> >
>> > Xuanwo
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@hadoop.apache.org
>> > For additional commands, e-mail: dev-h...@hadoop.apache.org
>> >
>> >

Xuanwo
