There is a Java NIO SPI for S3 that might meet your needs.
https://github.com/awslabs/aws-java-nio-spi-for-s3

Full disclosure: I am the main author of this project.
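
As a rough sketch of how this could look from plain java.nio code:
the helper below resolves a URI to a Path and reads all of its bytes.
It assumes the SPI jar is on the classpath so that the JDK can find a
provider for the "s3" scheme; the class name NioUriRead and the bucket
and key in the usage note are placeholders. Check the project README
for the supported operations and credential setup.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, named only for this illustration.
class NioUriRead {

    // Reads every byte behind the given URI through java.nio.
    // Paths.get(URI) picks the installed FileSystemProvider whose
    // scheme matches the URI, so "file:" works out of the box and
    // "s3:" is assumed to work once an S3 provider is installed.
    static byte[] readAll(URI uri) throws IOException {
        Path path = Paths.get(uri);
        return Files.readAllBytes(path);
    }
}
```

With a provider for "s3" installed, a call such as
NioUriRead.readAll(URI.create("s3://my-bucket/data.parquet")) would be
expected to return the object's contents. This only fetches raw bytes;
decoding them as Parquet or Arrow data is a separate step.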


On Tue, Apr 4, 2023 at 12:30 AM arjun kashyap <[email protected]>
wrote:

> Hey all
>
> I'm working with Apache Arrow in Java and I want to know whether the
> Java library provides a native S3 filesystem implementation like the
> S3FileSystem in the Python implementation of Arrow (pyarrow). I have
> gone through the Arrow Java IPC documentation and I do not see any
> such implementation there.
>
> In Python, using pyarrow, one can read a table from S3 like this:
>
> import pyarrow.parquet as pq
> from pyarrow import fs
>
> # using a URI -> filesystem is inferred
> pq.read_table("s3://my-bucket/data.parquet")
> # using a path and filesystem
> s3 = fs.S3FileSystem(..)
> pq.read_table("my-bucket/data.parquet", filesystem=s3)
>
> I want to know if similar functionalities are implemented for Google
> Cloud Storage File System (GcsFileSystem) and Hadoop Distributed File
> System (HDFS) as well.
>
> If there is no native implementation available in Java, is there any
> upcoming or beta release planned to provide these functionalities in
> Java?
>
> Thanks & Regards
> ~ Arjun
>
