Hey all,

I'm working with Apache Arrow in Java and would like to know whether
the Java library provides a native S3 filesystem implementation, like
the S3FileSystem available in the Python library (pyarrow). I have
gone through the Arrow Java IPC documentation and do not see any
such implementation there.

In Python, using pyarrow, one can read a table from S3 like this:

import pyarrow.parquet as pq
from pyarrow import fs

# using a URI -> filesystem is inferred
pq.read_table("s3://my-bucket/data.parquet")
# using a path and filesystem
s3 = fs.S3FileSystem(..)
pq.read_table("my-bucket/data.parquet", filesystem=s3)

I would also like to know whether similar functionality is implemented
in Java for Google Cloud Storage (GcsFileSystem) and the Hadoop
Distributed File System (HDFS).

If there is no native implementation available in Java, is there an
upcoming or beta release planned that will provide this functionality?

Thanks & Regards
~ Arjun
