There is a Java NIO SPI for S3 that might meet your needs:
https://github.com/awslabs/aws-java-nio-spi-for-s3
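
As a rough sketch of what that looks like in practice: once a provider for a URI scheme is installed, the standard java.nio.file calls resolve the URI through it, so the reading code does not need to know about S3. The class and method names below (NioUriRead, open, readAll) are made up for illustration. The sketch uses only the JDK and demonstrates itself on a local file: URI. Passing an s3:// URI assumes the project's jar is on the classpath and AWS credentials are configured, and that the provider's channel reports the object size.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NioUriRead {

    // Resolve the URI through whichever installed FileSystemProvider
    // handles its scheme (file: is built in; s3: is an assumption that
    // requires the S3 provider jar on the classpath).
    static SeekableByteChannel open(URI uri) throws IOException {
        Path path = Paths.get(uri);
        return Files.newByteChannel(path, StandardOpenOption.READ);
    }

    // Read the whole object into memory. Fine for small files only.
    static byte[] readAll(URI uri) throws IOException {
        try (SeekableByteChannel channel = open(uri)) {
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            // Keep reading until the buffer is full or the channel hits EOF.
            while (buffer.hasRemaining() && channel.read(buffer) >= 0) {
                // read() advances the buffer position; nothing else to do
            }
            return buffer.array();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // e.g. s3://my-bucket/data.parquet, if the S3 provider is installed
            System.out.println(readAll(URI.create(args[0])).length + " bytes");
            return;
        }
        // No argument given: demonstrate on a temporary local file.
        Path tmp = Files.createTempFile("nio-uri-read", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3});
            System.out.println(readAll(tmp.toUri()).length + " bytes");
        } finally {
            Files.delete(tmp);
        }
    }
}
```

A SeekableByteChannel is also the input type that Arrow's ArrowFileReader takes, so a channel opened this way could in principle be handed to it for Arrow IPC files. Parquet files like the one in your example would need a Parquet reader on top, which this sketch does not cover.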

Full disclosure, I am the main author of this project.

On Tue, Apr 4, 2023 at 12:30 AM arjun kashyap <[email protected]> wrote:

> Hey all
>
> I'm working with Apache Arrow in Java and I want to know if there is
> an implementation in the java library that provides a native S3
> filesystem implementation like the one provided in the Python
> implementation of Arrow (pyarrow) which uses the S3FileSystem. I have
> gone through the Arrow Java IPC documentation and I do not see any
> such implementation there.
>
> In Python, using pyarrow, one can read a table from S3 like this:
>
> import pyarrow.parquet as pq
>
> # using a URI -> filesystem is inferred
> pq.read_table("s3://my-bucket/data.parquet")
> # using a path and filesystem
> s3 = fs.S3FileSystem(..)
> pq.read_table("my-bucket/data.parquet", filesystem=s3)
>
> I want to know if similar functionalities are implemented for Google
> Cloud Storage File System (GcsFileSystem) and Hadoop Distributed File
> System (HDFS) as well.
>
> If there is no native implementation available in Java, is there any
> upcoming or beta release planned to provide these functionalities in
> Java?
>
> Thanks & Regards
> ~ Arjun
>
