Has anyone started looking into how to read data sets from S3? I have begun
exploring it and wondered if anyone already has a design in mind.

We could implement an S3FileSystem class in pyarrow/filesystem.py. The
filesystem components could probably be written against the AWS Python SDK.
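
To make the idea concrete, here is a rough sketch of what I have in mind. It
is only an illustration, not an existing pyarrow class: the S3FileSystem name,
its ls/open methods, and the optional client argument are all assumptions on
my part. The only SDK calls used are boto3's list_objects_v2 paginator and
get_object.

```python
import io


class S3FileSystem:
    """Hypothetical sketch of an S3-backed filesystem (not a pyarrow API)."""

    def __init__(self, client=None):
        # Accepting a client makes the class easy to test with a stub.
        if client is None:
            import boto3  # AWS Python SDK, imported lazily
            client = boto3.client("s3")
        self._client = client

    @staticmethod
    def _split(path):
        # "s3://bucket/some/key" or "bucket/some/key" -> ("bucket", "some/key")
        if path.startswith("s3://"):
            path = path[len("s3://"):]
        bucket, _, key = path.partition("/")
        return bucket, key

    def ls(self, path):
        """Return 'bucket/key' for every object under the given prefix."""
        bucket, prefix = self._split(path)
        paginator = self._client.get_paginator("list_objects_v2")
        keys = []
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get("Contents", []):
                keys.append(f"{bucket}/{obj['Key']}")
        return keys

    def open(self, path, mode="rb"):
        """Download the whole object and return it as an in-memory file."""
        if mode != "rb":
            raise ValueError("this sketch only supports mode='rb'")
        bucket, key = self._split(path)
        response = self._client.get_object(Bucket=bucket, Key=key)
        return io.BytesIO(response["Body"].read())
```

This version reads each object fully into memory, which is simple but
probably wasteful for large files; a real implementation would likely want
ranged reads instead.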

The HDFS file system and file classes, however, are implemented at least
partially in Cython & C++. Is there an advantage to doing that for S3 too?

Thanks,

Kevin

----
Kevin Moore
CEO, Quilt Data, Inc.
ke...@quiltdata.io | LinkedIn <https://www.linkedin.com/in/kevinemoore/>
(415) 497-7895


Data packages for fast, reproducible data science
quiltdata.com
