Hello all,
I was thinking that a filesystem with S3 support would be great to have
in the Python SDK. If I am not mistaken, it would simply involve implementing
the filesystem classes
<https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filesystem.py>
for S3, right?
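
A rough sketch of the path-handling part such a filesystem would need. This is only an assumption about the shape of the work: `S3PathHelper` is a hypothetical name, it does not subclass the real Beam `FileSystem` base class (so it runs without Beam installed), and it talks to no AWS service. A real implementation would also need the I/O methods (create, open, copy, rename, delete and so on) backed by an AWS client.

```python
# Hypothetical sketch, not Beam's actual API: only the URL handling
# that an S3-backed filesystem would need, with no network access.

S3_PREFIX = "s3://"


class S3PathHelper(object):
    """Illustrative stand-in for the path methods of an S3 filesystem."""

    @classmethod
    def scheme(cls):
        # URI scheme this filesystem would be registered under.
        return "s3"

    def _check(self, path):
        if not path.startswith(S3_PREFIX):
            raise ValueError("Path %r must start with %s" % (path, S3_PREFIX))

    def join(self, basepath, *paths):
        # Join components with exactly one "/" between them.
        self._check(basepath)
        result = basepath
        for part in paths:
            result = result.rstrip("/") + "/" + part.lstrip("/")
        return result

    def split(self, path):
        # Split into (parent, last component); a bare bucket has no tail.
        self._check(path)
        rest = path[len(S3_PREFIX):]
        head, sep, tail = rest.rpartition("/")
        if not sep:
            return (path, "")
        return (S3_PREFIX + head, tail)
```

Usage would look like `S3PathHelper().join("s3://bucket/dir", "file.txt")`, with the bucket and key names here being placeholders.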

I am not familiar enough with S3, filesystems, or AWS in general, so I
have some outstanding questions:

   - Does this mean that we probably would need an extra [s3] target for
   installing apache_beam, like we do with [gcp]?
      - Not strictly necessary, but probably desirable...
   - How do we handle KMS in the GCS filesystem?
   - Would the filesystem encapsulation make KMS support in an S3
   filesystem difficult?
   - Or even more... is KMS support in AWS very different from that in GCP?
      - I'd love comments from anyone informed about this :)
   - Is this project of an appropriate size for a GSoC student?
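
On the first question above, an extra install target is normally declared through the `extras_require` argument of setuptools. The fragment below is a sketch under assumptions: the `s3` key mirrors the name proposed above, and `boto3` is only a guess at which AWS client library such a filesystem would depend on.

```python
# Hypothetical fragment of a setup.py; the dependency list is an assumption.
EXTRAS_REQUIRE = {
    # Would enable: pip install apache-beam[s3]
    "s3": ["boto3"],
}

# In setup.py this dict would be passed as:
#   setuptools.setup(..., extras_require=EXTRAS_REQUIRE)
```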

Thoughts?
Best,
-P.