Re: Support AWS SDK V2 for Flink's S3 FileSystem

David Morávek Mon, 02 Oct 2023 14:12:19 -0700

Hi Maomao,

I wonder whether it would make sense to take a stab at consolidating the S3
filesystems instead and introducing a native one. The whole Hadoop wrapper
around the S3 client exists for legacy reasons, and it adds complexity and
probably an unnecessary performance penalty.

If you take a look at the underlying Presto implementation, it's actually
not too complex to adapt to Flink interfaces (since you're proposing to
maintain a copy of it anyway).

Overall, the S3 FS is probably our most used filesystem, so this could
have a rather high impact. It would also eliminate user confusion about
which implementation to use.

WDYT?

Best,
D.

On Fri, Sep 29, 2023 at 2:41 PM Min, Maomao <mimao...@amazon.com.invalid>
wrote:

> Hi Flink Dev,
>
> I’m Maomao, a developer from AWS EMR.
>
> Recently, our team has been working on adding AWS SDK V2 support for
> Flink’s S3 Filesystem. During development, we found that our work was
> blocked by Presto, because Presto still uses AWS SDK V1 and won’t add
> support for AWS SDK V2 in the short term. To unblock this, our team
> proposed several options, and I’ve created a JIRA issue here:
> <https://issues.apache.org/jira/browse/FLINK-33157>.
>
> Since our team plans to contribute this work back to the community later,
> we’d like to collect feedback from the community on the long-term options
> we proposed, so that the community won’t need to duplicate this work in
> the future.
>
> Best,
> Maomao
>
>
