Hi Samrat,

Thank you for putting a very detailed FLIP together!

I have a few suggestions to strengthen the proposal:
1. Could we create a "Public interfaces" section? At the moment, the
proposed interfaces are scattered across multiple parts of the doc,
which makes it harder to get a sense of the overall direction.
2. The current PoC implementation contains more configuration options
than are outlined in the FLIP. I understand that this part will keep
evolving, but it would be good to have a general review of the public
contracts as part of the FLIP.
3. Could we call out our testing strategy on the path to production
readiness? Will we mark the configurations that enable this feature
@Experimental? What would be our acceptance criteria for considering it
production-ready?
4. I assume we imply full state compatibility during migration through
"load with the legacy Hadoop FS, then write using the new FS". Should
we expand on the migration strategy to ensure that we have a clear path
forward? For example, would migration involve setting up both schemes
(s3a with the legacy FS as the recovery path + s3 with the new FS as
the checkpoint path) and packaging both implementations in the
`plugins` directory to perform the transition? A sketch of what I have
in mind follows this list.
5. CRT support is called out in the FLIP, but it doesn't seem to be
part of the PoC implementation. Are we going to add it as a follow-up?
6. It looks like the PoC implementation already supports server-side
encryption with SSE-KMS, so it would be great to call this out in the
FLIP (see the second sketch below). At a glance, adding support for
other SSE approaches (like SSE-C and client-side encryption) does not
look as straightforward in the PoC implementation as SSE-KMS was. Is it
worth considering those as a child FLIP on the path to prod migration?
7. This FLIP suggests that we want to replace Flink's dependency on
Hadoop/Presto. Are we considering an "uber" FLIP covering
implementations for Azure/GCP as well?
8. The FLIP suggests that we can significantly decrease the packaged
JAR size. Could we provide guidance on the size of the packaged
SDK-native FS with shaded dependencies, to strengthen this selling
point?
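
On point 4, here is the kind of transitional setup I have in mind; the
bucket name, the plugin folder name, and the flink-s3-fs-native
artifact name are assumptions on my side:

    # flink-conf.yaml during migration: both URI schemes in use
    state.checkpoints.dir: s3://my-bucket/checkpoints   # new native FS
    state.savepoints.dir: s3a://my-bucket/savepoints    # legacy Hadoop FS as recovery path

    # plugins/ directory packaging both implementations side by side
    plugins/s3-fs-hadoop/flink-s3-fs-hadoop-<version>.jar
    plugins/s3-fs-native/flink-s3-fs-native-<version>.jar

One thing worth clarifying in the FLIP would be how scheme registration
conflicts are avoided in this setup, since flink-s3-fs-hadoop currently
registers both s3:// and s3a://.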

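On point 6, for reference, with the AWS SDK v2 SSE-KMS is just a pair
of per-request parameters, which is presumably why the PoC could pick
it up easily. A minimal sketch (bucket, key and KMS key ARN are
placeholders I made up):

    import software.amazon.awssdk.core.sync.RequestBody;
    import software.amazon.awssdk.services.s3.S3Client;
    import software.amazon.awssdk.services.s3.model.PutObjectRequest;
    import software.amazon.awssdk.services.s3.model.ServerSideEncryption;

    public class SseKmsSketch {
        public static void main(String[] args) {
            S3Client s3 = S3Client.create();
            s3.putObject(
                PutObjectRequest.builder()
                    .bucket("my-bucket")
                    .key("checkpoints/_metadata")
                    // Encryption settings travel with each write request.
                    .serverSideEncryption(ServerSideEncryption.AWS_KMS)
                    .ssekmsKeyId("arn:aws:kms:eu-west-1:111122223333:key/example")
                    .build(),
                RequestBody.fromString("example"));
        }
    }

SSE-C, by contrast, requires the customer-provided key to be attached
to every read request as well, which likely explains why it is harder
to thread through the FS abstraction.
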
I also have a couple of high-level questions:

1. We discuss that multipart upload has a minimum part size of 5 MB;
does that mean we cannot "commit" less than 5 MB of data? Would users
with low traffic then see large end-to-end latency, or is it still
possible to "commit" the data on checkpoint and restart the multipart
upload?
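
To illustrate where my question comes from, here is a minimal sketch of
the MPU lifecycle with the AWS SDK for Java v2 (bucket, key and the
buffer are placeholders): only the final part may be smaller than 5 MB,
and the data becomes visible only when the upload is completed, i.e. on
"commit". IIRC the Hadoop-based RecoverableWriter works around this by
staging an undersized trailing chunk as a separate small object and
re-ingesting it on recovery, so it would be good to spell out which
approach the native FS takes.

    import software.amazon.awssdk.core.sync.RequestBody;
    import software.amazon.awssdk.services.s3.S3Client;
    import software.amazon.awssdk.services.s3.model.*;

    public class MpuSketch {
        public static void main(String[] args) {
            S3Client s3 = S3Client.create();

            // Start the multipart upload and remember its id.
            String uploadId = s3.createMultipartUpload(
                    CreateMultipartUploadRequest.builder()
                        .bucket("my-bucket").key("checkpoints/part-0").build())
                .uploadId();

            // Every part except the LAST must be >= 5 MB, otherwise
            // CompleteMultipartUpload fails with EntityTooSmall.
            byte[] buffer = new byte[5 * 1024 * 1024];
            String eTag = s3.uploadPart(
                    UploadPartRequest.builder()
                        .bucket("my-bucket").key("checkpoints/part-0")
                        .uploadId(uploadId).partNumber(1).build(),
                    RequestBody.fromBytes(buffer))
                .eTag();

            // Only completion makes the object visible, i.e. the "commit".
            s3.completeMultipartUpload(
                CompleteMultipartUploadRequest.builder()
                    .bucket("my-bucket").key("checkpoints/part-0")
                    .uploadId(uploadId)
                    .multipartUpload(CompletedMultipartUpload.builder()
                        .parts(CompletedPart.builder()
                            .partNumber(1).eTag(eTag).build())
                        .build())
                    .build());
        }
    }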

2. To gain trust in this new file system, we need extensive testing of
failover/recovery to ensure it doesn't lead to data loss, object leaks,
memory leaks, etc. Have we already covered some of the basic durability
testing as part of the PoC, or is it part of the testing plan?

Kind regards,
Alex

On Fri, 6 Feb 2026 at 09:17, Samrat Deb <[email protected]> wrote:
>
> Hi everyone,
>
> Following up on our earlier Thread[1] regarding the architectural
> fragmentation of S3 support, I would like to formally present the progress
> on introducing a native S3 filesystem for Flink.
>
> The current "dual-connector" ecosystem—split between flink-s3-fs-hadoop and
> flink-s3-fs-presto—has reached its technical limits. The Hadoop-based
> implementation introduces significant dependency bloat and persistent
> classpath conflicts, while the Presto-based connector lacks a
> RecoverableWriter, forcing users to manage multiple configurations for
> exactly-once sinks.
>
> To resolve this, I am proposing FLIP-555: Flink Native S3 FileSystem[2].
> This implementation is built directly on the AWS SDK for Java v2, providing
> a unified, high-performance, and Hadoop-free solution for all S3
> interactions.
>
> I have conducted benchmarking comparing the native implementation against
> the existing Presto-based filesystem. The initial results are highly
> motivating, with a visible performance gain. You can find the detailed
> performance analysis here[3].
>
> Following offline discussions with Piotr Nowojski and Gabor Somogyi, the
> PoC and benchmarking results are sufficient to validate that a Native S3
> FileSystem would be a valuable addition to Flink.
>
> With the addition of the Native S3 FileSystem, I have also briefly
> discussed the Deprecation Strategy in the FLIP to ensure operational
> stability:
>
> 1. Phase 1: Introduce flink-s3-fs-native as an optional plugin for
> community validation.
> 2. Phase 2: Promote the native connector to the recommended default once
> feature parity and stability are proven.
> 3. Phase 3: Formally deprecate the legacy Hadoop and Presto connectors
> in a future major release.
> Looking forward to your feedback and suggestions on the design and
> implementation details outlined in the FLIP.
>
>
> Cheers,
> Samrat
>
>
> [1] https://lists.apache.org/thread/2bllhqlbv0pz6t95tsjbszpm9bp9911c
>
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-555%3A+Flink+Native+S3+FileSystem
>
> [3]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406620396
