[ 
https://issues.apache.org/jira/browse/FLINK-39499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076909#comment-18076909
 ] 

Efrat Levitan commented on FLINK-39499:
---------------------------------------

The main MinIO repo has been archived as of 25.4.26, so I guess the migration 
is now inevitable.
I was considering [rustfs|https://github.com/rustfs/rustfs] (Apache-2.0) as an 
alternative; its maintainers claim better performance, though I haven't 
benchmarked it myself.

> Replace MinIO in e2e tests with an Apache 2.0-licensed S3 alternative
> ---------------------------------------------------------------------
>
>                 Key: FLINK-39499
>                 URL: https://issues.apache.org/jira/browse/FLINK-39499
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Tests
>            Reporter: Martijn Visser
>            Priority: Major
>
> MinIO relicensed from Apache 2.0 to AGPLv3 in April 2021. While test-time 
> Docker usage doesn't strictly violate ASF's Category X policy (which governs 
> artifact inclusion), aligning dev/test tooling with Apache 2.0 is a 
> reasonable hygiene preference for an Apache project, and avoids ongoing 
> friction with contributors who want to minimize AGPL exposure in any form.
> The current setup is already broken. common_s3_minio.sh pulls 
> minio/minio:latest, which dropped the FS backend around late 2022. Tests that 
> read pre-staged files (test_batch_wordcount.sh hadoop_minio / presto_minio) 
> now fail with FileNotFoundException on s3://test-data/words. Write-only tests 
> (test_file_sink.sh s3 *) still pass because they create objects via API. The 
> breakage went unnoticed because only write-only variants are in 
> run-nightly-tests.sh.
> We need to investigate what a proper replacement would be. 
> https://rmoff.net/2026/01/14/alternatives-to-minio-for-single-node-local-s3/ 
> already makes a comparison. From a quick check, we could consider S3Proxy 
> (Apache 2.0), which preserves the current "file on disk = S3 object" 
> semantics via its jclouds filesystem backend. SeaweedFS (Apache 2.0) is an 
> alternative if we're willing to restructure the test to upload via the S3 API. 
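
To make the "upload via S3 API" option concrete, here is a minimal Python sketch of the first step such a restructuring might need: turning a pre-staged directory into (bucket, key, path) tuples. It assumes the staged data is laid out as <base_dir>/<bucket>/<key>, mirroring the "file on disk = S3 object" idea; the function name staged_objects and that layout are assumptions made for illustration and are not taken from the Flink test scripts.

```python
from pathlib import Path
from typing import Iterator, Tuple


def staged_objects(base_dir: Path) -> Iterator[Tuple[str, str, Path]]:
    """Yield (bucket, key, local_path) for every file under base_dir.

    Assumed layout (hypothetical): base_dir/<bucket>/<key...>, so each
    top-level directory is treated as a bucket and every file below it
    becomes one object whose key is its path relative to that bucket.
    """
    for bucket_dir in sorted(p for p in base_dir.iterdir() if p.is_dir()):
        for path in sorted(p for p in bucket_dir.rglob("*") if p.is_file()):
            # S3 keys use forward slashes regardless of the host OS.
            key = path.relative_to(bucket_dir).as_posix()
            yield bucket_dir.name, key, path
```

Each tuple could then be handed to whatever S3 client the test uses, for example boto3's client.upload_file(str(path), bucket, key) pointed at the replacement's endpoint. With S3Proxy's filesystem backend this step would presumably be unnecessary, since the files on disk are served directly.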



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
