Hi Russell,

Thanks, this is a concrete option 3.
I agree that we should not add a custom mock just because it exists, and I also agree that maintaining S3/GCS/ADLS/STS protocol handlers is a real cost.
For me the origin of the code is not the deciding point. The deciding point is whether the test approach fits Polaris and whether the maintenance scope is clear.

But I think the important question is whether MinIO + SDK interceptors cover the actual test requirements here.

I don't think the main need is "heavy IO" or "can we talk to an S3-compatible endpoint".
The need is more specific: tests for object-storage-ops / purge-table logic that go through the real SDK/FileIO paths, but where the test can still control the object-store behavior.

The cases I think we need to check are:

* generated objects without preloading a bucket
* synthetic listings, including very large listings, without materializing all objects
* intercepted writes/deletes with assertions
* metadata and conditional responses
* targeted failures
* roughly the same fixture shape across S3/GCS/ADLS
* reasonable setup/teardown time and CI resource use
* compatibility with the Iceberg FileIO configuration paths we test

MinIO + ExecutionInterceptor looks useful for S3 request/response fault injection.
But I don't yet see how it covers the synthetic listing / generated object side without adding another custom stub somewhere.
And for the purge-table tests, that is not a side detail. Being able to simulate huge object sets and listings without creating all objects in a bucket is one of the main reasons this mock is useful.

Also, for S3/GCS/ADLS this is not one interceptor model. It would be MinIO plus AWS ExecutionInterceptor, then Azure pipeline policies, then GCS request initializers/callbacks.
That may still be fine, but it is also custom test-infrastructure maintenance.

If the answer is "use MinIO for normal S3 behavior and add a small custom stub for synthetic listings," then I think we should compare that directly with `object-storage-mock`.
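
For illustration, the "synthetic listings without materializing all objects" requirement can be sketched in plain Java with no SDK or container involved. Everything in this sketch is hypothetical: `SyntheticListing`, `keyAt`, `list`, the key layout, and the use of a plain numeric offset as the continuation token are assumptions made for the example, not the API of `object-storage-mock`, MinIO, or any cloud SDK (real continuation tokens are opaque strings). It only shows the idea that each page of keys is computed on request, so a listing of a million objects never exists in memory or in a bucket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical test helper: a listing whose keys are computed on demand, never stored. */
final class SyntheticListing {
    /** One page of keys plus the token for the next page (null when the listing is exhausted). */
    static final class Page {
        final List<String> keys;
        final String nextToken;

        Page(List<String> keys, String nextToken) {
            this.keys = keys;
            this.nextToken = nextToken;
        }
    }

    private final String prefix;
    private final long totalKeys;

    SyntheticListing(String prefix, long totalKeys) {
        this.prefix = prefix;
        this.totalKeys = totalKeys;
    }

    /** Key for an index; zero-padded so lexicographic order matches index order. */
    String keyAt(long index) {
        return String.format(Locale.ROOT, "%s%012d.parquet", prefix, index);
    }

    /** Returns up to maxKeys keys, starting at the offset encoded in the token (assumed format). */
    Page list(String continuationToken, int maxKeys) {
        if (maxKeys <= 0) {
            throw new IllegalArgumentException("maxKeys must be positive");
        }
        long start = continuationToken == null ? 0L : Long.parseLong(continuationToken);
        long end = Math.min(totalKeys, start + maxKeys);
        List<String> keys = new ArrayList<>();
        for (long i = start; i < end; i++) {
            keys.add(keyAt(i)); // only the current page is ever held in memory
        }
        String next = end < totalKeys ? Long.toString(end) : null;
        return new Page(keys, next);
    }
}

final class SyntheticListingDemo {
    /** Pages through the whole listing, as a purge loop would, and counts the keys seen. */
    static long countAll(SyntheticListing listing, int pageSize) {
        long seen = 0;
        String token = null;
        do {
            SyntheticListing.Page page = listing.list(token, pageSize);
            seen += page.keys.size();
            token = page.nextToken;
        } while (token != null);
        return seen;
    }

    public static void main(String[] args) {
        SyntheticListing listing = new SyntheticListing("warehouse/tbl/data/", 1_000_000L);
        System.out.println(countAll(listing, 1000));
    }
}
```

A stub along these lines still has to be wired into an HTTP listing response for each of S3, GCS, and ADLS before the real SDK/FileIO paths can consume it, which is the part that would have to be written and maintained under either approach.
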
With a MinIO-plus-stub setup like that, we are still maintaining custom test infrastructure, just split between containers, SDK interceptors, and test-local stubs.

So for me, option 3 still needs to show that it covers the requirements above without becoming another set of custom stubs.
If MinIO, Azurite/fake-gcs-server, plus SDK-specific interceptors/callbacks can do that cleanly across the cases we need, including maintenance effort when SDKs change, then that is a real alternative.

If not, I still think the choice is between accepting `object-storage-mock`, using the Nessie test artifact, or deferring the object-storage-ops / purge-table work that needs this level of testing.

Robert

On Wed, Jun 3, 2026 at 5:57 AM Russell Spitzer <[email protected]> wrote:

> Thanks for restarting this, Robert
>
> My core question is: Is this a core competency of the Polaris project, and
> is it something we want to take on maintaining?
>
> I'm not a big fan of taking a test‑scope dependency on another project,
> especially one whose primary contributors are now focused on Polaris
> itself. I'm also not a big fan of bringing in a significant amount of code
> that isn't part of what Polaris is here to do.
>
> I want to be careful that the project doesn't develop a pattern where
> Nessie code is treated as a preferred default for new functionality; that's
> a community‑health question worth being explicit about. My main concern,
> though, is technical: Why does Polaris need this when similar projects in
> the space don't?
>
> A quick inventory of how other Apache projects mock object storage in
> tests:
>
>    - Apache Iceberg, Druid, Flink, Hudi: MinIO via Testcontainers.
>    - Apache Spark, Hadoop: configurable S3 endpoint, so tests can target
>    MinIO, LocalStack, or real S3.
>    - Apache Gravitino: LocalStack via Testcontainers.
>
> None of these projects, most of which have substantially heavier IO
> requirements than Polaris does, have found it necessary to build or import
> custom object‑store mocking infrastructure. So I think the burden of proof
> for going a different direction in Polaris is high.
>
> On the specific interception capabilities you listed, the established
> pattern in the AWS ecosystem is client‑side interception against a real
> S3‑compatible backend, not a programmable mock server. AWS SDK v2 ships
> ExecutionInterceptor
> <https://github.com/aws/aws-sdk-java-v2/blob/master/core/sdk-core/src/main/java/software/amazon/awssdk/core/interceptor/ExecutionInterceptor.java>
> with hooks for modifyHttpRequest, modifyHttpResponse, beforeTransmission,
> onExecutionFailure, etc. The SDK does all the real HTTP serialization and
> parsing, and the test decides what bytes go in or come out at chosen
> lifecycle stages. The Azure and GCS SDKs have equivalents
> (HttpPipelinePolicy, HttpRequestInitializer).
>
> Among Apache projects that test S3 code paths, the only one I can find that
> does test‑time fault injection is Apache Hadoop, and they do it with this
> pattern, not a custom mock server:
>
>    - HADOOP‑19221 / PR #6938 <https://github.com/apache/hadoop/pull/6938>
>    uses an ExecutionInterceptor to "change the status from 200 to 400 after
>    the targeted operation completes, putting the SDK into retry/recovery
>    mode"
>    - HADOOP‑18565 / PR #5421 <https://github.com/apache/hadoop/pull/5421>
>    uses interceptors in ITestS3AEndpointRegion and related tests to throw
>    on demand and capture state.
>    - InconsistentAmazonS3Client
>    <https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/testing.html#Failure_Injection>
>    is the older V1‑SDK client wrapper that's shipped in hadoop-aws for the
>    same purpose.
>
> Iceberg, Spark, and Gravitino don't ship S3 test fault‑injection
> infrastructure at all; they get by entirely with MinIO/LocalStack against
> the real client. So the spread across Apache projects that touch S3 is
> either "no fault injection needed in tests" or "interceptor against a real
> backend." Nobody has chosen to maintain a programmable multi‑cloud HTTP
> mock for this.
>
> Walking through your specific list with MinIO + ExecutionInterceptor, I
> think we can cover everything except the "synthetic listings" approach.
> Since that's really just a single test suite, perhaps just one small test
> stub could be used for that.
>
> So my concrete answer to option 3 is: MinIO (as Iceberg/Druid/Flink/Hudi
> already use) plus ExecutionInterceptor (as Apache Hadoop already uses for
> the same class of test).
>
> Ranking the options:
>
>    1. Use a third‑party library/container already used by these projects.
>    MinIO + ExecutionInterceptor is the standard pattern and Apache Hadoop
>    is direct prior art. If there are specific test scenarios that genuinely
>    can't be expressed that way, I'd like to see those before moving on.
>    2. Move all the code into Polaris. If we have consensus from more
>    community members that this is acceptable, it still makes the project
>    responsible for maintaining HTTP handlers for S3, GCS, ADLS, and STS, a
>    significant ongoing burden that, per the inventory above, no comparable
>    project has chosen to take on. If we go here I'd want a scope limit:
>    only the protocol surfaces actively exercised by Polaris tests today,
>    with future additions gated on shipping with the tests that need them,
>    and a commitment to retire Adobe S3Mock from the Polaris‑specific tests
>    so we don't end up with two S3 mocks in‑tree.
>    3. Depend on the Nessie test jar. I'd put this last for the reasons
>    above.
>
> On Tue, Jun 2, 2026 at 6:11 AM Alexandre Dutra <[email protected]> wrote:
>
> > Hi all,
> >
> > I'm fine with either donating the library to Polaris or reusing the
> > original Nessie one, although I have a slight preference for option 1
> > (accept the donation), as I believe this would provide a faster
> > time-to-market if we ever need to bring changes or new features to the
> > library.
> >
> > Thanks,
> > Alex
> >
> > On Tue, Jun 2, 2026 at 10:56 AM Robert Stupp <[email protected]> wrote:
> > >
> > > Hi Yufei,
> > >
> > > I think there are two separate things here.
> > >
> > > The immediate use case is test infrastructure for Polaris tests that
> > > must go through the real SDK/FileIO HTTP paths while still letting the
> > > test control the object store's behavior.
> > >
> > > For the object-storage-ops / purge-table tests, this means things like
> > > generated objects, synthetic listings, metadata, conditional responses,
> > > intercepted writes/deletes, targeted failures, and using roughly the
> > > same fixture setup for S3/GCS/ADLS.
> > >
> > > That does not mean the mock should become a full object-store emulator.
> > > I would keep it test-only and only add protocol behavior when a
> > > concrete Polaris test needs it.
> > >
> > > The other question is ownership.
> > >
> > > If Polaris only wants to reuse the current behavior, and people are
> > > fine with a Nessie test dependency, then using the Nessie test
> > > artifacts directly is a reasonable option.
> > >
> > > If we expect these tests to drive Polaris-specific behavior over time,
> > > then I think having the code in Polaris is cleaner.
> > > The test utility would live next to the tests that need it, and future
> > > additions would need concrete Polaris test cases.
> > >
> > > So I still think the choices are:
> > >
> > > 1. accept `object-storage-mock` into Polaris as test-only
> > >    infrastructure;
> > > 2. use the Nessie test artifacts directly;
> > > 3. name another concrete library, or combination of libraries, and
> > >    check it against the requirements above; or
> > > 4. defer the object-storage-ops / purge-table work that depends on this
> > >    level of testing.
> > >
> > > If there is another use case or tradeoff for this test-infra decision,
> > > please spell it out.
> > > Otherwise I think we should pick one of these paths and keep
> > > the object-storage-ops API discussion separate.
> > >
> > > Robert
> > >
> > >
> > > On Tue, Jun 2, 2026 at 3:02 AM Yufei Gu <[email protected]> wrote:
> > >
> > > > Thanks Robert and Dmitri for raising this.
> > > >
> > > > One thing I'm still trying to understand better is the use case.
> > > > Could you share a bit more about the use cases you have in mind for
> > > > consuming the Nessie test artifacts directly?
> > > >
> > > > In particular, I'm interested in whether the expectation is simply to
> > > > reuse them as stable test infrastructure, or use alternatives, or
> > > > whether Polaris would potentially need to influence, extend, or
> > > > evolve the behavior of those test utilities independently over time.
> > > > Understanding the anticipated use cases would help evaluate the
> > > > tradeoffs.
> > > >
> > > > Yufei
> > > >
> > > >
> > > > On Mon, Jun 1, 2026 at 8:33 AM Dmitri Bourlatchkov <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi Robert,
> > > > >
> > > > > Thanks for the recap of previous emails!
> > > > >
> > > > > My personal preference would be to reuse the Nessie Object Storage
> > > > > Mock jars in the "test" scope in Polaris (as dependencies). I
> > > > > believe this approach requires less work.
> > > > >
> > > > > However, your proposal for copying that code to Polaris also sounds
> > > > > good to me.
> > > > >
> > > > > In general, I second your point that these testing tools are
> > > > > distinct from Adobe S3Mock in that they provide
> > > > > emulation/validation/assertion capabilities more natural to the
> > > > > JUnit context.
> > > > >
> > > > > Cheers,
> > > > > Dmitri.
> > > > >
> > > > > On Mon, Jun 1, 2026 at 6:33 AM Robert Stupp <[email protected]> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I’d like to restart the object-storage-mock discussion.
> > > > > > The PR discussion has gone in a few directions, and I think we
> > > > > > should decide the test-infra question explicitly.
> > > > > >
> > > > > > A quick recap of where we are:
> > > > > >
> > > > > > - The earlier `[DISCUSS] Object store functionality` [1] thread
> > > > > >   was about the broader object-storage-ops and purge-table work.
> > > > > > - In review of that broader work [3], there was concern about
> > > > > >   depending on Nessie test artifacts directly.
> > > > > > - So the test utilities were split out into a separate PR [4].
> > > > > > - Review of that split-out PR then raised the other question:
> > > > > >   should Polaris accept and maintain that copied code, or should
> > > > > >   we use existing libraries such as Adobe S3Mock instead?
> > > > > > - The current `object-storage-mock` PR [2] is narrower than both
> > > > > >   earlier PRs. It is only about the object-storage mock test
> > > > > >   utility.
> > > > > >
> > > > > > So the question here is not whether to approve the full
> > > > > > object-storage-ops work.
> > > > > > The question is what test infrastructure Polaris wants for
> > > > > > object-store behavior.
> > > > > >
> > > > > > For the object-storage-ops and purge-table work, we need tests
> > > > > > that go through real SDK/FileIO HTTP interactions,
> > > > > > but where the test can still control and check object-store
> > > > > > behavior precisely.
> > > > > > For example: generated objects, synthetic listings, metadata,
> > > > > > conditional responses, intercepted writes/deletes, and targeted
> > > > > > failures.
> > > > > >
> > > > > > A filesystem fixture, a Map-backed fixture, or a normal local S3
> > > > > > emulator are all useful for other tests, but they do not give
> > > > > > that level of operation-level control.
> > > > > >
> > > > > > Adobe S3Mock is useful when a test needs a local S3-compatible
> > > > > > service.
> > > > > > The object-storage-mock is different: it exposes selected
> > > > > > S3/GCS/ADLS/STS protocol surfaces while letting the test define
> > > > > > bucket behavior per operation.
> > > > > > That is what lets the current object-storage-ops and purge-table
> > > > > > tests validate real client interactions without depending on
> > > > > > cloud services.
> > > > > >
> > > > > > Across the reviews, two reasonable concerns came up:
> > > > > >
> > > > > > - avoiding a Nessie test dependency;
> > > > > > - avoiding unnecessary copied code.
> > > > > >
> > > > > > However, we need to choose a path, because the object-storage-ops
> > > > > > and purge-table work depend on this level of testing.
> > > > > >
> > > > > > I see at least these options:
> > > > > >
> > > > > > 1. accept `object-storage-mock` into Polaris as test-only
> > > > > >    infrastructure, subject to the normal ASF provenance/license
> > > > > >    checks
> > > > > > 2. use the Nessie test artifacts directly
> > > > > > 3. identify existing libraries that satisfy the same requirements
> > > > > > 4. defer the object-storage-ops / purge-table work that depends
> > > > > >    on this testing until the test-infra question is resolved.
> > > > > >
> > > > > > My preference is option 1: keep it test-only, limit it to
> > > > > > protocol behavior needed by Polaris tests, and require future
> > > > > > protocol additions to come with concrete Polaris test cases.
> > > > > >
> > > > > > If option 1 or 2 is not acceptable, then option 3 needs to name
> > > > > > the specific library or combination of libraries and check it
> > > > > > against the requirements above.
> > > > > > If there is another path, I would like to understand it.
> > > > > > Otherwise we are effectively choosing option 4 for the work that
> > > > > > depends on these tests.
> > > > > >
> > > > > > Robert
> > > > > >
> > > > > > [1] https://lists.apache.org/thread/0z8nb3w58zb9s617gsoyhzlnz53rt9zx
> > > > > >     ([DISCUSS] Object store functionality)
> > > > > > [2] https://github.com/apache/polaris/pull/4570
> > > > > >     (Add object-storage-mock test utility)
> > > > > > [3] https://github.com/apache/polaris/pull/3256
> > > > > >     (Object store functionality)
> > > > > > [4] https://github.com/apache/polaris/pull/3513
> > > > > >     (Test libraries for storage operations, closed)
