Re: [DISCUSS] Object-storage mock testing

Alexandre Dutra Tue, 02 Jun 2026 06:12:25 -0700

Hi all,

I'm fine with either donating the library to Polaris or reusing the
original Nessie one, although I have a slight preference for option 1
(accept the donation), as I believe this would provide a faster
time-to-market if we ever need to bring changes or new features to the
library.


Thanks,
Alex

On Tue, Jun 2, 2026 at 10:56 AM Robert Stupp <[email protected]> wrote:
>
> Hi Yufei,
>
> I think there are two separate things here.
>
> The immediate use case is test infrastructure for Polaris tests that must
> go through the real SDK/FileIO HTTP paths while still letting the test
> control the object store's behavior.
>
> For the object-storage-ops / purge-table tests, this means things like
> generated objects, synthetic listings, metadata, conditional responses,
> intercepted writes/deletes, targeted failures, and using roughly the same
> fixture setup for S3/GCS/ADLS.
>
> That does not mean the mock should become a full object-store emulator.
> I would keep it test-only and only add protocol behavior when a concrete
> Polaris test needs it.
>
> The other question is ownership.
>
> If Polaris only wants to reuse the current behavior, and people are fine
> with a Nessie test dependency, then using the Nessie test artifacts
> directly is a reasonable option.
>
> If we expect these tests to drive Polaris-specific behavior over time, then
> I think having the code in Polaris is cleaner.
> The test utility would live next to the tests that need it, and future
> additions would need concrete Polaris test cases.
>
> So I still think the choices are:
>
> 1. accept `object-storage-mock` into Polaris as test-only infrastructure;
> 2. use the Nessie test artifacts directly;
> 3. name another concrete library, or combination of libraries, and check it
>    against the requirements above; or
> 4. defer the object-storage-ops / purge-table work that depends on this
>    level of testing.
>
> If there is another use case or tradeoff for this test-infra decision,
> please spell it out.
> Otherwise I think we should pick one of these paths and keep
> the object-storage-ops API discussion separate.
>
> Robert
>
>
> On Tue, Jun 2, 2026 at 3:02 AM Yufei Gu <[email protected]> wrote:
>
> > Thanks Robert and Dmitri for raising this.
> >
> > One thing I'm still trying to understand better is the use case. Could you
> > share a bit more about the use cases you have in mind for consuming the
> > Nessie test artifacts directly?
> >
> > In particular, I'm interested in whether the expectation is simply to reuse
> > them as stable test infrastructure, to use alternatives, or whether Polaris
> > would potentially need to influence, extend, or evolve the behavior of
> > those test utilities independently over time. Understanding the anticipated
> > use cases would help evaluate the tradeoffs.
> >
> > Yufei
> >
> >
> > On Mon, Jun 1, 2026 at 8:33 AM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi Robert,
> > >
> > > Thanks for the recap of previous emails!
> > >
> > > My personal preference would be to reuse the Nessie Object Storage Mock
> > > jars in the "test" scope in Polaris (as dependencies). I believe this
> > > approach requires less work.
> > >
> > > However, your proposal for copying that code to Polaris also sounds good
> > > to me.
> > >
> > > In general, I second your point that these testing tools are distinct
> > > from Adobe S3Mock in that they provide emulation/validation/assertion
> > > capabilities more natural to the JUnit context.
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Mon, Jun 1, 2026 at 6:33 AM Robert Stupp <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’d like to restart the object-storage-mock discussion.
> > > > The PR discussion has gone in a few directions, and I think we should
> > > > decide the test-infra question explicitly.
> > > >
> > > > A quick recap of where we are:
> > > >
> > > > - The earlier `[DISCUSS] Object store functionality` [1] thread was about
> > > >   the broader object-storage-ops and purge-table work.
> > > > - In review of that broader work [3], there was concern about depending on
> > > >   Nessie test artifacts directly.
> > > > - So the test utilities were split out into a separate PR [4].
> > > > - Review of that split-out PR then raised the other question: should
> > > >   Polaris accept and maintain that copied code, or should we use existing
> > > >   libraries such as Adobe S3Mock instead?
> > > > - The current `object-storage-mock` PR [2] is narrower than both earlier
> > > >   PRs. It is only about the object-storage mock test utility.
> > > >
> > > > So the question here is not whether to approve the full
> > > > object-storage-ops work.
> > > > The question is what test infrastructure Polaris wants for object-store
> > > > behavior.
> > > >
> > > > For the object-storage-ops and purge-table work, we need tests that go
> > > > through real SDK/FileIO HTTP interactions,
> > > > but where the test can still control and check object-store behavior
> > > > precisely.
> > > > For example: generated objects, synthetic listings, metadata, conditional
> > > > responses, intercepted writes/deletes, and targeted failures.
> > > >
> > > > A filesystem fixture, a Map-backed fixture, or a normal local S3 emulator
> > > > are all useful for other tests, but they do not give that level of
> > > > operation-level control.
> > > >
> > > > Adobe S3Mock is useful when a test needs a local S3-compatible service.
> > > > The object-storage-mock is different: it exposes selected S3/GCS/ADLS/STS
> > > > protocol surfaces while letting the test define bucket behavior per
> > > > operation.
> > > > That is what lets the current object-storage-ops and purge-table tests
> > > > validate real client interactions without depending on cloud services.
> > > >
> > > > Across the reviews, two reasonable concerns came up:
> > > >
> > > > - avoiding a Nessie test dependency;
> > > > - avoiding unnecessary copied code.
> > > >
> > > > However, we need to choose a path, because the object-storage-ops and
> > > > purge-table work depend on this level of testing.
> > > >
> > > > I see at least these options:
> > > >
> > > > 1. accept `object-storage-mock` into Polaris as test-only infrastructure,
> > > >    subject to the normal ASF provenance/license checks
> > > > 2. use the Nessie test artifacts directly
> > > > 3. identify existing libraries that satisfy the same requirements
> > > > 4. defer the object-storage-ops / purge-table work that depends on this
> > > >    testing until the test-infra question is resolved.
> > > >
> > > > My preference is option 1: keep it test-only, limit it to protocol
> > > > behavior needed by Polaris tests, and require future protocol additions
> > > > to come with concrete Polaris test cases.
> > > >
> > > > If option 1 or 2 is not acceptable, then option 3 needs to name the
> > > > specific library or combination of libraries and check it against the
> > > > requirements above.
> > > > If there is another path, I would like to understand it.
> > > > Otherwise we are effectively choosing option 4 for the work that depends
> > > > on these tests.
> > > >
> > > > Robert
> > > >
> > > > [1] https://lists.apache.org/thread/0z8nb3w58zb9s617gsoyhzlnz53rt9zx
> > > >     ([DISCUSS] Object store functionality)
> > > > [2] https://github.com/apache/polaris/pull/4570 (Add object-storage-mock
> > > >     test utility)
> > > > [3] https://github.com/apache/polaris/pull/3256 (Object store functionality)
> > > > [4] https://github.com/apache/polaris/pull/3513 (Test libraries for storage
> > > >     operations, closed)
> > > >
> > >
> >
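
To make the requirement discussed in this thread more concrete (tests that
go through a real HTTP path while the test itself decides how the object
store answers each operation), a minimal Java sketch might look like the
following. It is not the object-storage-mock or Nessie API: PerOperationMock,
Reply, onGet, onDelete, and deletedKeys are hypothetical names chosen for
illustration, and the sketch uses only the JDK's built-in HttpServer and
HttpClient instead of an S3/GCS/ADLS SDK or any real storage protocol.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

// Hypothetical illustration only; not the object-storage-mock API.
final class PerOperationMock implements AutoCloseable {
  // What the test wants the fake store to answer for one request.
  record Reply(int status, String body) {}

  private final HttpServer server;
  // Keys of intercepted DELETE requests, so the test can assert on them.
  final List<String> deletedKeys = new CopyOnWriteArrayList<>();

  PerOperationMock(Function<String, Reply> onGet, Function<String, Reply> onDelete)
      throws IOException {
    // Port 0 asks the OS for any free local port.
    server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
    server.createContext("/", exchange -> {
      // Treat the request path (without the leading slash) as the object key.
      String key = exchange.getRequestURI().getPath().substring(1);
      Reply reply = switch (exchange.getRequestMethod()) {
        case "GET" -> onGet.apply(key);
        case "DELETE" -> {
          deletedKeys.add(key);
          yield onDelete.apply(key);
        }
        default -> new Reply(405, "");
      };
      byte[] bytes = reply.body().getBytes(StandardCharsets.UTF_8);
      // A response length of -1 means "no response body".
      exchange.sendResponseHeaders(reply.status(), bytes.length == 0 ? -1 : bytes.length);
      if (bytes.length > 0) {
        try (OutputStream out = exchange.getResponseBody()) {
          out.write(bytes);
        }
      }
      exchange.close();
    });
    server.start();
  }

  // URI of an object key on the local mock server.
  URI uri(String key) {
    return URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/" + key);
  }

  // Sends a body-less request through a real HTTP client.
  static HttpResponse<String> send(String method, URI uri)
      throws IOException, InterruptedException {
    HttpRequest request =
        HttpRequest.newBuilder(uri).method(method, HttpRequest.BodyPublishers.noBody()).build();
    return HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
  }

  @Override
  public void close() {
    server.stop(0);
  }
}
```

A test could, for example, pass an onGet handler that generates object
content from the key and an onDelete handler that returns 500 for one chosen
key, then assert both on the client-visible responses and on deletedKeys.
That corresponds to the "generated objects", "intercepted writes/deletes",
and "targeted failures" items above, in a much reduced form.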
