Hi Yufei,

I think there are two separate questions here: the use case and ownership.

The immediate use case is test infrastructure for Polaris tests that must
go through the real SDK/FileIO HTTP paths while still letting the test
control the object store's behavior.

For the object-storage-ops / purge-table tests, this means things like
generated objects, synthetic listings, metadata, conditional responses,
intercepted writes/deletes, and targeted failures, with roughly the same
fixture setup for S3/GCS/ADLS.

That does not mean the mock should become a full object-store emulator.
I would keep it test-only and only add protocol behavior when a concrete
Polaris test needs it.
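
To make "letting the test control the object store's behavior" concrete,
here is a minimal sketch of the idea. Every name in it is hypothetical and
it is not the object-storage-mock API; it also leaves out the HTTP layer
that the real utility puts in front of this kind of per-operation behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical names throughout; this is NOT the object-storage-mock API.
final class SketchBucket {
  private final Function<String, List<String>> lister; // prefix -> keys
  private final Predicate<String> deleter; // key -> whether the delete succeeds
  // Every delete the code under test attempted, kept for later assertions.
  final List<String> deleteAttempts = new ArrayList<>();

  SketchBucket(Function<String, List<String>> lister, Predicate<String> deleter) {
    this.lister = lister;
    this.deleter = deleter;
  }

  List<String> list(String prefix) {
    return lister.apply(prefix);
  }

  boolean delete(String key) {
    deleteAttempts.add(key);
    return deleter.test(key);
  }
}

final class SketchFixtures {
  private SketchFixtures() {}

  // Synthetic listing: keys are generated on demand, nothing is stored.
  static List<String> generatedKeys(String prefix, int count) {
    return IntStream.range(0, count)
        .mapToObj(i -> prefix + "data-" + i + ".parquet")
        .collect(Collectors.toList());
  }

  // Targeted failure: every delete succeeds except the one for failingKey.
  static SketchBucket failingDeleteOf(String failingKey, int count) {
    return new SketchBucket(
        prefix -> generatedKeys(prefix, count), key -> !key.equals(failingKey));
  }
}
```

A purge-table test could then point the code under test at such a bucket,
let it run, and assert on `deleteAttempts` and on how the single failed
delete was handled.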

The other question is ownership.

If Polaris only wants to reuse the current behavior, and people are fine
with a Nessie test dependency, then using the Nessie test artifacts
directly is a reasonable option.

If we expect these tests to drive Polaris-specific behavior over time, then
I think having the code in Polaris is cleaner. The test utility would live
next to the tests that need it, and future additions would need concrete
Polaris test cases.

So I still think the choices are:

1. accept `object-storage-mock` into Polaris as test-only infrastructure;
2. use the Nessie test artifacts directly;
3. name another concrete library, or combination of libraries, and check it
   against the requirements above; or
4. defer the object-storage-ops / purge-table work that depends on this
   level of testing.

If there is another use case or tradeoff for this test-infra decision,
please spell it out. Otherwise I think we should pick one of these paths
and keep the object-storage-ops API discussion separate.

Robert


On Tue, Jun 2, 2026 at 3:02 AM Yufei Gu <[email protected]> wrote:

> Thanks Robert and Dmitri for raising this.
>
> One thing I'm still trying to understand better is the use case. Could you
> share a bit more about the use cases you have in mind for consuming the
> Nessie test artifacts directly?
>
> In particular, I'm interested in whether the expectation is simply to reuse
> them as stable test infrastructure, or use alternatives, or whether Polaris
> would potentially need to influence, extend, or evolve the behavior of
> those test utilities independently over time. Understanding the anticipated
> use cases would help evaluate the tradeoffs.
>
> Yufei
>
>
> On Mon, Jun 1, 2026 at 8:33 AM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Robert,
> >
> > Thanks for the recap of previous emails!
> >
> > My personal preference would be to reuse the Nessie Object Storage Mock
> > jars in the "test" scope in Polaris (as dependencies). I believe this
> > approach requires less work.
> >
> > However, your proposal for copying that code to Polaris also sounds good
> > to me.
> >
> > In general, I second your point that these testing tools are distinct
> > from Adobe S3Mock in that they provide emulation/validation/assertion
> > capabilities more natural to the JUnit context.
> >
> > Cheers,
> > Dmitri.
> >
> > On Mon, Jun 1, 2026 at 6:33 AM Robert Stupp <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I’d like to restart the object-storage-mock discussion.
> > > The PR discussion has gone in a few directions, and I think we should
> > > decide the test-infra question explicitly.
> > >
> > > A quick recap of where we are:
> > >
> > > - The earlier `[DISCUSS] Object store functionality` [1] thread was
> > >   about the broader object-storage-ops and purge-table work.
> > > - In review of that broader work [3], there was concern about depending
> > >   on Nessie test artifacts directly.
> > > - So the test utilities were split out into a separate PR [4].
> > > - Review of that split-out PR then raised the other question: should
> > >   Polaris accept and maintain that copied code, or should we use
> > >   existing libraries such as Adobe S3Mock instead?
> > > - The current `object-storage-mock` PR [2] is narrower than both
> > >   earlier PRs. It is only about the object-storage mock test utility.
> > >
> > > So the question here is not whether to approve the full
> > > object-storage-ops work.
> > > The question is what test infrastructure Polaris wants for object-store
> > > behavior.
> > >
> > > For the object-storage-ops and purge-table work, we need tests that go
> > > through real SDK/FileIO HTTP interactions,
> > > but where the test can still control and check object-store behavior
> > > precisely.
> > > For example: generated objects, synthetic listings, metadata,
> > > conditional responses, intercepted writes/deletes, and targeted failures.
> > >
> > > A filesystem fixture, a Map-backed fixture, or a normal local S3
> > > emulator are all useful for other tests, but they do not give that
> > > level of operation-level control.
> > >
> > > Adobe S3Mock is useful when a test needs a local S3-compatible service.
> > > The object-storage-mock is different: it exposes selected
> > > S3/GCS/ADLS/STS protocol surfaces while letting the test define bucket
> > > behavior per operation.
> > > That is what lets the current object-storage-ops and purge-table tests
> > > validate real client interactions without depending on cloud services.
> > >
> > > Across the reviews, two reasonable concerns came up:
> > >
> > > - avoiding a Nessie test dependency;
> > > - avoiding unnecessary copied code.
> > >
> > > However, we need to choose a path, because the object-storage-ops and
> > > purge-table work depend on this level of testing.
> > >
> > > I see at least these options:
> > >
> > > 1. accept `object-storage-mock` into Polaris as test-only
> > >    infrastructure, subject to the normal ASF provenance/license checks
> > > 2. use the Nessie test artifacts directly
> > > 3. identify existing libraries that satisfy the same requirements
> > > 4. defer the object-storage-ops / purge-table work that depends on this
> > >    testing until the test-infra question is resolved.
> > >
> > > My preference is option 1: keep it test-only, limit it to protocol
> > > behavior needed by Polaris tests, and require future protocol additions
> > > to come with concrete Polaris test cases.
> > >
> > > If option 1 or 2 is not acceptable, then option 3 needs to name the
> > > specific library or combination of libraries and check it against the
> > > requirements above.
> > > If there is another path, I would like to understand it.
> > > Otherwise we are effectively choosing option 4 for the work that
> > > depends on these tests.
> > >
> > > Robert
> > >
> > > [1] https://lists.apache.org/thread/0z8nb3w58zb9s617gsoyhzlnz53rt9zx
> > >     ([DISCUSS] Object store functionality)
> > > [2] https://github.com/apache/polaris/pull/4570
> > >     (Add object-storage-mock test utility)
> > > [3] https://github.com/apache/polaris/pull/3256
> > >     (Object store functionality)
> > > [4] https://github.com/apache/polaris/pull/3513
> > >     (Test libraries for storage operations, closed)
> > >
> >
>
