Hi Yong,

+1 to adding a maintenance section to the helm chart.

Cheers,
Dmitri.

On Mon, Jun 8, 2026 at 10:13 PM Yong Zheng <[email protected]> wrote:

> Hello Nándor and Dmitri,
>
> I agree this is becoming more important as we persist more data in the
> Polaris backend. Today we have at least the events tables and the persisted
> Iceberg metrics tables that need some form of cleanup and retention
> management.
>
> The admin tool approach sounds reasonable to me. It gives operators control
> over when cleanup runs and allows them to use existing scheduling
> mechanisms such as k8s CronJobs.
>
> It would also be nice to avoid building a separate cleanup solution for
> every feature. If we go down the admin tool route, perhaps we can have a
> common maintenance framework that supports events cleanup, metrics cleanup,
> engine-specific maintenance tasks (for example, rebuilding indexes), as
> well as future maintenance operations.
>
> I am pretty open on the implementation details. One thing that I
> think would be beneficial is introducing a maintenance section in the
> Polaris helm chart. That would allow operators to configure and schedule
> maintenance tasks without having to create separate one-off charts or jobs
> for each task.
>
> Thanks,
> Yong Zheng
>
>
> On Mon, Jun 8, 2026 at 8:01 PM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Yong,
> >
> > Thanks for starting this discussion!
> >
> > From my POV the Admin tool does look like a good fit for this capability.
> > It is similar to the NoSQL maintenance task [3395].
> >
> > I believe end users could then schedule the maintenance runs according to
> > their deployment mechanics, e.g. via k8s jobs.
> >
> > I made an attempt at refactoring the Admin CLI for pluggability in terms of
> > sub-commands in [3947]. We could revive that PR if there's community
> > interest. The Metrics / Events maintenance tasks could then be plugged in
> > similarly to NoSQL maintenance.
> >
> > [3395] https://github.com/apache/polaris/pull/3395
> >
> > [3947] https://github.com/apache/polaris/pull/3947
> >
> > Cheers,
> > Dmitri.
> >
> > On Sun, Jun 7, 2026 at 2:34 PM Yong Zheng <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > A while back Alex raised https://github.com/apache/polaris/issues/2573
> > > requesting a mechanism to purge the events table. Recently, persisted
> > > Iceberg metrics were also introduced
> > > (https://github.com/apache/polaris/pull/3385), which created two tables
> > > (read and write metrics tables) that also lack life cycle management,
> > > so their size will grow indefinitely. We will likely need a mechanism
> > > to handle both.
> > >
> > > I am wondering what the community thinks about this. Should this be part
> > > of the admin tool, where admins/ops make the call on when to clean up, or
> > > should we have a janitor process that runs automatically (users would need
> > > to provide rules on what to clean up, such as a time-based TTL)?
> > >
> > > Thanks,
> > > Yong Zheng
> > >
> >
>
