Hello!
I have created a PR https://github.com/apache/polaris/pull/3293 to address
this proposal.
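
For anyone skimming the thread, here is a minimal sketch of the shape
discussed below: a single flat event type, typed attribute keys, and one
pruning rule applied by key rather than per event class. It is only an
illustration under assumptions and not the code in the PR: the names
FlatEventSketch, AttributeKey, Event, prune, CATALOG_NAME and
TABLE_METADATA are hypothetical, and table metadata is modeled as a plain
String to keep the example self-contained.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.UnaryOperator;

public class FlatEventSketch {

  /** Hypothetical typed key: carries a name plus the value type it accepts. */
  static final class AttributeKey<T> {
    final String name;
    final Class<T> type;

    private AttributeKey(String name, Class<T> type) {
      this.name = name;
      this.type = type;
    }

    static <T> AttributeKey<T> of(String name, Class<T> type) {
      return new AttributeKey<>(name, type);
    }

    @Override
    public String toString() {
      return name;
    }
  }

  // Hypothetical well-known keys; custom keys can be declared the same way.
  static final AttributeKey<String> CATALOG_NAME =
      AttributeKey.of("catalog_name", String.class);
  static final AttributeKey<String> TABLE_METADATA =
      AttributeKey.of("table_metadata", String.class);

  /** Single flat event: a type ID plus a bag of typed attributes. */
  static final class Event {
    private final String type;
    private final Map<AttributeKey<?>, Object> attributes = new LinkedHashMap<>();

    Event(String type) {
      this.type = type;
    }

    String type() {
      return type;
    }

    <T> Event with(AttributeKey<T> key, T value) {
      // cast() rejects values of the wrong runtime type.
      attributes.put(key, key.type.cast(value));
      return this;
    }

    <T> Optional<T> get(AttributeKey<T> key) {
      return Optional.ofNullable(key.type.cast(attributes.get(key)));
    }

    Map<AttributeKey<?>, Object> attributes() {
      return Collections.unmodifiableMap(attributes);
    }
  }

  /** Returns a copy of the event with the rule applied if the key is present. */
  static <T> Event prune(Event event, AttributeKey<T> key, UnaryOperator<T> rule) {
    Event copy = new Event(event.type());
    copy.attributes.putAll(event.attributes);
    event.get(key).ifPresent(value -> copy.with(key, rule.apply(value)));
    return copy;
  }

  public static void main(String[] args) {
    Event event = new Event("AFTER_COMMIT_TABLE")
        .with(CATALOG_NAME, "prod")
        .with(TABLE_METADATA, "a very large metadata document");
    // One rule covers every event that carries TABLE_METADATA.
    Event pruned = prune(event, TABLE_METADATA, m -> m.substring(0, 6) + "...");
    System.out.println(pruned.type() + " " + pruned.attributes());
  }
}
```

In this sketch, keys declared as constants avoid the typo risk of
free-form strings while still letting users define their own keys, which
is the trade-off between the enum and AttributeKey options in the thread.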

Thanks,
Oleg

On Tue, Nov 25, 2025 at 2:00 PM Oleg Soloviov <[email protected]> wrote:

> Hi Adnan,
>
> As I see no further objections, I would like to start working on it.
>
> Regarding the attributes, the AttributeKey approach looks like a
> reasonable compromise between flexibility and type-safety, but I need to
> think it over.
>
> Thanks,
> Oleg
>
> On Mon, Nov 17, 2025 at 9:24 PM Adnan Hemani
> <[email protected]> wrote:
>
>> Hi Alex,
>>
>> >  I'm actually leaning towards an AttributeKey approach, similar to Netty
>>
>> I'm not sure this helps address the dangers around using a free-form
>> string as the attribute key. But as you said, this is more of an
>> implementation detail - we can work through it together on any
>> potential PR :)
>>
>> I think Alex and I are aligned - happy to hear any other community
>> opinions on this topic, but I think we might be ready to start work on
>> this within the next few days if there are no further opinions @Oleg.
>>
>> Best,
>> Adnan Hemani
>>
>> On Mon, Nov 17, 2025 at 5:37 AM Alexandre Dutra <[email protected]>
>> wrote:
>>
>> > Hi all,
>> >
>> > > I propose the following (building on Alex's proposal) to move this
>> > > conversation forward: the new method signature would be
>> > > `Map<PolarisEvent.EventPropertyType, Object> attributes()`
>> >
>> > I agree about the potential benefit of strongly-typed attribute keys.
>> > While I initially suggested String for simplicity, I'm actually
>> > leaning towards an AttributeKey approach, similar to Netty [1]. The
>> > concern with using an enum is that it might restrict users from
>> > defining their own custom attributes. But that's more an
>> > implementation detail.
>> >
>> > > All other events that only generate an "after" metadata object should
>> > > store their metadata in "metadataAfter" and leave "metadataBefore" as
>> > > unset, just like any other unused property.
>> >
>> > I have no issues with that logic.
>> >
>> > (But I am surprised by the current design where "before" state
>> > information is included in "after" events, and "after" state
>> > information is included in "before" events. Given the substantial size
>> > of objects like TableMetadata, this dual inclusion looks redundant. It
>> > should be possible instead to correlate the before event with its
>> > after counterpart and build a before/after diff of the change, if
>> > desired. But that's a different topic.)
>> >
>> > Thanks,
>> > Alex
>> >
>> > [1]: https://github.com/netty/netty/blob/fc0d763ca983c8290d087ed2887f112963d812d2/common/src/main/java/io/netty/util/AttributeKey.java#L25
>> >
>> > On Fri, Nov 14, 2025 at 6:18 PM Adnan Hemani
>> > <[email protected]> wrote:
>> > >
>> > > Hi all,
>> > >
>> > > Very sorry for the late reply - this week has been busy. I was (still
>> > > somewhat am) in favor of strongly-typed events. I had earlier formed my
>> > > opinion on this based on other systems which do use their events later
>> > > within their execution. It seems we do not have this use case yet - and
>> > > it is not on the near horizon either, as Dmitri has noted.
>> > >
>> > > However, my one remaining concern with keeping PolarisEvents as a
>> > > flattened "bag of properties" is, unless we have comprehensive
>> > > per-event testing (which defeats the whole point of removing the
>> > > strongly-typed events structure), we may be vulnerable to typos and
>> > > inconsistent naming, which could effectively render the unified
>> > > filtering/pruning mechanisms useless. As a result, I propose the
>> > > following (building on Alex's proposal) to move this conversation
>> > > forward: the new method signature would be
>> > > `Map<PolarisEvent.EventPropertyType, Object> attributes()` where
>> > > EventPropertyType is an enum defined within PolarisEvent and contains
>> > > all the different types of properties an event could have.
>> > >
>> > > Edge case call-out: There will be special care needed for events such
>> > > as (Before/After)CommitTableEvent, which have metadata objects for
>> > > before AND after - but these can be modeled using two separate
>> > > EventPropertyType objects: one for metadataBefore and one for
>> > > metadataAfter. All other events that only generate an "after" metadata
>> > > object should store their metadata in "metadataAfter" and leave
>> > > "metadataBefore" as unset, just like any other unused property. This
>> > > may slightly complicate the unified filtering/pruning logic - but
>> > > this, IMO, is an acceptable balance.
>> > >
>> > > WDYT?
>> > >
>> > > Best,
>> > > Adnan Hemani
>> > >
>> > > On Fri, Nov 14, 2025 at 1:48 AM Oleg Soloviov <[email protected]> wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > It looks like we have a lazy consensus on this proposal. If that's
>> > > > the case and there are no further objections, I would like to work
>> > > > on this one.
>> > > >
>> > > > Thanks,
>> > > > Oleg
>> > > >
>> > > > On Sat, Nov 8, 2025 at 12:13 AM Dmitri Bourlatchkov <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > Hi Alex,
>> > > > >
>> > > > > I agree that using a flat (single class?) type hierarchy for
>> > > > > events on the server side is reasonable. Polaris Server itself does
>> > > > > not appear to "read" the events it produces, so maintaining the
>> > > > > multitude of getters does seem like an unnecessary overhead. At the
>> > > > > same time producing well-structured payloads for delivering events
>> > > > > to external systems (including persistence in the Polaris database)
>> > > > > can be achieved without a verbose type hierarchy.
>> > > > >
>> > > > > Cheers,
>> > > > > Dmitri.
>> > > > >
>> > > > > On Fri, Nov 7, 2025 at 11:30 AM Alexandre Dutra <[email protected]>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi all,
>> > > > > >
>> > > > > > I'm writing to express my concerns about the current state of
>> > > > > > the PolarisEvent API and to propose a solution.
>> > > > > >
>> > > > > > Current challenges:
>> > > > > >
>> > > > > > 1) Excessive complexity: the PolarisEvent interface currently
>> > > > > > has over 150 concrete subtypes, with a corresponding number of
>> > > > > > methods in the PolarisEventListener interface. This forces each
>> > > > > > concrete listener to implement all 150+ methods, even when the
>> > > > > > logic is similar or identical, leading to significant boilerplate
>> > > > > > (see example [1] from a recent PR).
>> > > > > >
>> > > > > > 2) Manual processes: afaik the current plan for event pruning
>> > > > > > (e.g., removing sensitive or large data) is to implement this
>> > > > > > event by event. This has been a slow process so far: we only have
>> > > > > > 2-3 events implemented, and we still have 147 more to go.
>> > > > > >
>> > > > > > While I generally advocate for strongly typed APIs, I believe
>> > > > > > that in this specific context, the PolarisEvent hierarchy is
>> > > > > > slowing down the development of event-related features.
>> > > > > >
>> > > > > > Do we need so many subtypes? Events are very short-lived
>> > > > > > objects; they are created, immediately passed to a listener, and
>> > > > > > then garbage-collected. Besides, most listeners will likely apply
>> > > > > > the same logic to all events (basically: serialize and dispatch).
>> > > > > > This hints at a type hierarchy that isn't useful to its main
>> > > > > > consumers.
>> > > > > >
>> > > > > > My proposal is to completely flatten the PolarisEvent hierarchy.
>> > > > > > Instead of numerous concrete types, we would have a single
>> > > > > > implementation. This implementation would expose the methods I'm
>> > > > > > adding in [2], including type() which allows distinguishing
>> > > > > > events by type ID.
>> > > > > >
>> > > > > > It would also expose a new method: Map<String, Object>
>> > > > > > attributes().
>> > > > > >
>> > > > > > An event factory would be responsible for creating events and
>> > > > > > populating these attributes using a common set of well-defined,
>> > > > > > typed attribute keys such as "catalog_name", "table_identifier",
>> > > > > > "table_metadata", etc.
>> > > > > >
>> > > > > > This creates a schemaless-ish view of the event, which is ideal
>> > > > > > for pruning and serialization. It would enable us to apply common
>> > > > > > rules more efficiently. For example:
>> > > > > >
>> > > > > > 1) All events containing the "table_metadata" attribute could
>> > > > > > automatically apply a pruning logic to reduce its size.
>> > > > > >
>> > > > > > 2) All events containing a specific attribute could
>> > > > > > automatically have sensitive data removed from its value.
>> > > > > >
>> > > > > > I'm curious to hear what the community thinks of this proposal.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Alex
>> > > > > >
>> > > > > > [1]: https://github.com/vchag/polaris/blob/4c0aef587e63d5e60d657561a0a53701417f324b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/AllEventsForwardingListener.java
>> > > > > > [2]: https://github.com/apache/polaris/pull/2998
>> > > > > >
>> > > > >
>> > > >
>> >
>>
>
