Hi Shammon FY,

Thanks for your comments. Here are my thoughts on each point.

1. Public Interface
Thank you for the reminder. I overlooked the correspondence between
the public interfaces listed in the PIP and the "@Public" annotation.
My intent is that Event, Listener, and ListenerFactory are public,
while the others are non-public.

2. Add `Factory` to create `Listener`
Great suggestion. I have already added ListenerFactory to the PIP.

3. Flink and Spark support meta-data listeners
It would be very inconvenient for users to obtain DDL information
through the engines. First, users would have to integrate with each
engine's own listener implementation. Second, apart from Flink and
Spark, many engines do not support meta-data listeners. As a
general-purpose data lake, Paimon should have its own mechanism for
meta-data listeners.

4. Report events such as commit/compact to an external system
CompactEvent: Currently, the compaction state is a black box, and users
cannot obtain this information through SQL or API.
CommitEvent: Currently, querying through SQL or API relies on polling,
which makes it difficult for users to notice commits promptly and
consumes a lot of resources.
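
To make points 1, 2 and 4 more concrete, here is a rough Java sketch of
how a factory-created listener could receive commit events by push
instead of polling. It only illustrates the idea. The method names
(`describe`, `identifier`, `create`, `onEvent`), the fields of
`CommitEvent`, and the `Table` and `RecordingListener` classes are all
assumptions made for this example, not the interfaces defined in the PIP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListenerSketch {

    /** Hypothetical base type for anything a table reports. */
    interface Event {
        String describe();
    }

    /** Hypothetical commit event that exposes only simple fields. */
    static final class CommitEvent implements Event {
        final String table;
        final long snapshotId;

        CommitEvent(String table, long snapshotId) {
            this.table = table;
            this.snapshotId = snapshotId;
        }

        @Override
        public String describe() {
            return "commit " + table + "@" + snapshotId;
        }
    }

    /** Hypothetical listener: the table pushes events to it. */
    interface Listener {
        void onEvent(Event event);
    }

    /** Hypothetical factory, in the spirit of CatalogFactory -> Catalog. */
    interface ListenerFactory {
        String identifier();

        Listener create();
    }

    /** Example listener that records a description of every event it sees. */
    static final class RecordingListener implements Listener {
        final List<String> seen = new ArrayList<>();

        @Override
        public void onEvent(Event event) {
            seen.add(event.describe());
        }
    }

    /** Stand-in for a table; it notifies its listeners on every commit. */
    static final class Table {
        private final String name;
        private final List<Listener> listeners = new ArrayList<>();
        private long nextSnapshotId = 1;

        Table(String name, List<ListenerFactory> factories) {
            this.name = name;
            for (ListenerFactory factory : factories) {
                listeners.add(factory.create());
            }
        }

        void commit() {
            Event event = new CommitEvent(name, nextSnapshotId++);
            for (Listener listener : listeners) {
                listener.onEvent(event); // push, no polling by the consumer
            }
        }
    }

    /** Commits twice and returns what the listener observed. */
    public static List<String> demo() {
        final RecordingListener recorder = new RecordingListener();
        ListenerFactory factory = new ListenerFactory() {
            @Override
            public String identifier() {
                return "recording";
            }

            @Override
            public Listener create() {
                return recorder;
            }
        };
        Table table = new Table("orders", Arrays.asList(factory));
        table.commit();
        table.commit();
        return recorder.seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the external system sees each commit as soon as it
happens, without querying a system table on a schedule.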

Best
Shidayang

Shammon FY <[email protected]> wrote on Fri, Aug 18, 2023 at 14:07:
>
> Thanks @Jocean for starting this discussion. I have some comments.
>
> 1. About the public interfaces in the PIP, we should add @Public for them
> such as `Event`, `Listener` and even `CommitEvent` and other events. But
> for `Listeners`, I don't think it should be a public interface. All fields
> in the public interface for users should be `Public` too, but I found the
> information such as `ManifestEntry` in `CommitEvent` is not a public
> interface. I think you may need to reconsider which interfaces need to be
> marked with @Public and which are not.
>
> 2. In general, it is better to give a `Factory` to create `Listener` which
> should be all marked as `@Public` and you can see
> `CatalogFactory`->`Catalog` as an example.
>
> 3. Currently Flink and Spark support meta-data listeners and we can support
> reporting DDL information there. Do we need to add the same listener in
> Paimon?
>
> 4. Do we need to report events such as commit/compact to an
> external system? Currently we have some system tables and users can get
> this information by SQL or API. Should the external system query this
> information regularly instead of having a listener push it?
>
> Best,
> Shammon FY
>
>
> On Tue, Aug 15, 2023 at 11:08 AM Jocean shi <[email protected]> wrote:
>
> > Hi devs:
> >
> > We would like to start a discussion about PIP-8: Introduce listeners
> > for Paimon[1].
> >
> > In production environments, users often need to perceive the state
> > changes of a Paimon table,
> > such as whether a new file has been committed to the table, in which
> > partitions the committed files are,
> > the size and number of the committed files, the status and type of
> > compaction, operations like table creation, deletion, and schema
> > changes, etc.
> > So, we introduce a Listener system for Paimon.
> > Looking forward to hearing from you.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/PAIMON/PIP-8%3A+Introduce+listeners+for+Paimon
> >
> > Best
> > shidayang
> >
