Hi Jocean,

Thanks for your explanation. I still have some questions:

1. What are the DDL events for Paimon used for? If you need to show Paimon
tables in your system, I think it's better to define table-related
interfaces, which you can then implement for Paimon, Iceberg and Hudi
instead of adding a DDL listener to each of them. This is more general, and
you can even manage tables from other systems such as databases, MongoDB
and Hive.

2. If some system information in `CompactEvent` is currently missing, or
there's no information about `compact` at all, I think a better way is to
add this system information to Paimon itself, rather than adding a listener
and creating an event that carries it. The external system can then get
the information directly through SQL or the API, which is a more reasonable
approach.

3. Also, what is the `CommitEvent` used for? Currently we have metrics for
`Commit`, and jobs can report them. How about adding a customized metrics
reporter instead of a listener for `CommitEvent`?
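
To make points 1 and 3 more concrete, here is a rough sketch of what I have
in mind. It is only an illustration: every name in it (`TableProvider`,
`StaticTableProvider`, `TableOverview`, `CommitMetricReporter`,
`CollectingReporter`) is hypothetical, and none of them is an existing
Paimon, Iceberg or Hudi API. A real provider would call the format's own
catalog instead of reading from a map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical format-agnostic table interface (point 1); not a Paimon API. */
interface TableProvider {
    /** Name of the table format, e.g. "paimon" or "iceberg". */
    String format();

    /** Names of the tables in the given database. */
    List<String> listTables(String database);
}

/** Toy provider backed by a map; a real one would query the format's catalog. */
final class StaticTableProvider implements TableProvider {
    private final String format;
    private final Map<String, List<String>> tablesByDatabase;

    StaticTableProvider(String format, Map<String, List<String>> tablesByDatabase) {
        this.format = format;
        this.tablesByDatabase = tablesByDatabase;
    }

    @Override
    public String format() {
        return format;
    }

    @Override
    public List<String> listTables(String database) {
        return tablesByDatabase.getOrDefault(database, List.of());
    }
}

/** The external system only depends on TableProvider, not on any one format. */
final class TableOverview {
    static List<String> describeAll(List<TableProvider> providers, String database) {
        List<String> out = new ArrayList<>();
        for (TableProvider provider : providers) {
            for (String table : provider.listTables(database)) {
                out.add(provider.format() + ":" + database + "." + table);
            }
        }
        return out;
    }
}

/** Hypothetical customized reporter for commit metrics (point 3). */
interface CommitMetricReporter {
    void report(String table, String metricName, long value);
}

/** Toy reporter that collects metrics in memory instead of sending them out. */
final class CollectingReporter implements CommitMetricReporter {
    final List<String> lines = new ArrayList<>();

    @Override
    public void report(String table, String metricName, long value) {
        lines.add(table + "." + metricName + "=" + value);
    }
}
```

With something like this, the external system shows tables through
`TableProvider` and receives commit information through a reporter, without
a listener in each table format.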

Best,
Shammon FY




On Mon, Aug 21, 2023 at 5:16 PM Jocean shi <[email protected]> wrote:

> Hi Shammon FY,
>
> Thanks for your comments. I’d like to share my thoughts about your
> comments.
>
> 1. Public Interface
> Thank you for the reminder. I overlooked the correspondence between
> the Public Interface of PIP and the "@Public" annotation.
> My idea was that Event, Listener, and ListenerFactory are public,
> while the others are non-public.
>
> 2. Add `Factory` to create `Listener`
> Great suggestion, I have already added the ListenerFactory to PIP.
>
> 3. Flink and Spark support meta-data listeners
> It will be very inconvenient for users to obtain DDL information
> through engines. Firstly, there are many implementations of various
> engines that need to be connected. Secondly, in addition to Flink and
> Spark, many engines do not support meta-data listeners. As a general
> data lake, Paimon should have its own mechanism for meta-data
> listeners.
>
> 4. Report events such as commit/compact to an external system
> CompactEvent: Currently, the compact state is a black box, and users
> cannot obtain the information through SQL or API.
> CommitEvent: Currently, the methods of querying through SQL or API are
> based on polling, which makes it difficult for users to perceive
> commit operations in a timely manner and consumes a lot of resources.
>
> Best
> Shidayang
>
> On Fri, Aug 18, 2023 at 14:07, Shammon FY <[email protected]> wrote:
> >
> > Thanks @Jocean for starting this discussion, I have some comments
> >
> > 1. About the public interfaces in the PIP: we should add @Public to
> > them, such as `Event`, `Listener` and even `CommitEvent` and the other
> > events. But I don't think `Listeners` should be a public interface. All
> > fields in a public interface exposed to users should be `Public` too,
> > but I found that information such as `ManifestEntry` in `CommitEvent`
> > is not a public interface. I think you may need to reconsider which
> > interfaces should be marked with @Public and which should not.
> >
> > 2. In general, it is better to provide a `Factory` to create the
> > `Listener`, and both should be marked as `@Public`. You can see
> > `CatalogFactory`->`Catalog` as an example.
> >
> > 3. Currently Flink and Spark support meta-data listeners and we can
> > support reporting DDL information there. Do we need to add the same
> > listener in Paimon?
> >
> > 4. Do we need to report events such as commit/compact to an external
> > system? Currently we have some system tables, and users can get this
> > information by SQL or API. Should the external system query this
> > information regularly, instead of having a listener push it?
> >
> > Best,
> > Shammon FY
> >
> >
> > On Tue, Aug 15, 2023 at 11:08 AM Jocean shi <[email protected]> wrote:
> >
> > > Hi devs:
> > >
> > > We would like to start a discussion about PIP-8: Introduce listeners
> > > for Paimon[1].
> > >
> > > In production environments, users often need to perceive the state
> > > changes of a Paimon table,
> > > such as whether a new file has been committed to the table, in which
> > > partitions the committed files are,
> > > the size and number of the committed files, the status and type of
> > > compaction, operations like table creation, deletion, and schema
> > > changes, etc.
> > > So, we introduce a Listener system for Paimon.
> > > Looking forward to hearing from you.
> > >
> > > [1] https://cwiki.apache.org/confluence/display/PAIMON/PIP-8%3A+Introduce+listeners+for+Paimon
> > >
> > > Best
> > > shidayang
> > >
>
