Hi Shammon FY,

Thanks for your comments. I'd like to share my thoughts on them.

1. Public Interface
Thank you for the reminder. I overlooked the correspondence between the
Public Interface section of the PIP and the "@Public" annotation. My idea
is that Event, Listener, and ListenerFactory are public, while the others
are non-public.

2. Add `Factory` to create `Listener`
Great suggestion. I have already added ListenerFactory to the PIP.

3. Flink and Spark support meta-data listeners
It would be very inconvenient for users to obtain DDL information through
the engines. First, each engine has its own listener implementation, and
users would have to integrate with every one of them. Second, apart from
Flink and Spark, many engines do not support meta-data listeners. As a
general-purpose data lake, Paimon should have its own meta-data listener
mechanism.

4. Report events such as commit/compact to an external system
CompactEvent: Currently, the compaction state is a black box, and users
cannot obtain this information through SQL or the API.
CommitEvent: Currently, querying through SQL or the API relies on polling,
which makes it hard for users to notice commits promptly and consumes a
lot of resources.

Best,
Shidayang

Shammon FY <[email protected]> wrote on Fri, Aug 18, 2023 at 14:07:

>
> Thanks @Jocean for starting this discussion, I have some comments
>
> 1. About the public interfaces in the PIP, we should add @Public for them
> such as `Event`, `Listener` and even `CommitEvent` and other events. But
> for `Listeners`, I don't think it should be a public interface. All fields
> in the public interface for users should be `Public` too, but I found the
> information such as `ManifestEntry` in `CommitEvent` is not a public
> interface. I think you may need to reconsider which interfaces need to be
> marked with @Public and which are not.
>
> 2. In general, it is better to give a `Factory` to create `Listener` which
> should be all marked as `@Public` and you can see
> `CatalogFactory`->`Catalog` as an example.
>
> 3. Currently Flink and Spark support meta-data listeners and we can support
> reporting ddl information there, should we need to add the same listener in
> Paimon?
>
> 4. Should we need to report the events such as commit/compact to an
> external system? Currently we have some system tables and users can get
> these information by SQL or API, should the external system query these
> information regularly instead of a listener to push them?
>
> Best,
> Shammon FY
>
>
> On Tue, Aug 15, 2023 at 11:08 AM Jocean shi <[email protected]> wrote:
>
> > Hi devs:
> >
> > We would like to start a discussion about PIP-8: Introduce listeners
> > for Paimon[1].
> >
> > In production environments, users often need to perceive the state
> > changes of Paimon table,
> > such as whether a new file has been committed to the table, in which
> > partitions the committed files are,
> > the size and number of the committed files, the status and type of
> > compaction, operations like table creation, deletion, and schema
> > changes, etc.
> > So, we introduce a Listener system for Paimon.
> > Looking forward to hearing from you.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/PAIMON/PIP-8%3A+Introduce+listeners+for+Paimon
> >
> > Best
> > shidayang
> >
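
For readers following the thread, the Event / Listener / ListenerFactory split discussed above, and the push-instead-of-poll argument in point 4, could look roughly like the sketch below. This is only an illustration under stated assumptions: every type name, method, and field here (including `identifier()`, `publish`, `table`, `snapshotId`, and `RecordingListenerFactory`) is hypothetical and is not taken from the PIP-8 document or from Paimon's code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: these are NOT the PIP-8 signatures.

// Would be a public (user-facing) type in this sketch.
interface Event {}

// Assumed example event; the fields are invented for illustration.
final class CommitEvent implements Event {
    final String table;
    final long snapshotId;

    CommitEvent(String table, long snapshotId) {
        this.table = table;
        this.snapshotId = snapshotId;
    }
}

// Would be public: users implement this to receive pushed events.
interface Listener {
    void onEvent(Event event);
}

// Would be public: creates a Listener, mirroring the
// CatalogFactory -> Catalog pattern mentioned in the thread.
interface ListenerFactory {
    String identifier();

    Listener create();
}

// Would stay internal (non-public): fans an event out to all listeners,
// so users are notified on commit instead of polling system tables.
final class Listeners {
    private final List<Listener> listeners = new ArrayList<>();

    Listeners(List<ListenerFactory> factories) {
        for (ListenerFactory factory : factories) {
            listeners.add(factory.create());
        }
    }

    void publish(Event event) {
        for (Listener listener : listeners) {
            listener.onEvent(event);
        }
    }
}

// Hypothetical user-side factory whose listener records every event it sees.
final class RecordingListenerFactory implements ListenerFactory {
    final List<Event> seen = new ArrayList<>();

    @Override
    public String identifier() {
        return "recording";
    }

    @Override
    public Listener create() {
        return seen::add;
    }
}
```

In this sketch a caller would build `Listeners` from a list of factories and call `publish(new CommitEvent(...))` at commit time; a real listener would forward the event to an external system rather than store it in a list.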
