Thanks, Ming! Big +1 for this.

The current Kafka log system is rarely used in production because it is too difficult to use. As you said, it is costly, complex, and prone to data inconsistency. The integration between Kafka and Paimon is not good. But we actually have a lot of demand for this, and I really like your Embedded and External designs, both of which have many use cases.

I have no problem with the overall design. I think this is a conceptual PIP with a huge implementation, so we need to start its development gradually.

Looking forward to the investment from you and your partner.

Best,
Jingsong

On Mon, Nov 27, 2023 at 2:32 PM Ming Li <[email protected]> wrote:
>
> Hi devs,
>
> I would like to start a discussion about PIP-13: Introduce a Lightweight
> LogStore [1].
>
> Currently paimon only provides a LogStore implementation based on kafka,
> but this involves an external service, which brings the following
> limitations:
>
> 1. Increased management and maintenance costs;
> 2. Inconsistency in data management between different storages leads to
> unexpected behaviors such as data cleaning and rollback;
> 3. Inconsistency in data sharding makes it difficult to implement
> dynamic scaling;
> 4. Redundant data storage.
>
> Therefore, we hope to provide a lightweight logStore in paimon that does
> not rely on other services to achieve:
>
> 1. Out-of-the-box logStore storage, which does not rely on other
> services and reduce maintenance and management costs;
> 2. The data life cycle is consistent with paimon's snapshot to reduce
> redundant data;
> 3. Data sharding is consistent with paimon’s partitions and buckets to
> support paimon’s data sharding-related features;
> 4. Support second-level data visibility under at-least-once semantics.
>
> Looking forward to your feedback, thanks!
>
> [1]
> https://cwiki.apache.org/confluence/display/PAIMON/PIP-13%3A+Introduce+a+Lightweight+LogStore
>
>
> Best,
> Ming Li
