Thanks for the proposal. I'm happy to act as a mentor for the project.
I respect the desire to go through the regular incubation process, and
maybe it is a good thing for the Paimon community to revisit some of the
processes and customs and develop their own style, independent of Flink as
part of the incubation process.

I have no doubt regarding the technical or "community building" ability of
the initial team.


On Mon, Feb 27, 2023 at 2:49 PM Becket Qin <becket....@gmail.com> wrote:

> I am really excited to see Paimon become an independent ASF incubation
> project, and I am happy to be a mentor of the project.
>
> Re Dave,
>
> The plan is to let Paimon eventually graduate as a TLP by itself. The
> project bootstrapped as a subproject of Flink because 1) it was designed to
> provide a stream and batch unified storage which matches the vision of
> Flink as a stream and batch unified engine and 2) the project was developed
> by the same team who is working on Flink.
>
> Now that there have been a few releases, we see strong and reasonable use
> cases from the users letting Paimon (flink-table-store) work with engines
> other than Flink, such as Spark / Trino. Continuing to keep Paimon as a
> subproject of Flink might unnecessarily limit the development of the project
> and is somewhat misleading to the users. Given its scope, we believe it
> makes a lot of sense for Paimon to get incubated on its own independent of
> Flink. There has been a thorough discussion[1] and vote[2] about this among
> the Flink PMC.
>
> Cheers,
>
> Jiangjie (Becket) Qin
>
> [1] https://lists.apache.org/thread/2ybxfg3zrzn4l3tnq3w2w3xvkhk0f9jk
> [2] https://lists.apache.org/thread/95wyc51rfmsqc9osc86q7zx3491m7bvt
>
> On Fri, Feb 24, 2023 at 12:10 PM Dave Fisher <wave4d...@comcast.net>
> wrote:
>
>> An interesting proposal. Since Paimon is already part of Apache Flink
>> does the podling intend to graduate as its own Top Level Project? Or, is
>> the plan currently to become a subproject of Flink? I’m just curious. Were
>> there any discussions within the Flink community about incubating Paimon?
>>
>> Best Regards,
>> Dave
>>
>> Sent from my iPhone
>>
>> > On Feb 23, 2023, at 7:58 PM, Yu Li <car...@gmail.com> wrote:
>> >
>> > Correction: the hyperlink for the first reference is incorrect; please use
>> > the website address directly instead of clicking it (sorry for my mistake).
>> >
>> > For easier reference: https://github.com/apache/flink-table-store
>> >
>> > Best Regards,
>> > Yu
>> >
>> >
>> >> On Fri, 24 Feb 2023 at 11:48, Yu Li <car...@gmail.com> wrote:
>> >>
>> >> Hi All,
>> >>
>> >>
>> >> I would like to propose Paimon [1] as a new Apache Incubator project,
>> >> and you can find more details in the Paimon proposal [2].
>> >>
>> >>
>> >> Paimon is a unified lake storage to build dynamic tables for both stream
>> >> and batch processing with big data compute engines (Apache Flink, Apache
>> >> Spark, Apache Hive, Trino, etc.), supporting high-speed data ingestion and
>> >> real-time data query. With the adoption of stream processing in
>> >> production, there is an increasing demand for storage to simultaneously
>> >> support updates, deletes and streaming reads, which cannot be fully
>> >> satisfied by existing lake storages. To tackle these new challenges,
>> >> Paimon natively adopts LSM (Log-Structured Merge-tree) as its underlying
>> >> data structure, and provides enhanced performance for data with primary
>> >> keys (besides the common lake storage capabilities). What's more, Paimon
>> >> supports both batch and stream operations (reads and writes),
>> >> facilitating applications pursuing batch-stream-unified semantics.
>> >> Specifically:
>> >>
>> >>
>> >> 1. Paimon provides excellent performance on intensive update / delete
>> >> workloads, leveraging the append-write feature of the LSM data
>> >> structure.
>> >>
>> >> 2. Paimon utilizes the ordered feature of LSM to support effective filter
>> >> pushdown, and can reduce the latency of queries with primary key
>> >> filtering to milliseconds.
>> >>
>> >> 3. Paimon supports various (row-based or row-columnar) file formats
>> >> including Apache Avro, Apache ORC and Apache Parquet (rows will be sorted
>> >> by the primary key before writing out).
>> >>
>> >> 4. Tables provided by Paimon can be queried by various engines, including
>> >> Apache Flink, Apache Spark, Apache Hive, Trino, etc.
>> >>
>> >> 5. Paimon's metadata is self-managed, stored on the distributed file
>> >> system and can be synchronized to Hive metastore (HMS).
>> >>
>> >> 6. Besides the common batch read and write support, Paimon also supports
>> >> streaming read and change data feed.
>> >>
>> >>
>> >>
>> >> Paimon has been used by various users and companies, including Alibaba,
>> >> Bilibili, ByteDance and so on. Paimon is also integrated into Alibaba
>> >> Cloud's E-MapReduce and Realtime Compute products to provide cloud
>> >> services.
>> >>
>> >>
>> >> Paimon was founded in the Flink community in 2022 under the name
>> >> "Flink Table Store". It has been developed for more than one year and
>> >> produced 4 formal releases. As its adoption expands to more computing
>> >> engines, some of the ecosystem's users have expressed concerns about the
>> >> neutrality of the project. This has led us to rethink the positioning of
>> >> Flink Table Store, which can be an independent lake storage.
>> >>
>> >>
>> >> After adequate discussion, we have received support from the Flink
>> >> community to enter Apache incubation [3] [4], with the following
>> >> expectations:
>> >>
>> >> 1. Expand Paimon's ecosystem, providing independent Java APIs to support
>> >> reading and writing from more big data engines such as Apache Doris,
>> >> Apache Hive, Presto, Apache Spark, Trino, etc.
>> >>
>> >> 2. Supplement key capabilities, especially streaming reads and intensive
>> >> updates/deletes, for creating a unified and easy-to-use streaming data
>> >> warehouse (lakehouse).
>> >>
>> >> 3. Grow into a more vibrant and neutral open source community.
>> >>
>> >>
>> >> And we believe the Paimon project will provide tremendous value for the
>> >> community if it is introduced into the Apache incubator.
>> >>
>> >>
>> >> I will help this project as the champion and mentor the project together
>> >> with three other mentors (many thanks):
>> >>
>> >>
>> >> * Becket Qin (j...@apache.org)
>> >>
>> >> * Robert Metzger (rmetz...@apache.org)
>> >>
>> >> * Stephan Ewen (se...@apache.org)
>> >>
>> >>
>> >> Looking forward to your feedback. Thanks.
>> >>
>> >>
>> >> Best Regards,
>> >> Yu
>> >>
>> >> [1] https://github.com/apache/flink-table-store
>> >> <https://github.com/alibaba/RemoteShuffleService>
>> >>
>> >> [2] https://cwiki.apache.org/confluence/display/INCUBATOR/PaimonProposal
>> >>
>> >> [3] https://lists.apache.org/thread/2ybxfg3zrzn4l3tnq3w2w3xvkhk0f9jk
>> >>
>> >> [4] https://lists.apache.org/thread/kn7c08cr4l0ynt551yfjqvzh5ns226r6
>> >>
>> >>
>> >>
>>
>>
