Big +1 for the 2.0 roadmap! Expanding Paimon to support multimodal data and vector search is a huge step forward. It aligns perfectly with the current trend of converging Data Lakes and AI infrastructure.
I am particularly excited about the "Unified Storage" and "Efficient Search" plans, as these will significantly lower the barrier for AI engineers to adopt Paimon. Looking forward to the detailed design documents (especially for the Vector Index). Happy to help review or contribute to these features.

Best regards,
Jerry

On Wed, Dec 17, 2025 at 9:04 AM Yunfeng Zhou <[email protected]> wrote:

> Thanks for the proposal. It would definitely boost Paimon’s development in
> the next era for AI.
>
> Best,
> Yunfeng
>
> > On Dec 16, 2025, at 16:45, Jingsong Li <[email protected]> wrote:
> >
> > Hi everyone,
> >
> > Apache Paimon, as a high-performance data lake format, has achieved
> > significant results in the field of structured data processing. With
> > the deepening of AI application scenarios, the demand for fusion
> > analysis of multimodal data (such as text, images, audio, video, etc.)
> > is increasing day by day. We have invested a lot of development work
> > in cross-modal retrieval, unified storage, and efficient analysis.
> >
> > We plan to launch version 2.0 development, focusing on breaking
> > through multimodal data support and creating a more powerful data lake
> > solution.
> >
> > Core objectives:
> > - Unified Storage for structured, multimodal, and vector data.
> > - Efficient Search for data and vectors.
> > - Python ecosystem.
> >
> > Core features:
> > 1. Data Evolution mode.
> > 2. Blob Store.
> > 3. Vector Store.
> > 4. Python SDK & Ray support.
> > 5. Global Index framework.
> > 6. Global Vector Index.
> > 7. Global Inverted Index.
> >
> > What do you think?
> >
> > Best,
> > Jingsong
