Hello Beam Dev Community,

This sink I/O connector introduces support for writing vector embeddings and associated metadata from both streaming and batch pipelines into Milvus collections. It aims to integrate Milvus’s vector database capabilities into Beam workflows, supporting a wide range of machine learning and similarity search use cases.

It builds upon the current Beam Milvus enrichment handler: https://github.com/apache/beam/pull/35216

Here is the link to the design document: https://docs.google.com/document/d/1agpFq9dy8_7ptMxTET0X7AmGIbDeY0_hGUq-5GNVDqs/edit?usp=sharing

This implementation is part of the GSoC 2025 ML Integration project being tracked here: https://github.com/apache/beam/issues/35046

I welcome any feedback, suggestions, or questions about the design approach.

Thank you,
Mohamed