polyzos commented on code in PR #1214:
URL: https://github.com/apache/fluss/pull/1214#discussion_r2191640920
##########
website/src/pages/roadmap.md:
##########
@@ -17,56 +17,39 @@
 -->
 # Fluss Roadmap
-
-This roadmap means to provide users and contributors with a high-level summary of ongoing efforts in the Fluss community.
-The roadmap contains both efforts working in process as well as completed efforts, so that users may get a better impression of the overall status and direction of those developments.
-
+This roadmap means to provide users and contributors with a high-level summary of ongoing efforts in the Fluss community. The roadmap contains both efforts working in process as well as completed efforts, so that users may get a better impression of the overall status and direction of those developments.
 ## Kafka Protocol Compatibility
-
 Fluss will support the Kafka network protocol to enable users to use Fluss as a drop-in replacement for Kafka. This will allow users to leverage Fluss's real-time storage capabilities while maintaining compatibility with existing Kafka applications.
-
 ## Flink Integration
-
-Fluss will provide deep integration with Apache Flink, enabling users a single engine experience for building real-time analytics applications.
-The integration will include:
-- Support for Flink **DataStream API** to read/write data from/to Fluss.
-- Support new [Delta Join](https://cwiki.apache.org/confluence/display/FLINK/FLIP-486%3A+Introduce+A+New+DeltaJoin) to address the pain-points of Stream-Stream Join.
+Fluss will provide deep integration with Apache Flink, enabling users a single engine experience for building real-time analytics applications. The integration will include:
+- Upgrade Flink version to 2.x
+- Support for Flink DataStream API to read/write data from/to Fluss.

Review Comment:
   We can remove this, as it's completed.

##########
website/src/pages/roadmap.md:
##########
@@ -17,56 +17,39 @@
 -->
 # Fluss Roadmap
-
-This roadmap means to provide users and contributors with a high-level summary of ongoing efforts in the Fluss community.
-The roadmap contains both efforts working in process as well as completed efforts, so that users may get a better impression of the overall status and direction of those developments.
-
+This roadmap means to provide users and contributors with a high-level summary of ongoing efforts in the Fluss community. The roadmap contains both efforts working in process as well as completed efforts, so that users may get a better impression of the overall status and direction of those developments.
 ## Kafka Protocol Compatibility
-
 Fluss will support the Kafka network protocol to enable users to use Fluss as a drop-in replacement for Kafka. This will allow users to leverage Fluss's real-time storage capabilities while maintaining compatibility with existing Kafka applications.
-
 ## Flink Integration
-
-Fluss will provide deep integration with Apache Flink, enabling users a single engine experience for building real-time analytics applications.
-The integration will include:
-- Support for Flink **DataStream API** to read/write data from/to Fluss.
-- Support new [Delta Join](https://cwiki.apache.org/confluence/display/FLINK/FLIP-486%3A+Introduce+A+New+DeltaJoin) to address the pain-points of Stream-Stream Join.
+Fluss will provide deep integration with Apache Flink, enabling users a single engine experience for building real-time analytics applications. The integration will include:
+- Upgrade Flink version to 2.x
+- Support for Flink DataStream API to read/write data from/to Fluss.
+- Support new Delta Join to address the pain-points of Stream-Stream Join.
 - More pushdown optimizations: Filter Pushdown ([#197](https://github.com/alibaba/fluss/issues/197)), Partition Pruning ([#196](https://github.com/alibaba/fluss/issues/196)), Aggregation Pushdown, etc.
Review Comment:
   Partition pruning can also be removed, as it is completed.

##########
website/src/pages/roadmap.md:
##########
@@ -17,56 +17,39 @@
 -->
 # Fluss Roadmap
-
-This roadmap means to provide users and contributors with a high-level summary of ongoing efforts in the Fluss community.
-The roadmap contains both efforts working in process as well as completed efforts, so that users may get a better impression of the overall status and direction of those developments.
-
+This roadmap means to provide users and contributors with a high-level summary of ongoing efforts in the Fluss community. The roadmap contains both efforts working in process as well as completed efforts, so that users may get a better impression of the overall status and direction of those developments.
 ## Kafka Protocol Compatibility
-
 Fluss will support the Kafka network protocol to enable users to use Fluss as a drop-in replacement for Kafka. This will allow users to leverage Fluss's real-time storage capabilities while maintaining compatibility with existing Kafka applications.
-
 ## Flink Integration
-
-Fluss will provide deep integration with Apache Flink, enabling users a single engine experience for building real-time analytics applications.
-The integration will include:
-- Support for Flink **DataStream API** to read/write data from/to Fluss.
-- Support new [Delta Join](https://cwiki.apache.org/confluence/display/FLINK/FLIP-486%3A+Introduce+A+New+DeltaJoin) to address the pain-points of Stream-Stream Join.
+Fluss will provide deep integration with Apache Flink, enabling users a single engine experience for building real-time analytics applications. The integration will include:
+- Upgrade Flink version to 2.x
+- Support for Flink DataStream API to read/write data from/to Fluss.
+- Support new Delta Join to address the pain-points of Stream-Stream Join.
 - More pushdown optimizations: Filter Pushdown ([#197](https://github.com/alibaba/fluss/issues/197)), Partition Pruning ([#196](https://github.com/alibaba/fluss/issues/196)), Aggregation Pushdown, etc.
 - Upgrade the Rule-Based Optimization into Cost-Based Optimization in Flink SQL streaming planner with leveraging statistics in Fluss tables.
-
-
 ## Streaming Lakehouse
-
-- Support for Iceberg ([#102](https://github.com/alibaba/fluss/issues/102)) as Lakehouse Storage. And DeltaLake, Hudi as well.
+- Support for Iceberg ([#452](https://github.com/alibaba/fluss/issues/452)) as Lakehouse Storage. And DeltaLake, Hudi as well.
 - Support Union Read for Spark, Trino, StarRocks.
-- Avoid data shuffle in compaction service to directly compact Arrow files of Fluss into Parquet files of data lakes ([#107](https://github.com/alibaba/fluss/issues/107)).
-
-## ZooKeeper Removal
-
-Fluss currently utilizes **ZooKeeper** for cluster coordination, metadata storage, and cluster configuration management.
-In upcoming releases, **ZooKeeper will be replaced** by **KvStore** for metadata storage and **Raft** for cluster coordination and ensuring consistency.
-This transition aims to streamline operations and enhance system reliability.
-
+- Support for Lance ([#1155](https://github.com/alibaba/fluss/issues/1155)) as Lakehouse Storage to enable integration with AI/ML workflows for multi-modal data processing.
+## Spark Integration
+- Support for Spark connector ([#155](https://github.com/alibaba/fluss/issues/155)) to enable seamless data processing and analytics workflows.
+## Python Client
+- Support Python SDK to connect with Python ecosystems, including PyArrow, Pandas, Lance, and DuckDB.
 ## Storage Engine
-
 - Support for complex data types: Array ([#168](https://github.com/alibaba/fluss/issues/168)), Map ([#169](https://github.com/alibaba/fluss/issues/169)), Struct ([#170](https://github.com/alibaba/fluss/issues/170)), Variant/JSON.
 - Support for schema evolution.
-- Support for secondary index for Delta Join with Flink (~~[#65](https://github.com/alibaba/fluss/issues/65)~~).
-- Support for buckets rescale.
-
-## Zero Disks
+- Support for secondary index for Delta Join with Flink.

Review Comment:
   I think this is also completed [here](https://github.com/apache/fluss/pull/222).
