Hi Celeborn community,
I'd like to start a discussion about reducing the number of Spark and Flink versions we support.

# Background

Celeborn currently supports a very wide build matrix:

- Spark: 2.4, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 4.0, 4.1 (9 profiles; Spark 2.4 also forces us to keep a Scala 2.11 build and a separate client-spark/spark-2 module)
- Flink: 1.16, 1.17, 1.18, 1.19, 1.20, 2.0, 2.1, 2.2 (8 profiles, each with its own *-shaded module)

This breadth has a real cost:

- CI time and flakiness grow with every profile we build and test.
- The release process has to cross-build and stage artifacts for the full matrix, which slows down every release.
- Many of these engines are already end-of-life upstream and receive no further releases, so we are maintaining and shipping clients for versions their own communities no longer support.

# Proposal

Keep the most recent, actively-maintained versions and deprecate + remove the oldest ones. As a concrete starting point:

Spark

- Deprecate now, remove in <next major/minor>: 2.4, 3.0, 3.1
- Keep: 3.2, 3.3, 3.4, 3.5, 4.0, 4.1
- Dropping 2.4 lets us also drop the Scala 2.11 build and the dedicated spark-2 module.

Flink

- Deprecate now, remove in <next major/minor>: 1.16, 1.17
- Keep: 1.18, 1.19, 1.20, 2.0, 2.1, 2.2

# Suggested process

1. Mark the above versions as deprecated in the next release: a note in the docs/release notes, and optionally a startup log warning.
2. Remove the corresponding Maven profiles and modules in the release after that.
3. Keep the existing release branches as-is, so users who still run these engines can continue to use the last Celeborn release that supports them.

# Open questions

- Is the proposed cut line right, or should we be more / less aggressive (e.g. also drop Spark 3.2 / Flink 1.18)?
- Are there users still relying on any of the versions proposed for removal? Please speak up here so we can account for that.
- Which target release should carry the deprecation vs. the removal?

Looking forward to your feedback.

Thanks,
Nicholas Jiang
