weiting-chen opened a new issue, #11827: URL: https://github.com/apache/gluten/issues/11827
### Description

2025 is almost gone. Time to define 2026 features. Feel free to add more or comment.
For more information about the achievements in 2025 Roadmap, please check https://github.com/apache/gluten/issues/8226
For more information about the achievements in 2024 Roadmap, please check https://github.com/apache/gluten/issues/4709

Apache Gluten graduated to an ASF Top-Level Project in February 2026. This roadmap tracks the community's goals for 2026.

---

### Spark Compatibility

- [ ] Spark 4.0/4.1 GA support: fix all disabled test suites (https://github.com/apache/gluten/issues/11550, https://github.com/apache/gluten/issues/11400)
- [ ] TIMESTAMP_NTZ type support (https://github.com/apache/gluten/issues/11622, https://github.com/apache/gluten/pull/11626)
- [ ] Variant shredding support for Parquet reader and writer (Spark 4.0) (https://github.com/apache/gluten/issues/11371)
- [ ] Parquet type widening support (SPARK-40876) (https://github.com/apache/gluten/issues/11683, https://github.com/apache/gluten/pull/11719)
- [ ] ANSI mode support (https://github.com/apache/gluten/issues/10134)
- [ ] Complete remaining unsupported Spark functions (https://github.com/apache/gluten/issues/4039)
- [ ] Deprecate Spark 3.2/3.3 and JDK 8

### Performance Optimization

- [ ] Execution-aware dynamic join strategy selection after filter execution (https://github.com/apache/gluten/issues/11808)
- [ ] Use runtime stats to choose hash build side (https://github.com/apache/gluten/issues/11774, https://github.com/apache/gluten/pull/11775)
- [ ] Bloom filter optimization: translate might_contain as subfield filter (https://github.com/apache/gluten/issues/11771, https://github.com/apache/gluten/issues/11708, https://github.com/apache/gluten/pull/11711)
- [ ] Parquet metadata check limit optimization (https://github.com/apache/gluten/issues/11782)
- [ ] Partial project UDF optimization (https://github.com/apache/gluten/issues/11783)
- [ ] Push dynamic filters to shuffle reader with per-block column statistics (https://github.com/apache/gluten/issues/11605, https://github.com/apache/gluten/pull/11769)
- [ ] Multi-core per task (https://github.com/apache/gluten/issues/7810)
- [ ] Spill enhancement: streaming window functions (https://github.com/apache/gluten/issues/3030)
- [ ] Pick split with most data prefetched (https://github.com/apache/gluten/issues/11821)
- [ ] Complex type Row-to-Columnar optimization

### Native Engine Integration

- [ ] Switch to upstream Velox official release (https://github.com/apache/gluten/issues/8782)
- [ ] Upstream useful Velox PRs not merged from Gluten community (https://github.com/apache/gluten/issues/11585)
- [ ] Bolt backend integration: ByteDance native engine with LLVM JIT (https://github.com/apache/gluten/pull/11261, https://github.com/apache/gluten/discussions/10929)
- [ ] ClickHouse backend upgrade (https://github.com/apache/gluten/pull/11734)
- [ ] Kafka read support for Velox backend (https://github.com/apache/gluten/pull/11801)

### Data Lake & File Formats

- [ ] Full Iceberg support: map write configs with Velox (https://github.com/apache/gluten/issues/11703, https://github.com/apache/gluten/pull/11776)
- [ ] Native Parquet write for complex types (Struct/Array/Map) (https://github.com/apache/gluten/pull/11788)
- [ ] Iceberg equality delete MOR table support (https://github.com/apache/gluten/pull/8056)
- [ ] Hudi MOR table support
- [ ] Delta Lake feature parity
- [ ] JSON file format support
- [ ] ORC writer support

### GPU & Hardware Acceleration

- [ ] GPU BHJ bug fix (https://github.com/apache/gluten/issues/11794)
- [ ] Multi-threaded decompression in GPU shuffle reader (https://github.com/apache/gluten/issues/11779, https://github.com/apache/gluten/pull/11780)
- [ ] GPU code cleanup and stabilization (https://github.com/apache/gluten/pull/11824)
- [ ] ARM SVE optimization
- [ ] FPGA accelerator exploration

### Flink Integration

- [ ] Fix Flink memory leak with RocksDB state backend (https://github.com/apache/gluten/issues/11791)
- [ ] Fix Flink CI build failures (https://github.com/apache/gluten/issues/11793)
- [ ] Stabilize Flink + Velox from experimental to beta
- [ ] Nexmark benchmark support (https://github.com/apache/gluten/issues/11790)

### PySpark & Python Ecosystem

- [ ] PySpark Python UDF support
- [ ] Arrow UDF support
- [ ] Fix Python UDF/UDTF test suites on Spark 4.x (https://github.com/apache/gluten/issues/11550)

### Stability & Quality

- [ ] OOM prevention and memory stability (https://github.com/apache/gluten/issues/11747, https://github.com/apache/gluten/issues/8025)
- [ ] Full fuzzer support and result mismatch resolution (https://github.com/apache/gluten/issues/4652)
- [ ] Timezone edge case fixes (https://github.com/apache/gluten/issues/11597)
- [ ] Complex type validation in native engine (https://github.com/apache/gluten/issues/11746, https://github.com/apache/gluten/issues/11678)
- [ ] Support getting C++ stack traces via GDB from Spark UI (https://github.com/apache/gluten/issues/11677)
- [ ] collect_set ignoreNulls support (https://github.com/apache/gluten/issues/11826)

### Build, CI & Developer Experience

- [ ] VCPKG for macOS build (https://github.com/apache/gluten/pull/11563)
- [ ] IWYU tool for C++ code format checking (https://github.com/apache/gluten/pull/11287)
- [ ] Cache Maven dependencies in CI (https://github.com/apache/gluten/pull/11655)
- [ ] Fix ARM64/aarch64 build issues (https://github.com/apache/gluten/issues/11633, https://github.com/apache/gluten/issues/11639)
- [ ] Fix glog macOS build breakage (https://github.com/apache/gluten/issues/11763)
- [ ] Docker testing improvements (https://github.com/apache/gluten/issues/11501)
- [ ] Remove Arrow-CSV dependency (https://github.com/apache/gluten/issues/11591)

### Community & Governance

- [ ] Complete TLP graduation remaining tasks (https://github.com/apache/gluten/issues/11713)
- [ ] Quarterly releases: 1.7 / 1.8 / 1.9 / 2.0
- [ ] GlutenCon 2026
- [ ] Trino integration exploration (community interest from https://github.com/apache/gluten/issues/8226)
- [ ] Expand ecosystem documentation (Kyuubi, Celeborn, Velox)
- [ ] openEuler OS support

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
