baibaichen opened a new issue, #11400: URL: https://github.com/apache/incubator-gluten/issues/11400
### Description

After xx, the general framework for the unit tests has been merged. The following unit tests are still failing:

| Description | Owner | Category | Cause | Affected Files |
|-------------|-------|----------|-------|----------------|
| [Improve `extractShuffleIds` to find `AdaptiveSparkPlanExec` anywhere in plan tree](https://github.com/apache/spark/pull/53620) 4.1.1 | ✔️ | SQL | [#52157](https://github.com/apache/spark/pull/52157) | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| Fix `SPARK-47939: Explain should work with parameterized queries` | ✔️ | TEST | https://github.com/apache/incubator-gluten/pull/11252 | `gluten-ut/spark41/.../VeloxTestSettings.scala` |
| Support Checksum in Column ShuffleWriters | 🚫 | CORE | [#50230](https://github.com/apache/spark/pull/50230) | `gluten-ut/spark41/.../velox/VeloxTestSettings.scala`<br>`gluten-ut/spark41/.../GlutenMapStatusEndToEndSuite.scala`<br>**Excluded tests:**<br>- GlutenMapStatusEndToEndSuite (entire suite) |
| Support `spark.sql.unionOutputPartitioning=true` | 🚫 | SQL | [#51623](https://github.com/apache/spark/pull/51623) | `.github/workflows/velox_backend_x86.yml`, `gluten-ut/spark41/.../VeloxTestSettings.scala`, `tools/gluten-it/common/.../Suite.scala`<br>**Excluded tests:**<br>- GlutenBroadcastExchangeSuite.SPARK-52962<br>- GlutenDataFrameSetOperationsSuite.SPARK-52921* |
| Fix a Spark Parquet read bug where missing struct fields caused the entire struct to be read as NULL | ❓️ | PARQUET | [SPARK-53535](https://issues.apache.org/jira/browse/SPARK-53535) | `gluten-ut/spark41/.../VeloxTestSettings.scala`<br>**Excluded tests:**<br>- SPARK-53535*<br>- vectorized reader: missing all struct fields* |
| Infer Variant shredding schema when writing to Parquet | ❓️ | PARQUET | [#52406](https://github.com/apache/spark/pull/52406) | `gluten-ut/spark41/.../velox/VeloxTestSettings.scala`<br>**Excluded tests:**<br>- "infer shredding with mixed scale" in GlutenFileBasedDataSourceSuite |
| NullType/VOID/UNKNOWN Type Support in Parquet | ❓️ | PARQUET | [SPARK-54220](https://issues.apache.org/jira/browse/SPARK-54220) | `gluten-ut/spark41/.../VeloxTestSettings.scala`<br>**Excluded tests:**<br>- SPARK-54220* |
| Update CI Python to 3.10 | ❓️ | PYTHON | [#51259](https://github.com/apache/spark/pull/51259) | `backends-velox/.../python/ArrowEvalPythonExecSuite.scala` |
| Align with Spark `split` | ❓️ | SQL | [#48470](https://github.com/apache/spark/pull/48470) | `gluten-ut/spark41/.../VeloxTestSettings.scala`, `backends-velox/.../VeloxStringFunctionsSuite.scala`<br>**Excluded tests:**<br>- GlutenRegexpExpressionsSuite.SPLIT<br>- VeloxStringFunctionsSuite: split test |
| Fix additional Spark 4.1 `KeyGroupedPartitioningSuite` tests | ❓️ | SQL | [#53132](https://github.com/apache/spark/pull/53132), [#53142](https://github.com/apache/spark/pull/53142) | `gluten-ut/spark41/.../VeloxTestSettings.scala`<br>**Excluded tests:**<br>- SPARK-53322*<br>- SPARK-54439* |
| Fix failing SQL tests on Spark 4.1 | ❓️ | SQL | N/A | `gluten-ut/spark41/.../velox/VeloxSQLQueryTestSettings.scala`<br>**Excluded tests:**<br>- cast.sql<br>- describe.sql<br>- nonansi/cast.sql<br>- nonansi/st-functions.sql<br>- scripting/randomly_generated_scripts.sql<br>- st-functions.sql<br>- type-coercion-edge-cases.sql<br>- variant-field-extractions.sql |
| Support memory based thresholds for shuffle spill | ❌️ | SQL | [#47856](https://github.com/apache/spark/pull/47856) | `gluten-ut/spark41/.../VeloxTestSettings.scala`<br>**Excluded tests:**<br>- SPARK-49386: Window spill with more than the inMemoryThreshold and spillSizeThreshold<br>- SPARK-49386: test SortMergeJoin (with spill by size threshold) |
| Fix additional Spark 4.1 Structured Streaming tests | ❌️ | SS | [#52645](https://github.com/apache/spark/pull/52645) | `gluten-ut/spark41/.../VeloxTestSettings.scala`<br>**Excluded tests:**<br>- SPARK-53942: changing the number of stateless shuffle partitions via config<br>- SPARK-53942: stateful shuffle partitions are retained from old checkpoint |
| Fix additional Spark 4.1 Structured Streaming tests | ❌️ | SS | [#52473](https://github.com/apache/spark/pull/52473)<br>[#52870](https://github.com/apache/spark/pull/52870)<br>[#52891](https://github.com/apache/spark/pull/52891) | `gluten-ut/spark41/.../velox/VeloxTestSettings.scala`<br>**Excluded tests:**<br>- GlutenStreamRealTimeModeAllowlistSuite: "rtm operator allowlist", "repartition not allowed", "stateful queries not allowed"<br>- GlutenStreamRealTimeModeE2ESuite: "foreach", "to_json and from_json round-trip", "generateExec passthrough"<br>- GlutenStreamRealTimeModeSuite: "processAllAvailable" |

### Gluten version

None

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
