brijrajk commented on PR #12151: URL: https://github.com/apache/gluten/pull/12151#issuecomment-4812584645
### Latest push: fix TPC-DS plan stability failures

The previous push (`51185c76f`) introduced a DPP guard that broke 18 TPC-DS `check simplified sf100` tests (`tpcds-v1.4/q2`, `q10`, `q16`, etc.). Root cause and fix:

**Root cause:** The guard `v: Attribute` prevented DPP/runtime-filter bloom filters from being rewritten to `VeloxBloomFilterMightContain`. When Velox tries to validate `FilterExecTransformer` with a vanilla `BloomFilterMightContain(ScalarSubquery, xxhash64(...))`, it cannot find a substrait mapping → validation fails → the filter (and the bloom filter aggregate subquery) falls back to vanilla Spark. This changes the plan structure: `FilterExecTransformer` + `RegularHashAggregateExecTransformer` becomes `Filter` + `ObjectHashAggregate`, which doesn't match the golden files.

**Fix (`7b59cc4a4`):** 3-case pattern match in `BloomFilterMightContainJointRewriteRule`:

1. **User-facing** (`v: Attribute`): rewrite **both** outer `MightContain` AND inner `Aggregate` to Velox format → bytes format consistent across stages
2. **DPP/runtime-filter** (`v` is `xxhash64(...)`, bf is `ScalarSubquery`): rewrite **only** outer `MightContain` to `VeloxBloomFilterMightContain`; leave inner aggregate as vanilla `bloom_filter_agg(xxhash64(...))` since Velox handles it natively → `FilterExecTransformer` validates, simplified plan shows `bloom_filter_agg(xxhash64(...))` matching golden files
3. **Pre-computed literal bytes** (bf is not a `ScalarSubquery`): rewrite only outer `MightContain`

This should make both the TPC-DS plan stability tests and the bloom filter test suites pass.
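
For readers who want the shape of the 3-case match without opening the diff, here is a minimal, self-contained sketch. It does not use the real Spark Catalyst or Gluten classes: every type below (`Expr`, `Attribute`, `XxHash64`, `ScalarSubquery`, `BloomFilterAgg`, `VeloxBloomFilterAgg`, `MightContain`, `VeloxMightContain`) is a hypothetical stand-in named after the concepts in this comment, and the case ordering is an assumption rather than a copy of the code in `7b59cc4a4`.

```scala
// Illustrative model only: these are NOT the Spark/Gluten expression classes.
object BloomFilterRewriteSketch {
  sealed trait Expr
  final case class Attribute(name: String) extends Expr
  final case class XxHash64(children: Seq[Expr]) extends Expr
  final case class Literal(bytes: Vector[Byte]) extends Expr
  final case class ScalarSubquery(agg: Agg) extends Expr
  final case class MightContain(bf: Expr, value: Expr) extends Expr
  final case class VeloxMightContain(bf: Expr, value: Expr) extends Expr

  // The aggregate that builds the bloom filter inside the subquery.
  sealed trait Agg
  final case class BloomFilterAgg(child: Expr) extends Agg      // vanilla format
  final case class VeloxBloomFilterAgg(child: Expr) extends Agg // Velox format

  def rewrite(e: Expr): Expr = e match {
    // Case 1, user-facing: rewrite the outer MightContain AND the inner
    // aggregate so both stages agree on the bytes format.
    case MightContain(ScalarSubquery(BloomFilterAgg(c)), v: Attribute) =>
      VeloxMightContain(ScalarSubquery(VeloxBloomFilterAgg(c)), v)

    // Case 2, DPP/runtime filter: rewrite only the outer MightContain and
    // leave the inner bloom_filter_agg(xxhash64(...)) untouched.
    case MightContain(sub: ScalarSubquery, v: XxHash64) =>
      VeloxMightContain(sub, v)

    // Case 3, pre-computed bytes: bf is not a ScalarSubquery, so there is
    // no inner aggregate to rewrite; only the outer node changes.
    case MightContain(bf, v) if !bf.isInstanceOf[ScalarSubquery] =>
      VeloxMightContain(bf, v)

    // Anything else is left as-is.
    case other => other
  }
}
```

In this model, the regression described under "Root cause" corresponds to having only the first case: a `MightContain` whose value is `XxHash64(...)` would fall through to `case other` and stay vanilla, which is the shape Velox validation rejected.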
