CodiumAI-Agent commented on PR #9729: URL: https://github.com/apache/incubator-gluten/pull/9729#issuecomment-2900444756
## PR Reviewer Guide ๐ Here are some key observations to aid the review process: <table> <tr><td> **๐ซ Ticket compliance analysis โ ** **[9728](https://github.com/apache/incubator-gluten/issues/9728) - Fully compliant** Compliant requirements: * Fix inconsistent right join keys when using `JoinAggregateToAggregateUnion`. * Ensure `id2` values are `null` when there is no matching record. * Add unit tests to verify correct null assignment of right join keys. </td></tr> <tr><td>โฑ๏ธ <strong>Estimated effort to review</strong>: 4 ๐ต๐ต๐ต๐ตโช</td></tr> <tr><td>๐งช <strong>PR contains tests</strong></td></tr> <tr><td>๐ <strong>No security concerns identified</strong></td></tr> <tr><td>โก <strong>Recommended focus areas for review</strong><br><br> <details><summary><a href='https://github.com/apache/incubator-gluten/pull/9729/files#diff-e6e40be5445408a6558af17cccae0bab087695234f075b90bc3c91f288a54298R813-R822'><strong>Indexing Logic</strong></a> The new slicing and indexing of grouping keys, aggregate expressions, and flag expressions in `buildMakeNotMatchedRowsNullProject` is complex and prone to off-by-one errors. Verify that the start and end indices correctly align for all cases. </summary> ```scala val keysNumber = analyzedAggregates.head.getGroupingKeys().length val keysStartIndex = keysNumber val aggregateExpressionsStatIndex = keysStartIndex + analyzedAggregates.length * keysNumber val flagExpressionsStartIndex = input.length - analyzedAggregates.length val dupPrimeKeys = input.slice(0, keysStartIndex) val keys = input.slice(keysStartIndex, aggregateExpressionsStatIndex) val aggregateExpressions = input.slice(aggregateExpressionsStatIndex, flagExpressionsStartIndex) val flagExpressions = input.slice(flagExpressionsStartIndex, input.length) ``` </details> <details><summary><a href='https://github.com/apache/incubator-gluten/pull/9729/files#diff-e6e40be5445408a6558af17cccae0bab087695234f075b90bc3c91f288a54298R909-R920'><strong>Performance Impact</strong></a> The added duplication of join keys (`buildJoinRightKeysProject`) and extra projection steps could degrade query performance. Consider assessing optimization impacts or combining projections. </summary> ```scala def buildJoinRightKeysProject( plan: LogicalPlan, analyzedAggregates: Seq[JoinedAggregateAnalyzer]): LogicalPlan = { val input = plan.output val keysNum = analyzedAggregates.head.getGroupingKeys.length // Make a duplication of the prime aggregate keys, and put them in the front val projectList = input.slice(0, keysNum).map { case key => RuleExpressionHelper.makeNamedExpression(key, "_dup_prime_" + key.name) } Project(projectList ++ input, plan) } ``` </details> </td></tr> </table> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
