Re: [PR] [GLUTEN-9728][CH] Fix: Inconsistent result from `JoinAggregateToAggregateUnion` [incubator-gluten]

via GitHub Thu, 22 May 2025 02:02:33 -0700


CodiumAI-Agent commented on PR #9729:
URL: 
https://github.com/apache/incubator-gluten/pull/9729#issuecomment-2900444756


   ## PR Reviewer Guide 🔍
   
   Here are some key observations to aid the review process:
   
   <table>
   <tr><td>
   
   **🎫 Ticket compliance analysis ✅**
   
   
   
   **[9728](https://github.com/apache/incubator-gluten/issues/9728) - Fully 
compliant**
   
   Compliant requirements:
   
   * Fix inconsistent right join keys when using 
`JoinAggregateToAggregateUnion`.
   * Ensure `id2` values are `null` when there is no matching record.
   * Add unit tests to verify correct null assignment of right join keys.
   
   
   
   </td></tr>
   <tr><td>⏱️&nbsp;<strong>Estimated effort to review</strong>: 4 
🔵🔵🔵🔵⚪</td></tr>
   <tr><td>🧪&nbsp;<strong>PR contains tests</strong></td></tr>
   <tr><td>🔒&nbsp;<strong>No security concerns identified</strong></td></tr>
   <tr><td>⚡&nbsp;<strong>Recommended focus areas for review</strong><br><br>
   
   <details><summary><a 
href='https://github.com/apache/incubator-gluten/pull/9729/files#diff-e6e40be5445408a6558af17cccae0bab087695234f075b90bc3c91f288a54298R813-R822'><strong>Indexing
 Logic</strong></a>
   
   The new slicing and indexing of grouping keys, aggregate expressions, and 
flag expressions in `buildMakeNotMatchedRowsNullProject` is complex and prone 
to off-by-one errors. Verify that the start and end indices correctly align for 
all cases.
   </summary>
   
   ```scala
   val keysNumber = analyzedAggregates.head.getGroupingKeys().length
   val keysStartIndex = keysNumber
   val aggregateExpressionsStatIndex = keysStartIndex + 
analyzedAggregates.length * keysNumber
   val flagExpressionsStartIndex = input.length - analyzedAggregates.length
   
   val dupPrimeKeys = input.slice(0, keysStartIndex)
   val keys = input.slice(keysStartIndex, aggregateExpressionsStatIndex)
   val aggregateExpressions = input.slice(aggregateExpressionsStatIndex, 
flagExpressionsStartIndex)
   val flagExpressions = input.slice(flagExpressionsStartIndex, input.length)
   
   ```
   
   </details>
   
   <details><summary><a 
href='https://github.com/apache/incubator-gluten/pull/9729/files#diff-e6e40be5445408a6558af17cccae0bab087695234f075b90bc3c91f288a54298R909-R920'><strong>Performance
 Impact</strong></a>
   
   The added duplication of join keys (`buildJoinRightKeysProject`) and extra 
projection steps could degrade query performance. Consider assessing 
optimization impacts or combining projections.
   </summary>
   
   ```scala
   def buildJoinRightKeysProject(
       plan: LogicalPlan,
       analyzedAggregates: Seq[JoinedAggregateAnalyzer]): LogicalPlan = {
     val input = plan.output
     val keysNum = analyzedAggregates.head.getGroupingKeys.length
     // Make a duplication of the prime aggregate keys, and put them in the 
front
     val projectList = input.slice(0, keysNum).map {
       case key =>
         RuleExpressionHelper.makeNamedExpression(key, "_dup_prime_" + key.name)
     }
     Project(projectList ++ input, plan)
   }
   ```
   
   </details>
   
   </td></tr>
   </table>
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [GLUTEN-9728][CH] Fix: Inconsistent result from `JoinAggregateToAggregateUnion` [incubator-gluten]

Reply via email to