[GitHub] spark pull request #19082: [SPARK-21870][SQL] Split aggregation code into sm...

cloud-fan Tue, 23 Jan 2018 08:46:19 -0800

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19082#discussion_r163305111
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala ---
    @@ -825,52 +924,92 @@ case class HashAggregateExec(
         ctx.currentVars = new Array[ExprCode](aggregateBufferAttributes.length) ++ input
     
         val updateRowInRegularHashMap: String = {
    -      ctx.INPUT_ROW = unsafeRowBuffer
    +      // We need to copy the aggregation row buffer to a local row first because each aggregate
    +      // function directly updates the buffer when it finishes.
    --- End diff --
    
    Why does this matter? We should avoid unnecessary data copies whenever possible.


