[jira] [Created] (SPARK-16523) Support Row Based Aggregation HashMap

Qifan Pu (JIRA) Wed, 13 Jul 2016 00:50:51 -0700

Qifan Pu created SPARK-16523:
--------------------------------

             Summary: Support Row Based Aggregation HashMap
                 Key: SPARK-16523
                 URL: https://issues.apache.org/jira/browse/SPARK-16523
             Project: Spark
          Issue Type: Story
          Components: SQL
            Reporter: Qifan Pu



For hash aggregation in Spark SQL, we use a fast aggregation hashmap as a
"cache" to boost aggregation performance. Previously, the hashmap was
backed by a `ColumnarBatch`. This performs poorly when the aggregation table
has a wide schema (a large number of key fields or value fields).
In this JIRA, we add a second implementation of the fast hashmap, which is
backed by a `RowBatch`. We then automatically pick between the two
implementations based on certain knobs.
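
To illustrate the idea only, here is a minimal Python sketch of a column-backed and a row-backed aggregation cache, plus a selector that picks between them by schema width. It is not Spark's implementation: the class names `ColumnarAggMap` and `RowAggMap`, the function `pick_agg_map`, and the threshold `MAX_COLUMNAR_FIELDS` are all hypothetical, and the issue does not say which knobs drive the choice. Both maps compute a running sum per group key.

```python
# Hypothetical knob for this sketch; not a real Spark configuration value.
MAX_COLUMNAR_FIELDS = 4


class ColumnarAggMap:
    """Column-backed cache: every key/value field lives in its own list."""

    def __init__(self, num_keys, num_values):
        self.key_cols = [[] for _ in range(num_keys)]
        self.val_cols = [[] for _ in range(num_values)]
        self.index = {}  # key tuple -> row position

    def update(self, key, values):
        pos = self.index.get(key)
        if pos is None:
            pos = len(self.index)
            self.index[key] = pos
            for col, k in zip(self.key_cols, key):
                col.append(k)
            for col in self.val_cols:
                col.append(0)
        # Updating one logical row touches one list per value column,
        # so the cost of scattered accesses grows with schema width.
        for col, v in zip(self.val_cols, values):
            col[pos] += v

    def result(self):
        return {key: tuple(col[pos] for col in self.val_cols)
                for key, pos in self.index.items()}


class RowAggMap:
    """Row-backed cache: all value fields of a group sit in one buffer."""

    def __init__(self, num_keys, num_values):
        self.num_values = num_values
        self.rows = []   # one contiguous value buffer per group
        self.index = {}  # key tuple -> row position

    def update(self, key, values):
        pos = self.index.get(key)
        if pos is None:
            pos = len(self.rows)
            self.index[key] = pos
            self.rows.append([0] * self.num_values)
        row = self.rows[pos]  # a single lookup reaches every field
        for i, v in enumerate(values):
            row[i] += v

    def result(self):
        return {key: tuple(self.rows[pos]) for key, pos in self.index.items()}


def pick_agg_map(num_keys, num_values, max_columnar_fields=MAX_COLUMNAR_FIELDS):
    """Choose the columnar map for narrow schemas, the row map for wide ones."""
    if num_keys + num_values <= max_columnar_fields:
        return ColumnarAggMap(num_keys, num_values)
    return RowAggMap(num_keys, num_values)
```

With the assumed threshold of 4, `pick_agg_map(1, 1)` returns the columnar variant and `pick_agg_map(3, 4)` returns the row variant; both expose the same `update` and `result` methods, so the caller does not need to know which one was chosen.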



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

