[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-7405:
    Labels: (was: TODOC14)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
> Issue Type: Sub-task
> Components: Vectorization
> Reporter: Matt McCline
> Assignee: Matt McCline
> Fix For: 0.14.0
>
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch,
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch,
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch,
> HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch,
> HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch,
> HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch,
> HIVE-7405.995.patch, HIVE-7405.996.patch
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each
> input VectorizedRowBatch has only values for one key at a time. Thus, the
> values in the batch can be aggregated quickly.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
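
The processing mode described in the issue can be pictured with a small conceptual sketch. This is not Hive code: `Batch` and `aggregate_sum` are hypothetical names standing in for `VectorizedRowBatch` and the reduce-side aggregation, and the sketch assumes that batches arrive grouped by key (as after a shuffle sort) and that one key may span several consecutive batches. Because every batch carries values for a single key, the whole batch can be folded into one running accumulator without a per-row key lookup.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Batch:
    """Hypothetical stand-in for a batch that holds values for exactly one key."""
    key: str
    values: List[int]


def aggregate_sum(batches: Iterable[Batch]) -> Iterator[Tuple[str, int]]:
    """Sum values per key, assuming batches arrive grouped by key."""
    started = False
    current_key = ""
    running = 0
    for batch in batches:
        # A change of key means the previous group is complete: emit it.
        if started and batch.key != current_key:
            yield current_key, running
            running = 0
        started = True
        current_key = batch.key
        # The whole batch belongs to one key, so fold it in one step.
        running += sum(batch.values)
    if started:
        yield current_key, running


batches = [Batch("a", [1, 2, 3]), Batch("a", [4]), Batch("b", [10, 20])]
result = list(aggregate_sum(batches))
```

Only one accumulator is live at a time in this sketch, which is the property that makes the single-key-per-batch case cheap compared with a hash-based aggregation.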
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Lefty Leverenz updated HIVE-7405:
    Labels: TODOC14 (was: )
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Ashutosh Chauhan updated HIVE-7405:
    Resolution: Fixed
    Fix Version/s: 0.14.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Matt!
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Attachment: HIVE-7405.996.patch
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Attachment: HIVE-7405.995.patch
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Attachment: HIVE-7405.994.patch
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Damien Carol updated HIVE-7405:
    Component/s: Vectorization
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Attachment: HIVE-7405.991.patch
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: Open (was: Patch Available)

Submitted more than 12 hours ago and no result?
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Attachment: HIVE-7405.99.patch
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Attachment: HIVE-7405.98.patch
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Attachment: HIVE-7405.97.patch
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Attachment: HIVE-7405.96.patch

tez_join_hash and dynpart_sort_opt_vectorization do not fail on my laptop. Re-submit same patch again...
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
Matt McCline updated HIVE-7405:
    Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.95.patch Re-running the same patch. I expect the dynpart_sort_opt_vectorization failure to go away and tez_join_hash to no longer fail. > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, > HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, > HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, > HIVE-7405.93.patch, HIVE-7405.94.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, > HIVE-7405.93.patch, HIVE-7405.94.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.94.patch Add "Execution mode: vectorized" to tez_join_hash.q.out > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, > HIVE-7405.93.patch, HIVE-7405.94.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7405: --- Status: Patch Available (was: Open) +1 > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, > HIVE-7405.93.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.93.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, > HIVE-7405.93.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Open (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, > HIVE-7405.93.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.92.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: (was: HIVE-7405.10.patch) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.91.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: (was: HIVE-7405.A.patch) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.10.patch, > HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, > HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, > HIVE-7405.A.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.10.patch, > HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, > HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, > HIVE-7405.A.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.A.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.10.patch, > HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, > HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, > HIVE-7405.A.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.10.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.10.patch, > HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, > HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.10.patch, > HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, > HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: Open) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.9.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch, HIVE-7405.9.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7405: --- Status: Open (was: Patch Available) The above failures need to be looked at. Also, I see that some of the query results have changed in the vectorized_casts test. Was that desirable? Were the earlier results incorrect? > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.8.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, > HIVE-7405.8.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.7.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.6.patch Rebased patch after HIVE-7029 commit. > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.5.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch, HIVE-7405.5.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.4.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, > HIVE-7405.4.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.3.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.2.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.1.patch > Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) > -- > > Key: HIVE-7405 > URL: https://issues.apache.org/jira/browse/HIVE-7405 > Project: Hive > Issue Type: Sub-task >Reporter: Matt McCline >Assignee: Matt McCline > Attachments: HIVE-7405.1.patch > > > Vectorize the basic case that does not have any count distinct aggregation. > Add a 4th processing mode in VectorGroupByOperator for reduce where each > input VectorizedRowBatch has only values for one key at a time. Thus, the > values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7405:
---
Description:
Vectorize the basic case that does not have any count distinct aggregation.
Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly.

was:
Take advantage of the fact that in most plans a reduce-side GroupBy will get the group keys in sorted order so aggregation can be done "streaming" and not require large buffering of intermediate aggregation in memory/storage. Push any case requiring large buffering -- e.g. COUNT(DISTINCT(..)) -- to part 2 of Vectorize Reduce-Side GroupBy.

In theory, if there is only one COUNT(DISTINCT(..)) the optimizer could arrange for sorting on the distinct column(s) as subordinate sort key and do the count of each distinct column(s) as a "streaming" operation. Then, only multiple COUNT(DISTINCT(..)) would require large buffering.

Summary: Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) (was: Vectorize Reduce-Side GroupBy)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each
> input VectorizedRowBatch has only values for one key at a time. Thus, the
> values in the batch can be aggregated quickly.

--
This message was sent by Atlassian JIRA (v6.2#6252)
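The streaming idea in the description can be illustrated with a small, self-contained sketch. This is not the code from any HIVE-7405 patch and does not use Hive's VectorGroupByOperator or VectorizedRowBatch classes; the class StreamingSum, its Result type, and the methods sumBatch and aggregate are hypothetical names chosen for illustration. It assumes each incoming batch holds values for exactly one non-null key, that batches arrive grouped by key (as they would after a sort on the group key), and that the aggregate is a simple SUM over long values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of streaming SUM over key-grouped batches; not Hive code. */
final class StreamingSum {

    /** One output row: a group key and its aggregated sum. */
    static final class Result {
        final String key;
        final long sum;

        Result(String key, long sum) {
            this.key = key;
            this.sum = sum;
        }
    }

    /** Sums the first 'size' values of a batch that belongs to a single key. */
    static long sumBatch(long[] values, int size) {
        long sum = 0;
        // No per-row key lookup is needed, so this is a tight loop over a primitive array.
        for (int i = 0; i < size; i++) {
            sum += values[i];
        }
        return sum;
    }

    /**
     * Consumes batches in arrival order, where keys[i] is the (non-null) key of batches[i].
     * Only one running aggregate is held at a time; a result is emitted when the key changes.
     */
    static List<Result> aggregate(String[] keys, long[][] batches) {
        List<Result> out = new ArrayList<>();
        String currentKey = null;
        long running = 0;
        boolean open = false;
        for (int i = 0; i < keys.length; i++) {
            if (open && !keys[i].equals(currentKey)) {
                // Key boundary: the previous group is complete, so flush it.
                out.add(new Result(currentKey, running));
                running = 0;
            }
            currentKey = keys[i];
            open = true;
            running += sumBatch(batches[i], batches[i].length);
        }
        if (open) {
            // Flush the last group at end of input.
            out.add(new Result(currentKey, running));
        }
        return out;
    }
}
```

Because the input is grouped by key, the sketch never needs a hash table of intermediate aggregates. An aggregate such as COUNT(DISTINCT(..)) would not fit this shape without extra state, which matches the description's decision to defer it to part 2.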