[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Status: In Progress (was: Patch Available) Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, HIVE-8171.03.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Status: Patch Available (was: In Progress) Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, HIVE-8171.03.patch, HIVE-8171.04.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Attachment: HIVE-8171.04.patch Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, HIVE-8171.03.patch, HIVE-8171.04.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8240: --- Status: In Progress (was: Patch Available) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR -- Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8240: --- Attachment: HIVE-8240.04.patch VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR -- Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch, HIVE-8240.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8240: --- Status: Patch Available (was: In Progress) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR -- Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch, HIVE-8240.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Fix Version/s: 0.14.0 Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, HIVE-8171.03.patch, HIVE-8171.04.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8226) Vectorize dynamic partitioning in VectorFileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-8226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8226: --- Attachment: HIVE-8226.01.patch Vectorize dynamic partitioning in VectorFileSinkOperator Key: HIVE-8226 URL: https://issues.apache.org/jira/browse/HIVE-8226 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8226.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8226) Vectorize dynamic partitioning in VectorFileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-8226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8226: --- Status: Patch Available (was: Open) Vectorize dynamic partitioning in VectorFileSinkOperator Key: HIVE-8226 URL: https://issues.apache.org/jira/browse/HIVE-8226 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8226.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148711#comment-14148711 ] Matt McCline commented on HIVE-8264: Could be a duplicate of HIVE-8171, which I am actively working on (patch submitted -- awaiting test results). Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException Key: HIVE-8264 URL: https://issues.apache.org/jira/browse/HIVE-8264 Project: Hive Issue Type: Bug Components: Tez, UDF, Vectorization Affects Versions: 0.14.0 Environment: Hive trunk - as of today Tez - 0.5.0 Hadoop - 2.5 Reporter: Thiruvel Thirumoolan Labels: mathfunction, tez, vectorization
The following queries are representative of the exceptions we are seeing with trunk. These queries pass if vectorization is disabled (or if the limit is removed, which means no reducer).
select name, log2(0) from (select name from mytable limit 1) t;
select name, rand() from (select name from mytable limit 1) t;
... similar patterns with other Math UDFs.
Exception:
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242)
    ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating null
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
    at org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347)
    ... 17 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125)
    ... 22 more
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
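Editor's illustration: the trace above ends in an ArrayIndexOutOfBoundsException raised while a constant expression writes its result, and HIVE-8171 is titled "doesn't create scratch columns". The following is a minimal standalone sketch of that kind of failure, assuming the cause is an output column index that was never allocated in the batch. It is not Hive code: the class ScratchColumnSketch and its methods are hypothetical, and a batch is modeled as a plain long[][] with one array per column.

```java
import java.util.Arrays;

// Hypothetical model, not Hive's VectorizedRowBatch: one long[] per column.
class ScratchColumnSketch {

    // Allocate a batch with the given number of columns.
    static long[][] newBatch(int numColumns, int batchSize) {
        long[][] cols = new long[numColumns][];
        for (int i = 0; i < numColumns; i++) {
            cols[i] = new long[batchSize];
        }
        return cols;
    }

    // Hypothetical constant expression: fills its output column with one value.
    // Throws ArrayIndexOutOfBoundsException if outputColumn was never allocated.
    static void evaluateConstant(long[][] cols, int outputColumn, long value) {
        Arrays.fill(cols[outputColumn], value);
    }

    // Batch holds only the single input column; the expression's output
    // column (index 1) does not exist, so evaluation fails.
    static boolean failsWithoutScratch() {
        long[][] batch = newBatch(1, 4);
        try {
            evaluateConstant(batch, 1, 7L);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Batch holds the input column plus one scratch column for the output.
    static long[] succeedsWithScratch() {
        long[][] batch = newBatch(1 + 1, 4);
        evaluateConstant(batch, 1, 7L);
        return batch[1];
    }

    public static void main(String[] args) {
        System.out.println("fails without scratch column: " + failsWithoutScratch());
        System.out.println("with scratch column: " + Arrays.toString(succeedsWithScratch()));
    }
}
```

In this model the remedy is simply to size the batch for the input columns plus every scratch column the expressions need. Whether the actual patch does this is not stated in the thread.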
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Status: In Progress (was: Patch Available) Had to start over with change -- major changes made recently to ReduceRecordProcessor (code moved to new file). Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, HIVE-8171.03.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Attachment: HIVE-8171.03.patch Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, HIVE-8171.03.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Status: Patch Available (was: In Progress) Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, HIVE-8171.03.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8240: --- Attachment: HIVE-8240.02.patch VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR -- Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8240: --- Status: In Progress (was: Patch Available) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR -- Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8240: --- Status: Patch Available (was: In Progress) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR -- Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
Matt McCline created HIVE-8240: -- Summary: VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Status: In Progress (was: Patch Available) Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Attachment: HIVE-8171.02.patch Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145716#comment-14145716 ] Matt McCline commented on HIVE-8171: Got rid of the unnecessary tags in ReduceRecordProcessor. Not sure which LOG indentations you are seeing. Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Status: Patch Available (was: In Progress) Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8240: --- Status: Patch Available (was: Open) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR -- Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8240.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8240: --- Attachment: HIVE-8240.01.patch VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR -- Key: HIVE-8240 URL: https://issues.apache.org/jira/browse/HIVE-8240 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8240.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8163) With dynamic partition pruning map operator that generates the partition filters is not vectorized
[ https://issues.apache.org/jira/browse/HIVE-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143938#comment-14143938 ] Matt McCline commented on HIVE-8163: (non-binding) +1 With dynamic partition pruning map operator that generates the partition filters is not vectorized -- Key: HIVE-8163 URL: https://issues.apache.org/jira/browse/HIVE-8163 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Gunther Hagleitner Priority: Minor Labels: performance Attachments: HIVE-8163.1.patch, HIVE-8163.2.patch
Vertex used to generate the partition pruning filters is not vectorized. Sample from the plan:
{code}
Vertices:
  Map 1
    Map Operator Tree:
      TableScan
        alias: d3
        filterExpr: ((d_quarter_name) IN ('2000Q1', '2000Q2', '2000Q3') and d_date_sk is not null) (type: boolean)
        Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
        Filter Operator
          predicate: ((d_quarter_name) IN ('2000Q1', '2000Q2', '2000Q3') and d_date_sk is not null) (type: boolean)
          Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: d_date_sk (type: int)
            outputColumnNames: _col0
            Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
            Reduce Output Operator
              key expressions: _col0 (type: int)
              sort order: +
              Map-reduce partition columns: _col0 (type: int)
              Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: _col0 (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                keys: _col0 (type: int)
                mode: hash
                outputColumnNames: _col0
                Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                Dynamic Partitioning Event Operator
                  Target Input: catalog_sales
                  Partition key expr: cs_sold_date_sk
                  Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                  Target column: cs_sold_date_sk
                  Target Vertex: Map 3
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8226) Vectorize dynamic partitioning in VectorFileSinkOperator
Matt McCline created HIVE-8226: -- Summary: Vectorize dynamic partitioning in VectorFileSinkOperator Key: HIVE-8226 URL: https://issues.apache.org/jira/browse/HIVE-8226 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8211) Tez and Vectorization of SUM(timestamp) not vectorized -- can't execute correctly because aggregation output is double
Matt McCline created HIVE-8211: -- Summary: Tez and Vectorization of SUM(timestamp) not vectorized -- can't execute correctly because aggregation output is double Key: HIVE-8211 URL: https://issues.apache.org/jira/browse/HIVE-8211 Project: Hive Issue Type: Bug Components: Tez, Vectorization Reporter: Matt McCline Assignee: Matt McCline Vectorization of SUM(timestamp) is currently turned off because the output of aggregation is a double (DoubleColumnVector) and the execution code is expecting a long (LongColumnVector). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
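Editor's illustration: the message above describes an aggregation whose output is a double column while the consuming code expects a long column. The sketch below shows one way such a mismatch could surface, assuming the consumer downcasts to the long type. The thread does not say how the real failure manifests. All names here (ColumnTypeMismatchSketch, Column, LongColumn, DoubleColumn) are hypothetical stand-ins, not Hive's LongColumnVector or DoubleColumnVector.

```java
// Hypothetical model of a typed-column mismatch; not Hive code.
class ColumnTypeMismatchSketch {

    static abstract class Column {}

    static final class LongColumn extends Column {
        long[] vector = new long[1];
    }

    static final class DoubleColumn extends Column {
        double[] vector = new double[1];
    }

    // Consumer written under the assumption that the column holds longs.
    static long readFirstAsLong(Column c) {
        return ((LongColumn) c).vector[0]; // ClassCastException for a DoubleColumn
    }

    // Producer planned with a double output, consumer expecting a long.
    static boolean mismatchFails() {
        DoubleColumn sum = new DoubleColumn();
        sum.vector[0] = 1.5;
        try {
            readFirstAsLong(sum);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Producer and consumer agree on the long type.
    static long matchingTypesWork() {
        LongColumn sum = new LongColumn();
        sum.vector[0] = 42L;
        return readFirstAsLong(sum);
    }

    public static void main(String[] args) {
        System.out.println("double read as long fails: " + mismatchFails());
        System.out.println("long read as long: " + matchingTypesWork());
    }
}
```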
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: In Progress (was: Patch Available) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Attachment: HIVE-8052.05.patch Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch, HIVE-8052.05.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: Patch Available (was: In Progress) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch, HIVE-8052.05.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142748#comment-14142748 ] Matt McCline commented on HIVE-8052: Yes, good point. Added variance, var_pop, var_samp, std, stddev, stddev_pop, stddev_samp. It turns out sum(timestamp) is planned with a Double as output from the aggregation instead of BigInt and this doesn't work when the reduce-side is vectorized. So, it remains unvectorized (see HIVE-8211). Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch, HIVE-8052.05.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Attachment: HIVE-8052.06.patch Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch, HIVE-8052.05.patch, HIVE-8052.06.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: In Progress (was: Patch Available) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch, HIVE-8052.05.patch, HIVE-8052.06.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: Patch Available (was: In Progress) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch, HIVE-8052.05.patch, HIVE-8052.06.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8197) Tez and Vectorization Insert into ORC Table with timestamp column erroneously repeats the last row's column value
Matt McCline created HIVE-8197: -- Summary: Tez and Vectorization Insert into ORC Table with timestamp column erroneously repeats the last row's column value Key: HIVE-8197 URL: https://issues.apache.org/jira/browse/HIVE-8197 Project: Hive Issue Type: Bug Environment: Tez and Vectorization. Reporter: Matt McCline Assignee: Matt McCline Priority: Critical In diagnosing why only(?) a Tez and Vectorized query with min and max aggregates was always returning the last row read's column value, I discovered the problem was in creating the test table: {code} CREATE TABLE alltypesorc_string STORED AS ORC AS SELECT ctinyint as ctinyint, to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') as ctimestamp1, CAST(to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') AS STRING) as stimestamp1 FROM alltypesorc WHERE ctinyint 0 LIMIT 40; {code} I think it is related to what Prasanth mentioned as a possibility: saving a Timestamp as a Writable object that gets overwritten. One suspect is the Writable[] records array in VectorFileSinkOperator in the ProcessOp method. Or, perhaps it is in VectorReduceSinkOperator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
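Editor's illustration: the message above suspects that a Writable object holding a timestamp is saved by reference and later overwritten, so every saved row ends up showing the last row's value. The sketch below reproduces that pattern in isolation, assuming a single mutable holder is reused for every row. It is not Hive or Hadoop code: ReusedWritableSketch and MutableTimestamp are hypothetical, and the thread does not confirm that this is the actual root cause.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of buffering a reused mutable object; not Hive code.
class ReusedWritableSketch {

    // Stand-in for a mutable Writable-style holder.
    static final class MutableTimestamp {
        long millis;

        MutableTimestamp(long millis) {
            this.millis = millis;
        }
    }

    // Buggy pattern: the same holder is stored for every row, so all
    // buffered entries alias one object and show the last value written.
    static List<Long> bufferByReference(long[] input) {
        MutableTimestamp reused = new MutableTimestamp(0L);
        List<MutableTimestamp> buffered = new ArrayList<>();
        for (long value : input) {
            reused.millis = value;
            buffered.add(reused);
        }
        List<Long> out = new ArrayList<>();
        for (MutableTimestamp t : buffered) {
            out.add(t.millis);
        }
        return out;
    }

    // Safe pattern: copy the holder's current value before buffering it.
    static List<Long> bufferByCopy(long[] input) {
        MutableTimestamp reused = new MutableTimestamp(0L);
        List<MutableTimestamp> buffered = new ArrayList<>();
        for (long value : input) {
            reused.millis = value;
            buffered.add(new MutableTimestamp(reused.millis));
        }
        List<Long> out = new ArrayList<>();
        for (MutableTimestamp t : buffered) {
            out.add(t.millis);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] rows = {1L, 2L, 3L};
        System.out.println(bufferByReference(rows)); // [3, 3, 3]
        System.out.println(bufferByCopy(rows));      // [1, 2, 3]
    }
}
```

The copy costs one allocation per buffered row, which is the usual trade-off when a reader reuses its value objects.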
[jira] [Commented] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141530#comment-14141530 ] Matt McCline commented on HIVE-8052: Problem was the loading of data into the test table (see new JIRA HIVE-8197). Turning on vectorization was moved to after the loading and MR and Tez query results now match. Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: Patch Available (was: In Progress) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, HIVE-8052.04.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
Matt McCline created HIVE-8171: -- Summary: Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
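As a rough illustration of why missing scratch columns would surface as an ArrayIndexOutOfBounds exception, here is a minimal sketch. It does not use Hive's classes: the batch is modeled as a plain long[][] with one array per column, and allocateBatch and evaluate are hypothetical helpers. The assumption is that an expression (such as the implicit cast in the query above) writes its result into an output column that exists only if scratch columns were allocated.

```java
public class ScratchColumnDemo {
    /** Hypothetical batch layout: one long[] per column, input columns first, then scratch columns. */
    public static long[][] allocateBatch(int inputColumns, int scratchColumns, int rows) {
        return new long[inputColumns + scratchColumns][rows];
    }

    /** Hypothetical expression: reads column `in` and writes its result into column `out`. */
    public static void evaluate(long[][] batch, int in, int out) {
        long[] src = batch[in];
        long[] dst = batch[out]; // out of bounds if the scratch column was never allocated
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i] + 1;
        }
    }

    /** Returns true when evaluation fails because no scratch column exists. */
    public static boolean failsWithoutScratch() {
        try {
            evaluate(allocateBatch(1, 0, 4), 0, 1);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    /** Returns the first output value when one scratch column is allocated. */
    public static long succeedsWithScratch() {
        long[][] batch = allocateBatch(1, 1, 4);
        batch[0][0] = 41L;
        evaluate(batch, 0, 1);
        return batch[1][0];
    }

    public static void main(String[] args) {
        System.out.println(failsWithoutScratch());
        System.out.println(succeedsWithScratch());
    }
}
```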
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Attachment: HIVE-8171.01.patch Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns
[ https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8171: --- Status: Patch Available (was: Open) Tez and Vectorized Reduce doesn't create scratch columns Key: HIVE-8171 URL: https://issues.apache.org/jira/browse/HIVE-8171 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8171.01.patch This query fails with ArrayIndexOutofBound exception in the reducer. {code} create table varchar_3 ( field varchar(25) ) stored as orc; insert into table varchar_3 select cint from alltypesorc limit 10; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails
[ https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8097: --- Attachment: HIVE-8097.02.patch Vectorized Reduce-Side [SMB] MapJoin operator fails --- Key: HIVE-8097 URL: https://issues.apache.org/jira/browse/HIVE-8097 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8097.01.patch, HIVE-8097.02.patch Fails attempting to getScratchColumnVectorTypes since mapWork is null on reduce-side. Fix by calling that method using reduceWork object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails
[ https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8097: --- Status: In Progress (was: Patch Available) Made review comment changes. Still do not have a test query for it yet. Vectorized Reduce-Side [SMB] MapJoin operator fails --- Key: HIVE-8097 URL: https://issues.apache.org/jira/browse/HIVE-8097 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8097.01.patch, HIVE-8097.02.patch Fails attempting to getScratchColumnVectorTypes since mapWork is null on reduce-side. Fix by calling that method using reduceWork object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails
[ https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8097: --- Status: Patch Available (was: In Progress) Vectorized Reduce-Side [SMB] MapJoin operator fails --- Key: HIVE-8097 URL: https://issues.apache.org/jira/browse/HIVE-8097 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8097.01.patch, HIVE-8097.02.patch Fails attempting to getScratchColumnVectorTypes since mapWork is null on reduce-side. Fix by calling that method using reduceWork object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails
[ https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134798#comment-14134798 ] Matt McCline commented on HIVE-8097: Added q file that verifies the fix. Vectorized Reduce-Side [SMB] MapJoin operator fails --- Key: HIVE-8097 URL: https://issues.apache.org/jira/browse/HIVE-8097 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8097.01.patch, HIVE-8097.02.patch, HIVE-8097.03.patch Fails attempting to getScratchColumnVectorTypes since mapWork is null on reduce-side. Fix by calling that method using reduceWork object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable
Matt McCline created HIVE-8095: -- Summary: Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable Key: HIVE-8095 URL: https://issues.apache.org/jira/browse/HIVE-8095 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224) ... 13 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
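The stack trace in HIVE-8095 shows an unconditional cast failing because the value arrived as a plain decimal rather than its Writable wrapper. The sketch below reproduces that shape with hypothetical stand-in classes (DecimalValue and DecimalWritable are not Hive's HiveDecimal and HiveDecimalWritable) and shows one defensive alternative that accepts either form. It is an illustration of the failure mode only, not the patch attached to the issue.

```java
import java.math.BigDecimal;

public class DecimalAssignDemo {
    /** Hypothetical stand-in for a plain decimal value type. */
    static final class DecimalValue {
        final BigDecimal v;

        DecimalValue(BigDecimal v) {
            this.v = v;
        }
    }

    /** Hypothetical stand-in for a Writable wrapper around a decimal value. */
    static final class DecimalWritable {
        final DecimalValue value;

        DecimalWritable(DecimalValue value) {
            this.value = value;
        }
    }

    /** Mirrors the failing pattern: assumes the object is always the wrapper. */
    static BigDecimal strictAssign(Object o) {
        return ((DecimalWritable) o).value.v;
    }

    /** Defensive alternative: accepts the wrapper or the plain value. */
    static BigDecimal tolerantAssign(Object o) {
        if (o instanceof DecimalWritable) {
            return ((DecimalWritable) o).value.v;
        }
        if (o instanceof DecimalValue) {
            return ((DecimalValue) o).v;
        }
        throw new IllegalArgumentException("Unsupported decimal representation: " + o);
    }

    /** Returns true when the strict cast fails on a plain value. */
    public static boolean strictFailsOnPlainValue() {
        try {
            strictAssign(new DecimalValue(BigDecimal.ONE));
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    /** Returns the decimal read back from a plain value via the tolerant path. */
    public static BigDecimal tolerantOnPlainValue() {
        return tolerantAssign(new DecimalValue(new BigDecimal("1.5")));
    }

    /** Returns the decimal read back from a wrapped value via the tolerant path. */
    public static BigDecimal tolerantOnWrappedValue() {
        return tolerantAssign(new DecimalWritable(new DecimalValue(new BigDecimal("2.5"))));
    }
}
```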
[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable
[ https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8095: --- Attachment: HIVE-8095.01.patch Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable Key: HIVE-8095 URL: https://issues.apache.org/jira/browse/HIVE-8095 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8095.01.patch {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224) ... 13 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable
[ https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8095: --- Status: Patch Available (was: Open) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable Key: HIVE-8095 URL: https://issues.apache.org/jira/browse/HIVE-8095 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8095.01.patch {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224) ... 13 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable
[ https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8095: --- Attachment: HIVE-8095.02.patch Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable Key: HIVE-8095 URL: https://issues.apache.org/jira/browse/HIVE-8095 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8095.01.patch, HIVE-8095.02.patch {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224) ... 13 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable
[ https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8095: --- Status: Patch Available (was: In Progress) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable Key: HIVE-8095 URL: https://issues.apache.org/jira/browse/HIVE-8095 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8095.01.patch, HIVE-8095.02.patch {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224) ... 13 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable
[ https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8095: --- Status: In Progress (was: Patch Available) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable Key: HIVE-8095 URL: https://issues.apache.org/jira/browse/HIVE-8095 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8095.01.patch, HIVE-8095.02.patch {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224) ... 13 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Status: In Progress (was: Patch Available) count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, HIVE-8092.03.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133298#comment-14133298 ] Matt McCline commented on HIVE-8092: Moving new vector_count_empty.q tests to end of vectorization_short_regress.q count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, HIVE-8092.03.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Status: Patch Available (was: In Progress) count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, HIVE-8092.03.patch, HIVE-8092.04.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Attachment: HIVE-8092.04.patch count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, HIVE-8092.03.patch, HIVE-8092.04.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: In Progress (was: Patch Available) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails
Matt McCline created HIVE-8097: -- Summary: Vectorized Reduce-Side [SMB] MapJoin operator fails Key: HIVE-8097 URL: https://issues.apache.org/jira/browse/HIVE-8097 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fails attempting to getScratchColumnVectorTypes since mapWork is null on reduce-side. Fix by calling that method using reduceWork object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
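The fix described in HIVE-8097 (use the reduceWork object when mapWork is null on the reduce side) can be sketched as a null-aware lookup. The Work class, its scratchTypes field, and the lookup method below are hypothetical stand-ins, not Hive's MapWork, ReduceWork, or the real getScratchColumnVectorTypes signature; the point is only the fallback logic.

```java
import java.util.Collections;
import java.util.Map;

public class ScratchTypesLookupDemo {
    /** Hypothetical stand-in for a map-side or reduce-side work descriptor. */
    static final class Work {
        final Map<String, String> scratchTypes;

        Work(Map<String, String> scratchTypes) {
            this.scratchTypes = scratchTypes;
        }
    }

    /** Reads scratch column types from whichever work object is present. */
    static Map<String, String> lookup(Work mapWork, Work reduceWork) {
        Work work = (mapWork != null) ? mapWork : reduceWork;
        if (work == null) {
            throw new IllegalStateException("Neither map work nor reduce work is set");
        }
        return work.scratchTypes;
    }

    /** Simulates the reduce side, where mapWork is null. */
    public static String reduceSideLookup() {
        Work reduceWork = new Work(Collections.singletonMap("col1", "bigint"));
        return lookup(null, reduceWork).get("col1");
    }

    /** Returns true when both work objects are missing and the lookup is rejected. */
    public static boolean rejectsMissingWork() {
        try {
            lookup(null, null);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```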
[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails
[ https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8097: --- Status: Patch Available (was: Open) Vectorized Reduce-Side [SMB] MapJoin operator fails --- Key: HIVE-8097 URL: https://issues.apache.org/jira/browse/HIVE-8097 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8097.01.patch Fails attempting to getScratchColumnVectorTypes since mapWork is null on reduce-side. Fix by calling that method using reduceWork object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails
[ https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8097: --- Attachment: HIVE-8097.01.patch Vectorized Reduce-Side [SMB] MapJoin operator fails --- Key: HIVE-8097 URL: https://issues.apache.org/jira/browse/HIVE-8097 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8097.01.patch Fails attempting to getScratchColumnVectorTypes since mapWork is null on reduce-side. Fix by calling that method using reduceWork object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) Vectorized Tez count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Summary: Vectorized Tez count(*) returns NULL instead of 0 when result is empty (was: count(*) returns NULL instead of 0 when result is empty) Vectorized Tez count(*) returns NULL instead of 0 when result is empty -- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, HIVE-8092.03.patch, HIVE-8092.04.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Status: In Progress (was: Patch Available) count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Attachment: HIVE-8092.02.patch count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Status: Patch Available (was: In Progress) count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Status: In Progress (was: Patch Available) count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Attachment: HIVE-8092.03.patch count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, HIVE-8092.03.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Status: Patch Available (was: In Progress) count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, HIVE-8092.03.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
Matt McCline created HIVE-8092: -- Summary: count(*) returns NULL instead of 0 when result is empty Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
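The expected behavior in HIVE-8092 follows standard SQL aggregate semantics: count over an empty input yields 0, while min over an empty input yields NULL. A minimal sketch of that distinction is below. The batch model (a long[][] with one array per batch) and both helper methods are assumptions for illustration, not Hive's vectorized aggregation code.

```java
public class EmptyCountDemo {
    /** count(*): the accumulator starts at 0, so an empty input still produces 0. */
    public static long countStar(long[][] batches) {
        long count = 0L;
        for (long[] batch : batches) {
            count += batch.length;
        }
        return count;
    }

    /** min: there is no value to report for an empty input, so the result is null. */
    public static Long min(long[][] batches) {
        Long result = null;
        for (long[] batch : batches) {
            for (long value : batch) {
                if (result == null || value < result) {
                    result = value;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long[][] empty = new long[0][];
        System.out.println(countStar(empty));
        System.out.println(min(empty));
    }
}
```

A count implementation that shares a "no rows seen, emit NULL" code path with min or max would produce the symptom reported here; whether that is what happened in the vectorized Tez path is not stated in this thread.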
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Status: Patch Available (was: Open) count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch In Tez mode when vectorization is enabled, count returns NULL when the result is empty. Expected behavior: it should return 0. count works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty
[ https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8092: --- Attachment: HIVE-8092.01.patch count(*) returns NULL instead of 0 when result is empty --- Key: HIVE-8092 URL: https://issues.apache.org/jira/browse/HIVE-8092 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8092.01.patch In Tez mode, when vectorization is enabled, count(*) returns NULL when the result is empty. Expected behavior: it should return 0. count(*) works as expected when vectorization is off. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
Matt McCline created HIVE-8052: -- Summary: Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Attachment: HIVE-8052.01.patch Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: Patch Available (was: Open) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8063) Vectorization: different results between MR and Tez for aggregation functions (variance, var_pop, var_samp, std, stddev, etc)
Matt McCline created HIVE-8063: -- Summary: Vectorization: different results between MR and Tez for aggregation functions (variance, var_pop, var_samp, std, stddev, etc) Key: HIVE-8063 URL: https://issues.apache.org/jira/browse/HIVE-8063 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical When running vectorized_timestamp_funcs.q with aggregation functions (listed in title), the results were 0 on Tez and non-zero on MR. The aggregatesDefinition table in VectorizationContext.java currently disallows timestamp and date for those functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: In Progress (was: Patch Available) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Status: Patch Available (was: In Progress) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8052: --- Attachment: HIVE-8052.02.patch Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP --- Key: HIVE-8052 URL: https://issues.apache.org/jira/browse/HIVE-8052 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized as Long were accidentally too strict for min, max, count, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-7537) Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized
[ https://issues.apache.org/jira/browse/HIVE-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-7537. Resolution: Fixed Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized - Key: HIVE-7537 URL: https://issues.apache.org/jira/browse/HIVE-7537 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline When running under the Tez engine, see if the VectorGroupByOperator aggregates are all primitive (e.g. sum) and batch the output rows into a VectorizedRowBatch. And, vectorize downstream operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7537) Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized
[ https://issues.apache.org/jira/browse/HIVE-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7537: --- Attachment: (was: HIVE-7537.1.patch) Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized - Key: HIVE-7537 URL: https://issues.apache.org/jira/browse/HIVE-7537 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline When running under the Tez engine, see if the VectorGroupByOperator aggregates are all primitive (e.g. sum) and batch the output rows into a VectorizedRowBatch. And, vectorize downstream operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7537) Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized
[ https://issues.apache.org/jira/browse/HIVE-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129364#comment-14129364 ] Matt McCline commented on HIVE-7537: This work was done in HIVE-7405. Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized - Key: HIVE-7537 URL: https://issues.apache.org/jira/browse/HIVE-7537 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline When running under the Tez engine, see if the VectorGroupByOperator aggregates are all primitive (e.g. sum) and batch the output rows into a VectorizedRowBatch. And, vectorize downstream operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7442) ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate throws ClassCastException: ...LongColumnVector cannot be cast to ...DecimalColumnVector
[ https://issues.apache.org/jira/browse/HIVE-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129368#comment-14129368 ] Matt McCline commented on HIVE-7442: This problem was fixed in the CHAR / VARCHAR HIVE-5760 change. ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate throws ClassCastException: ...LongColumnVector cannot be cast to ...DecimalColumnVector -- Key: HIVE-7442 URL: https://issues.apache.org/jira/browse/HIVE-7442 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Took decimal_join.q and converted it to read from ORC and turned on vectorization: vector_decimal_join.q {code} SET hive.vectorized.execution.enabled=true; -- HIVE-5292 Join on decimal columns fails create table src_dec_staging (key decimal(3,0), value string); load data local inpath '../../data/files/kv1.txt' into table src_dec_staging; create table src_dec (key decimal(3,0), value string) stored as orc; insert overwrite table src_dec select * from src_dec_staging; explain select * from src_dec a join src_dec b on a.key=b.key+450; select * from src_dec a join src_dec b on a.key=b.key+450; {code} Stack trace: {code} java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:695) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 10 more Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate(DecimalColAddDecimalScalar.java:60) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.FuncDecimalToLong.evaluate(FuncDecimalToLong.java:51) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.SelectColumnIsNotNull.evaluate(SelectColumnIsNotNull.java:45) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:91) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 11 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-7442) ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate throws ClassCastException: ...LongColumnVector cannot be cast to ...DecimalColumnVector
[ https://issues.apache.org/jira/browse/HIVE-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-7442. Resolution: Fixed ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate throws ClassCastException: ...LongColumnVector cannot be cast to ...DecimalColumnVector -- Key: HIVE-7442 URL: https://issues.apache.org/jira/browse/HIVE-7442 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Took decimal_join.q and converted it to read from ORC and turned on vectorization: vector_decimal_join.q {code} SET hive.vectorized.execution.enabled=true; -- HIVE-5292 Join on decimal columns fails create table src_dec_staging (key decimal(3,0), value string); load data local inpath '../../data/files/kv1.txt' into table src_dec_staging; create table src_dec (key decimal(3,0), value string) stored as orc; insert overwrite table src_dec select * from src_dec_staging; explain select * from src_dec a join src_dec b on a.key=b.key+450; select * from src_dec a join src_dec b on a.key=b.key+450; {code} Stack trace: {code} java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) at 
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:695) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 10 more Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate(DecimalColAddDecimalScalar.java:60) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.FuncDecimalToLong.evaluate(FuncDecimalToLong.java:51) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.SelectColumnIsNotNull.evaluate(SelectColumnIsNotNull.java:45) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:91) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 11 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5701) vectorized groupby should work with vectorized reduce sink
[ https://issues.apache.org/jira/browse/HIVE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129372#comment-14129372 ] Matt McCline commented on HIVE-5701: Fixed with HIVE-7405. vectorized groupby should work with vectorized reduce sink -- Key: HIVE-5701 URL: https://issues.apache.org/jira/browse/HIVE-5701 Project: Hive Issue Type: Improvement Components: Vectorization Reporter: Sergey Shelukhin Assignee: Matt McCline As far as I understand right now vectorized group by works with regular reduce sink. [~jnp] fyi -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-5701) vectorized groupby should work with vectorized reduce sink
[ https://issues.apache.org/jira/browse/HIVE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-5701. Resolution: Fixed vectorized groupby should work with vectorized reduce sink -- Key: HIVE-5701 URL: https://issues.apache.org/jira/browse/HIVE-5701 Project: Hive Issue Type: Improvement Components: Vectorization Reporter: Sergey Shelukhin Assignee: Matt McCline As far as I understand right now vectorized group by works with regular reduce sink. [~jnp] fyi -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, HIVE-7405.995.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
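The processing mode described in HIVE-7405, where every input batch holds values for only one key, can be sketched roughly as follows. This is an illustration under assumptions, not the VectorGroupByOperator implementation: SingleKeyBatchSketch and sumForKey are hypothetical names, and a plain long[] stands in for one column of a VectorizedRowBatch.

```java
public class SingleKeyBatchSketch {

    /**
     * Sums all values belonging to one group key.
     *
     * Each inner array stands in for one column of an input batch.
     * Because every batch is assumed to contain values for the same
     * key, no per-row hash lookup is needed: the loop just folds each
     * batch into a single running aggregate.
     */
    static long sumForKey(long[][] batchesForOneKey) {
        long sum = 0L;
        for (long[] batch : batchesForOneKey) {
            // Tight loop over a primitive array, with no key comparison per row.
            for (long value : batch) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two batches that both belong to the same (hypothetical) key.
        long[][] batches = { {1L, 2L, 3L}, {10L, 20L} };
        System.out.println(sumForKey(batches));
    }
}
```

The design choice the sketch highlights is that the key is known per batch rather than per row, so the inner loop touches only the values.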
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.996.patch Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, HIVE-7405.995.patch, HIVE-7405.996.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, HIVE-7405.995.patch, HIVE-7405.996.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, HIVE-7405.995.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.995.patch Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, HIVE-7405.995.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, HIVE-7405.995.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: In Progress) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.994.patch Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Open (was: Patch Available) Submitted more than 12 hours ago and no result? Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.991.patch Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: Patch Available (was: Open) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch, HIVE-7405.991.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types
[ https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-5760: --- Attachment: HIVE-5760.91.patch Add vectorized support for CHAR/VARCHAR data types -- Key: HIVE-5760 URL: https://issues.apache.org/jira/browse/HIVE-5760 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Matt McCline Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, HIVE-5760.91.patch Add support to allow queries referencing VARCHAR columns and expression results to run efficiently in vectorized mode. This should re-use the code for the STRING type to the extent possible and beneficial. Include unit tests and end-to-end tests. Consider re-using or extending existing end-to-end tests for vectorized string operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types
[ https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-5760: --- Status: Patch Available (was: In Progress) Add vectorized support for CHAR/VARCHAR data types -- Key: HIVE-5760 URL: https://issues.apache.org/jira/browse/HIVE-5760 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Matt McCline Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, HIVE-5760.91.patch Add support to allow queries referencing VARCHAR columns and expression results to run efficiently in vectorized mode. This should re-use the code for the STRING type to the extent possible and beneficial. Include unit tests and end-to-end tests. Consider re-using or extending existing end-to-end tests for vectorized string operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types
[ https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-5760: --- Status: In Progress (was: Patch Available) Add vectorized support for CHAR/VARCHAR data types -- Key: HIVE-5760 URL: https://issues.apache.org/jira/browse/HIVE-5760 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Matt McCline Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, HIVE-5760.91.patch Add support to allow queries referencing VARCHAR columns and expression results to run efficiently in vectorized mode. This should re-use the code for the STRING type to the extent possible and beneficial. Include unit tests and end-to-end tests. Consider re-using or extending existing end-to-end tests for vectorized string operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types
[ https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-5760: --- Attachment: HIVE-5760.92.patch Add vectorized support for CHAR/VARCHAR data types -- Key: HIVE-5760 URL: https://issues.apache.org/jira/browse/HIVE-5760 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Matt McCline Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, HIVE-5760.91.patch, HIVE-5760.92.patch Add support to allow queries referencing VARCHAR columns and expression results to run efficiently in vectorized mode. This should re-use the code for the STRING type to the extent possible and beneficial. Include unit tests and end-to-end tests. Consider re-using or extending existing end-to-end tests for vectorized string operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types
[ https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-5760: --- Status: Patch Available (was: In Progress) Add vectorized support for CHAR/VARCHAR data types -- Key: HIVE-5760 URL: https://issues.apache.org/jira/browse/HIVE-5760 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Matt McCline Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, HIVE-5760.91.patch, HIVE-5760.92.patch Add support to allow queries referencing VARCHAR columns and expression results to run efficiently in vectorized mode. This should re-use the code for the STRING type to the extent possible and beneficial. Include unit tests and end-to-end tests. Consider re-using or extending existing end-to-end tests for vectorized string operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Status: In Progress (was: Patch Available) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7405: --- Attachment: HIVE-7405.98.patch Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
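The processing mode described in the issue text (each input batch carries values for only one key) can be illustrated with a minimal sketch. This is not the code from any of the attached patches and it does not use Hive's VectorizedRowBatch or VectorGroupByOperator; the names ReduceSideGroupBySketch, SingleKeyBatch and aggregate are hypothetical stand-ins. It assumes the reduce input arrives sorted by key, so consecutive batches with the same key belong to the same group, and it computes only SUM and COUNT over a single long column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT Hive's actual classes.
final class ReduceSideGroupBySketch {

    /** One batch of column values, all belonging to the same group key. */
    static final class SingleKeyBatch {
        final String key;    // the single key shared by every row in the batch
        final long[] values; // column values to aggregate
        final int size;      // number of valid entries at the front of values

        SingleKeyBatch(String key, long[] values, int size) {
            this.key = key;
            this.values = values;
            this.size = size;
        }
    }

    /**
     * Aggregates SUM and COUNT per key. Because every batch holds one key,
     * the inner loop needs no per-row key comparison or hash lookup.
     * Assumes batches arrive ordered by key (an assumption of this sketch).
     * Each output entry is "key<TAB>sum<TAB>count".
     */
    static List<String> aggregate(List<SingleKeyBatch> batches) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        long sum = 0;
        long count = 0;
        for (SingleKeyBatch batch : batches) {
            // A key change means the previous group is complete: emit it.
            if (currentKey != null && !currentKey.equals(batch.key)) {
                out.add(currentKey + "\t" + sum + "\t" + count);
                sum = 0;
                count = 0;
            }
            currentKey = batch.key;
            // Tight loop over the whole batch for the single current key.
            for (int i = 0; i < batch.size; i++) {
                sum += batch.values[i];
            }
            count += batch.size;
        }
        // Emit the final group, if any input was seen.
        if (currentKey != null) {
            out.add(currentKey + "\t" + sum + "\t" + count);
        }
        return out;
    }

    public static void main(String[] args) {
        List<SingleKeyBatch> input = new ArrayList<>();
        input.add(new SingleKeyBatch("a", new long[] {1, 2, 3}, 3));
        input.add(new SingleKeyBatch("a", new long[] {4, 0, 0}, 1)); // only the first entry is valid
        input.add(new SingleKeyBatch("b", new long[] {5, 6}, 2));
        for (String row : aggregate(input)) {
            System.out.println(row);
        }
    }
}
```

The key is checked once per batch rather than once per row, which is where the quick aggregation the description mentions would come from in this simplified model.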