[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: In Progress  (was: Patch Available)

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: Patch Available  (was: In Progress)

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch, HIVE-8171.04.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Attachment: HIVE-8171.04.patch

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch, HIVE-8171.04.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8240:
---
Status: In Progress  (was: Patch Available)

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch








[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8240:
---
Attachment: HIVE-8240.04.patch

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch, 
 HIVE-8240.04.patch








[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8240:
---
Status: Patch Available  (was: In Progress)

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch, 
 HIVE-8240.04.patch








[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Fix Version/s: 0.14.0

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch, HIVE-8171.04.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8226) Vectorize dynamic partitioning in VectorFileSinkOperator

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8226:
---
Attachment: HIVE-8226.01.patch

 Vectorize dynamic partitioning in VectorFileSinkOperator
 

 Key: HIVE-8226
 URL: https://issues.apache.org/jira/browse/HIVE-8226
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8226.01.patch








[jira] [Updated] (HIVE-8226) Vectorize dynamic partitioning in VectorFileSinkOperator

2014-09-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8226:
---
Status: Patch Available  (was: Open)

 Vectorize dynamic partitioning in VectorFileSinkOperator
 

 Key: HIVE-8226
 URL: https://issues.apache.org/jira/browse/HIVE-8226
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8226.01.patch








[jira] [Commented] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException

2014-09-25 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148711#comment-14148711
 ] 

Matt McCline commented on HIVE-8264:


Could be a duplicate of HIVE-8171 that I am actively working on (patch 
submitted -- awaiting test results).

 Math UDFs in Reducer-with-vectorization fail with 
 ArrayIndexOutOfBoundsException
 

 Key: HIVE-8264
 URL: https://issues.apache.org/jira/browse/HIVE-8264
 Project: Hive
  Issue Type: Bug
  Components: Tez, UDF, Vectorization
Affects Versions: 0.14.0
 Environment: Hive trunk - as of today
 Tez - 0.5.0
 Hadoop - 2.5
Reporter: Thiruvel Thirumoolan
  Labels: mathfunction, tez, vectorization

 The following queries are representative of the exceptions we are seeing with 
 trunk. They pass if vectorization is disabled (or if the limit is removed, 
 which means there is no reducer).
 select name, log2(0) from (select name from mytable limit 1) t;
 select name, rand() from (select name from mytable limit 1) t;
 ... with similar patterns for other math UDFs.
 Exception:
 ], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177)
   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254)
   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167)
   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154)
   ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360)
   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242)
   ... 16 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating null
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
   at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347)
   ... 17 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
   at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125)
   ... 22 more
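
 A minimal sketch of how an ArrayIndexOutOfBoundsException like the innermost
 one in this trace could arise, assuming (as the HIVE-8171 title suggests) that
 the reduce-side batch has no scratch column allocated for the expression
 output. The class and method names are hypothetical stand-ins, not Hive's
 VectorizedRowBatch or ConstantVectorExpression.

```java
import java.util.Arrays;

// Hypothetical stand-ins for illustration only; these are not Hive's classes.
final class ScratchColumnSketch {

    /** Models a batch as an array of long columns, each holding `rows` values. */
    static long[][] newBatch(int columnCount, int rows) {
        return new long[columnCount][rows];
    }

    /** Fills the column at outputColumn with a constant, as a constant expression might. */
    static void evaluateConstant(long[][] cols, int outputColumn, long value) {
        // Indexing cols throws if the output slot was never allocated.
        Arrays.fill(cols[outputColumn], value);
    }

    /** Returns true if writing into outputColumn fails with an out-of-bounds index. */
    static boolean failsWithoutScratch(int allocatedColumns, int outputColumn) {
        try {
            evaluateConstant(newBatch(allocatedColumns, 4), outputColumn, 0L);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // One allocated column (index 0); the expression wants to write to index 1.
        System.out.println(failsWithoutScratch(1, 1));
        // With an extra scratch column allocated, the same write succeeds.
        System.out.println(failsWithoutScratch(2, 1));
    }
}
```

 In this sketch the fix is simply to allocate the extra output slot before
 evaluating the expression; whether the actual patch does the equivalent is not
 stated in this thread.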





[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: In Progress  (was: Patch Available)

Had to start over with the change: major changes were made recently to 
ReduceRecordProcessor (code moved to a new file).

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Attachment: HIVE-8171.03.patch

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: Patch Available  (was: In Progress)

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch, 
 HIVE-8171.03.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8240:
---
Attachment: HIVE-8240.02.patch

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch








[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8240:
---
Status: In Progress  (was: Patch Available)

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch








[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8240:
---
Status: Patch Available  (was: In Progress)

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8240.01.patch, HIVE-8240.02.patch








[jira] [Created] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-23 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8240:
--

 Summary: VectorColumnAssignFactory throws Incompatible Bytes 
vector column and primitive category VARCHAR
 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical








[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: In Progress  (was: Patch Available)

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Attachment: HIVE-8171.02.patch

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Commented] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-23 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145716#comment-14145716
 ] 

Matt McCline commented on HIVE-8171:


Got rid of the unnecessary tags in ReduceRecordProcessor.
Not sure which LOG indentations you are seeing.

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: Patch Available  (was: In Progress)

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch, HIVE-8171.02.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8240:
---
Status: Patch Available  (was: Open)

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8240.01.patch








[jira] [Updated] (HIVE-8240) VectorColumnAssignFactory throws Incompatible Bytes vector column and primitive category VARCHAR

2014-09-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8240:
---
Attachment: HIVE-8240.01.patch

 VectorColumnAssignFactory throws Incompatible Bytes vector column and 
 primitive category VARCHAR
 --

 Key: HIVE-8240
 URL: https://issues.apache.org/jira/browse/HIVE-8240
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8240.01.patch








[jira] [Commented] (HIVE-8163) With dynamic partition pruning map operator that generates the partition filters is not vectorized

2014-09-22 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143938#comment-14143938
 ] 

Matt McCline commented on HIVE-8163:


(non-binding) +1

 With dynamic partition pruning map operator that generates the partition 
 filters is not vectorized
 --

 Key: HIVE-8163
 URL: https://issues.apache.org/jira/browse/HIVE-8163
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Gunther Hagleitner
Priority: Minor
  Labels: performance
 Attachments: HIVE-8163.1.patch, HIVE-8163.2.patch


 Vertex used to generate the partition pruning filters is not vectorized.
 Sample from the plan:
 {code}
 Vertices:
   Map 1
     Map Operator Tree:
       TableScan
         alias: d3
         filterExpr: ((d_quarter_name) IN ('2000Q1', '2000Q2', '2000Q3') and d_date_sk is not null) (type: boolean)
         Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
         Filter Operator
           predicate: ((d_quarter_name) IN ('2000Q1', '2000Q2', '2000Q3') and d_date_sk is not null) (type: boolean)
           Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
           Select Operator
             expressions: d_date_sk (type: int)
             outputColumnNames: _col0
             Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
             Reduce Output Operator
               key expressions: _col0 (type: int)
               sort order: +
               Map-reduce partition columns: _col0 (type: int)
               Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
             Select Operator
               expressions: _col0 (type: int)
               outputColumnNames: _col0
               Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
               Group By Operator
                 keys: _col0 (type: int)
                 mode: hash
                 outputColumnNames: _col0
                 Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                 Dynamic Partitioning Event Operator
                   Target Input: catalog_sales
                   Partition key expr: cs_sold_date_sk
                   Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                   Target column: cs_sold_date_sk
                   Target Vertex: Map 3
 {code}





[jira] [Created] (HIVE-8226) Vectorize dynamic partitioning in VectorFileSinkOperator

2014-09-22 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8226:
--

 Summary: Vectorize dynamic partitioning in VectorFileSinkOperator
 Key: HIVE-8226
 URL: https://issues.apache.org/jira/browse/HIVE-8226
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical








[jira] [Created] (HIVE-8211) Tez and Vectorization of SUM(timestamp) not vectorized -- can't execute correctly because aggregation output is double

2014-09-21 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8211:
--

 Summary: Tez and Vectorization of SUM(timestamp) not vectorized -- 
can't execute correctly because aggregation output is double
 Key: HIVE-8211
 URL: https://issues.apache.org/jira/browse/HIVE-8211
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Reporter: Matt McCline
Assignee: Matt McCline


Vectorization of SUM(timestamp) is currently turned off because the output of 
aggregation is a double (DoubleColumnVector) and the execution code is 
expecting a long (LongColumnVector).
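
A minimal sketch of this kind of type mismatch, assuming the consuming code 
casts the aggregation output to a long vector. The nested classes are 
simplified, hypothetical stand-ins named after Hive's vector classes; the 
actual failure in Hive may surface differently.

```java
// Hypothetical stand-ins for illustration only; Hive's real DoubleColumnVector
// and LongColumnVector carry more state and behavior than shown here.
final class SumTimestampSketch {

    static class ColumnVector { }

    static final class LongColumnVector extends ColumnVector {
        final long[] vector = new long[1];
    }

    static final class DoubleColumnVector extends ColumnVector {
        final double[] vector = new double[1];
    }

    /** Execution code that assumes the aggregation output is a long column. */
    static long readFirstAsLong(ColumnVector col) {
        // The cast fails at run time when col is actually a double column.
        return ((LongColumnVector) col).vector[0];
    }

    /** Returns true if the consumer cannot read the given aggregation output. */
    static boolean isMismatch(ColumnVector aggregationOutput) {
        try {
            readFirstAsLong(aggregationOutput);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Planned output type is double, but the consumer expects long.
        System.out.println(isMismatch(new DoubleColumnVector()));
        // When producer and consumer agree on long, the read succeeds.
        System.out.println(isMismatch(new LongColumnVector()));
    }
}
```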





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: In Progress  (was: Patch Available)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Attachment: HIVE-8052.05.patch

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch, HIVE-8052.05.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: Patch Available  (was: In Progress)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch, HIVE-8052.05.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Commented] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142748#comment-14142748
 ] 

Matt McCline commented on HIVE-8052:


Yes, good point.  Added variance, var_pop, var_samp, std, stddev, stddev_pop, 
stddev_samp.

It turns out sum(timestamp) is planned with a Double as output from the 
aggregation instead of BigInt and this doesn't work when the reduce-side is 
vectorized.  So, it remains unvectorized (see HIVE-8211).

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch, HIVE-8052.05.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Attachment: HIVE-8052.06.patch

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch, HIVE-8052.05.patch, HIVE-8052.06.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: In Progress  (was: Patch Available)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch, HIVE-8052.05.patch, HIVE-8052.06.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: Patch Available  (was: In Progress)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch, HIVE-8052.05.patch, HIVE-8052.06.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Created] (HIVE-8197) Tez and Vectorization Insert into ORC Table with timestamp column erroneously repeats the last row's column value

2014-09-19 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8197:
--

 Summary: Tez and Vectorization Insert into ORC Table with 
timestamp column erroneously repeats the last row's column value
 Key: HIVE-8197
 URL: https://issues.apache.org/jira/browse/HIVE-8197
 Project: Hive
  Issue Type: Bug
 Environment: Tez and Vectorization.
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical


While diagnosing why a Tez and Vectorized query (possibly only that 
combination) with min and max aggregates always returned the column value of 
the last row read, I discovered that the problem was in creating the test table:

{code}
CREATE TABLE alltypesorc_string STORED AS ORC AS SELECT
  ctinyint as ctinyint,
  to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') as ctimestamp1,
  CAST(to_utc_timestamp(ctimestamp1, 'America/Los_Angeles') AS STRING) as stimestamp1
FROM alltypesorc WHERE ctinyint  0
LIMIT 40;
{code}

I think it is related to what Prasanth mentioned as a possibility: saving a 
Timestamp as a Writable object that gets overwritten.  One suspect is the 
Writable[] records array in VectorFileSinkOperator in the ProcessOp method.  
Or, perhaps it is in VectorReduceSinkOperator.
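
The suspected failure mode can be illustrated with a minimal Java sketch. MutableTimestamp and WritableReuseSketch are hypothetical names made up for this illustration and are not Hive or Hadoop classes; the sketch only assumes that one mutable object is reused for every row, as Writable-based readers commonly do. Whether this is the actual cause here is still unconfirmed.

```java
// Hypothetical stand-in for a reusable Writable; not Hive's TimestampWritable.
final class MutableTimestamp {
    long millis;

    MutableTimestamp(long millis) { this.millis = millis; }

    void set(long millis) { this.millis = millis; }

    MutableTimestamp copy() { return new MutableTimestamp(millis); }
}

final class WritableReuseSketch {
    // Buggy pattern: every slot ends up pointing at the same reused object.
    static long[] collectByReference(long[] input) {
        MutableTimestamp reused = new MutableTimestamp(0L);
        MutableTimestamp[] saved = new MutableTimestamp[input.length];
        for (int i = 0; i < input.length; i++) {
            reused.set(input[i]);   // overwrites the value seen by earlier slots
            saved[i] = reused;
        }
        return toMillis(saved);
    }

    // Safe pattern: take a private copy before the object is reused.
    static long[] collectByCopy(long[] input) {
        MutableTimestamp reused = new MutableTimestamp(0L);
        MutableTimestamp[] saved = new MutableTimestamp[input.length];
        for (int i = 0; i < input.length; i++) {
            reused.set(input[i]);
            saved[i] = reused.copy();
        }
        return toMillis(saved);
    }

    private static long[] toMillis(MutableTimestamp[] saved) {
        long[] out = new long[saved.length];
        for (int i = 0; i < saved.length; i++) {
            out[i] = saved[i].millis;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] rows = {10L, 20L, 30L};
        System.out.println(java.util.Arrays.toString(collectByReference(rows))); // [30, 30, 30]
        System.out.println(java.util.Arrays.toString(collectByCopy(rows)));      // [10, 20, 30]
    }
}
```

The by-reference variant reproduces the symptom described above: every saved row shows the value of the last row read.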





[jira] [Commented] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-19 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141530#comment-14141530
 ] 

Matt McCline commented on HIVE-8052:



The problem was in loading data into the test table (see new JIRA HIVE-8197).  
Turning on vectorization was moved to after the load, and the MR and Tez query 
results now match.

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-19 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: Patch Available  (was: In Progress)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch, 
 HIVE-8052.04.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Created] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-17 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8171:
--

 Summary: Tez and Vectorized Reduce doesn't create scratch columns
 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical


This query fails with an ArrayIndexOutOfBoundsException in the reducer.

{code}
create table varchar_3 (
  field varchar(25)
) stored as orc;

insert into table varchar_3 select cint from alltypesorc limit 10;
{code}
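
A minimal Java sketch of how a missing scratch column could surface as this exception. It assumes that an expression (here a stand-in for the int-to-varchar conversion) writes its result to a column index past the input columns. ScratchColumnSketch and its methods are hypothetical and are not the Hive batch-allocation code.

```java
final class ScratchColumnSketch {
    // Allocates only the input columns; any scratch index is out of range.
    static long[][] allocateInputOnly(int inputColumns, int rows) {
        return new long[inputColumns][rows];
    }

    // Allocates the input columns plus the scratch columns expressions write to.
    static long[][] allocateWithScratch(int inputColumns, int scratchColumns, int rows) {
        return new long[inputColumns + scratchColumns][rows];
    }

    // Stand-in for an expression: reads one column and writes to an output column.
    static void evaluate(long[][] batch, int inputColumn, int outputColumn) {
        for (int row = 0; row < batch[inputColumn].length; row++) {
            batch[outputColumn][row] = batch[inputColumn][row];
        }
    }
}
```

With a batch from allocateInputOnly(1, 2), evaluate(batch, 0, 1) throws ArrayIndexOutOfBoundsException; with allocateWithScratch(1, 1, 2) the same call succeeds.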





[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Attachment: HIVE-8171.01.patch

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8171) Tez and Vectorized Reduce doesn't create scratch columns

2014-09-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8171:
---
Status: Patch Available  (was: Open)

 Tez and Vectorized Reduce doesn't create scratch columns
 

 Key: HIVE-8171
 URL: https://issues.apache.org/jira/browse/HIVE-8171
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8171.01.patch


 This query fails with an ArrayIndexOutOfBoundsException in the reducer.
 {code}
 create table varchar_3 (
   field varchar(25)
 ) stored as orc;
 insert into table varchar_3 select cint from alltypesorc limit 10;
 {code}





[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails

2014-09-15 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8097:
---
Attachment: HIVE-8097.02.patch

 Vectorized Reduce-Side [SMB] MapJoin operator fails
 ---

 Key: HIVE-8097
 URL: https://issues.apache.org/jira/browse/HIVE-8097
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8097.01.patch, HIVE-8097.02.patch


 Fails when calling getScratchColumnVectorTypes because mapWork is null on the 
 reduce side.
 Fix by calling that method using the reduceWork object.





[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails

2014-09-15 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8097:
---
Status: In Progress  (was: Patch Available)

Made the changes requested in the review comments.

Still do not have a test query for it.

 Vectorized Reduce-Side [SMB] MapJoin operator fails
 ---

 Key: HIVE-8097
 URL: https://issues.apache.org/jira/browse/HIVE-8097
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8097.01.patch, HIVE-8097.02.patch


 Fails when calling getScratchColumnVectorTypes because mapWork is null on the 
 reduce side.
 Fix by calling that method using the reduceWork object.





[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails

2014-09-15 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8097:
---
Status: Patch Available  (was: In Progress)

 Vectorized Reduce-Side [SMB] MapJoin operator fails
 ---

 Key: HIVE-8097
 URL: https://issues.apache.org/jira/browse/HIVE-8097
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8097.01.patch, HIVE-8097.02.patch


 Fails when calling getScratchColumnVectorTypes because mapWork is null on the 
 reduce side.
 Fix by calling that method using the reduceWork object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails

2014-09-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134798#comment-14134798
 ] 

Matt McCline commented on HIVE-8097:


Added q file that verifies the fix.

 Vectorized Reduce-Side [SMB] MapJoin operator fails
 ---

 Key: HIVE-8097
 URL: https://issues.apache.org/jira/browse/HIVE-8097
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8097.01.patch, HIVE-8097.02.patch, 
 HIVE-8097.03.patch


 Fails when calling getScratchColumnVectorTypes because mapWork is null on the 
 reduce side.
 Fix by calling that method using the reduceWork object.





[jira] [Created] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable

2014-09-14 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8095:
--

 Summary: Tez and Vectorized GROUP BY: ClassCastException: 
...HiveDecimal cannot be cast to ...HiveDecimalWritable
 Key: HIVE-8095
 URL: https://issues.apache.org/jira/browse/HIVE-8095
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical



{code}
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
    at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224)
    ... 13 more
{code}
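
The trace points at an unconditional cast in an assignObjectValue method. A minimal Java sketch of that failure shape, and of one defensive alternative, follows. PlainDecimal, DecimalBox and DecimalAssignSketch are hypothetical stand-ins invented for this illustration (they are not HiveDecimal or HiveDecimalWritable), and the type check shown is only one possible approach, not necessarily what the patch does.

```java
// Hypothetical stand-in for the plain decimal value type.
final class PlainDecimal {
    final java.math.BigDecimal value;

    PlainDecimal(String text) { this.value = new java.math.BigDecimal(text); }
}

// Hypothetical stand-in for the writable wrapper around the value.
final class DecimalBox {
    final PlainDecimal inner;

    DecimalBox(PlainDecimal inner) { this.inner = inner; }
}

final class DecimalAssignSketch {
    // Fragile: assumes the caller always passes the wrapper type.
    static PlainDecimal assignWithBlindCast(Object value) {
        return ((DecimalBox) value).inner;
    }

    // Defensive: accepts either representation and rejects anything else.
    static PlainDecimal assignChecked(Object value) {
        if (value instanceof DecimalBox) {
            return ((DecimalBox) value).inner;
        }
        if (value instanceof PlainDecimal) {
            return (PlainDecimal) value;
        }
        throw new IllegalArgumentException("Unsupported decimal representation: "
            + (value == null ? "null" : value.getClass().getName()));
    }
}
```

Passing a PlainDecimal to assignWithBlindCast throws ClassCastException, mirroring the trace; assignChecked returns the same value for both representations.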





[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8095:
---
Attachment: HIVE-8095.01.patch

 Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be 
 cast to ...HiveDecimalWritable
 

 Key: HIVE-8095
 URL: https://issues.apache.org/jira/browse/HIVE-8095
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8095.01.patch


 {code}
 Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
     at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224)
     ... 13 more
 {code}





[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8095:
---
Status: Patch Available  (was: Open)

 Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be 
 cast to ...HiveDecimalWritable
 

 Key: HIVE-8095
 URL: https://issues.apache.org/jira/browse/HIVE-8095
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8095.01.patch


 {code}
 Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
     at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224)
     ... 13 more
 {code}





[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8095:
---
Attachment: HIVE-8095.02.patch

 Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be 
 cast to ...HiveDecimalWritable
 

 Key: HIVE-8095
 URL: https://issues.apache.org/jira/browse/HIVE-8095
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8095.01.patch, HIVE-8095.02.patch


 {code}
 Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
     at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224)
     ... 13 more
 {code}





[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8095:
---
Status: Patch Available  (was: In Progress)

 Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be 
 cast to ...HiveDecimalWritable
 

 Key: HIVE-8095
 URL: https://issues.apache.org/jira/browse/HIVE-8095
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8095.01.patch, HIVE-8095.02.patch


 {code}
 Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
     at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224)
     ... 13 more
 {code}





[jira] [Updated] (HIVE-8095) Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be cast to ...HiveDecimalWritable

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8095:
---
Status: In Progress  (was: Patch Available)

 Tez and Vectorized GROUP BY: ClassCastException: ...HiveDecimal cannot be 
 cast to ...HiveDecimalWritable
 

 Key: HIVE-8095
 URL: https://issues.apache.org/jira/browse/HIVE-8095
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8095.01.patch, HIVE-8095.02.patch


 {code}
 Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.HiveDecimal cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
     at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:431)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:886)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$400(VectorGroupByOperator.java:63)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.flush(VectorGroupByOperator.java:463)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.close(VectorGroupByOperator.java:369)
     at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:924)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:583)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:595)
     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:224)
     ... 13 more
 {code}





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Status: In Progress  (was: Patch Available)

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, 
 HIVE-8092.03.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.
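
 The difference between the two behaviors can be sketched in Java. This is an illustration under assumptions, not the Hive aggregation code: CountAggregateSketch and both methods are hypothetical, and the "buggy" variant simply models an aggregate whose result is only initialized once a batch arrives.

```java
final class CountAggregateSketch {
    // Buggy shape: the result is only initialized when a batch arrives,
    // so an empty input yields null (SQL NULL) instead of 0.
    static Long countNullOnEmpty(int[] batchSizes) {
        Long total = null;
        for (int size : batchSizes) {
            total = (total == null ? 0L : total) + size;
        }
        return total;
    }

    // Expected shape: the count starts at 0 and is emitted even with no input.
    static long countZeroOnEmpty(int[] batchSizes) {
        long total = 0L;
        for (int size : batchSizes) {
            total += size;
        }
        return total;
    }
}
```

 For an empty array the first method returns null and the second returns 0; for non-empty input both return the same total.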





[jira] [Commented] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-14 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133298#comment-14133298
 ] 

Matt McCline commented on HIVE-8092:


Moving new vector_count_empty.q tests to end of vectorization_short_regress.q

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, 
 HIVE-8092.03.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Status: Patch Available  (was: In Progress)

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, 
 HIVE-8092.03.patch, HIVE-8092.04.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Attachment: HIVE-8092.04.patch

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, 
 HIVE-8092.03.patch, HIVE-8092.04.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: In Progress  (was: Patch Available)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Created] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails

2014-09-14 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8097:
--

 Summary: Vectorized Reduce-Side [SMB] MapJoin operator fails
 Key: HIVE-8097
 URL: https://issues.apache.org/jira/browse/HIVE-8097
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical


Fails when calling getScratchColumnVectorTypes because mapWork is null on the 
reduce side.

Fix by calling that method using the reduceWork object.
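
A minimal Java sketch of the idea, under stated assumptions: PlanWork and ScratchTypeLookupSketch are hypothetical names invented for this illustration, and the real getScratchColumnVectorTypes signature and return type are not reproduced here. The sketch only shows choosing whichever work object exists for the current side instead of dereferencing a null mapWork.

```java
// Hypothetical stand-in for a map-side or reduce-side work description.
final class PlanWork {
    final String[] scratchColumnTypes;

    PlanWork(String[] scratchColumnTypes) { this.scratchColumnTypes = scratchColumnTypes; }
}

final class ScratchTypeLookupSketch {
    // Uses mapWork on the map side and falls back to reduceWork on the reduce side.
    static String[] scratchTypes(PlanWork mapWork, PlanWork reduceWork) {
        PlanWork work = (mapWork != null) ? mapWork : reduceWork;
        if (work == null) {
            throw new IllegalStateException("Neither map work nor reduce work is available");
        }
        return work.scratchColumnTypes;
    }
}
```

On the reduce side the call would look like scratchTypes(null, reduceWork), which returns the reduce-side scratch types rather than failing on the null mapWork.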





[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8097:
---
Status: Patch Available  (was: Open)

 Vectorized Reduce-Side [SMB] MapJoin operator fails
 ---

 Key: HIVE-8097
 URL: https://issues.apache.org/jira/browse/HIVE-8097
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8097.01.patch


 Fails when calling getScratchColumnVectorTypes because mapWork is null on the 
 reduce side.
 Fix by calling that method using the reduceWork object.





[jira] [Updated] (HIVE-8097) Vectorized Reduce-Side [SMB] MapJoin operator fails

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8097:
---
Attachment: HIVE-8097.01.patch

 Vectorized Reduce-Side [SMB] MapJoin operator fails
 ---

 Key: HIVE-8097
 URL: https://issues.apache.org/jira/browse/HIVE-8097
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8097.01.patch


 Fails when calling getScratchColumnVectorTypes because mapWork is null on the 
 reduce side.
 Fix by calling that method using the reduceWork object.





[jira] [Updated] (HIVE-8092) Vectorized Tez count(*) returns NULL instead of 0 when result is empty

2014-09-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Summary: Vectorized Tez count(*) returns NULL instead of 0 when result is 
empty  (was: count(*) returns NULL instead of 0 when result is empty)

 Vectorized Tez count(*) returns NULL instead of 0 when result is empty
 --

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, 
 HIVE-8092.03.patch, HIVE-8092.04.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Status: In Progress  (was: Patch Available)

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Attachment: HIVE-8092.02.patch

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Status: Patch Available  (was: In Progress)

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Status: In Progress  (was: Patch Available)

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Attachment: HIVE-8092.03.patch

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, 
 HIVE-8092.03.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Status: Patch Available  (was: In Progress)

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch, HIVE-8092.02.patch, 
 HIVE-8092.03.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Created] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-12 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8092:
--

 Summary: count(*) returns NULL instead of 0 when result is empty
 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical


In Tez mode with vectorization enabled, count(*) returns NULL when the result 
is empty.

Expected behavior: it should return 0.

count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Status: Patch Available  (was: Open)

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Updated] (HIVE-8092) count(*) returns NULL instead of 0 when result is empty

2014-09-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8092:
---
Attachment: HIVE-8092.01.patch

 count(*) returns NULL instead of 0 when result is empty
 ---

 Key: HIVE-8092
 URL: https://issues.apache.org/jira/browse/HIVE-8092
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8092.01.patch


 In Tez mode with vectorization enabled, count(*) returns NULL when the result 
 is empty.
 Expected behavior: it should return 0.
 count(*) works as expected when vectorization is off.





[jira] [Created] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-11 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8052:
--

 Summary: Vectorization: min() on TimeStamp datatype fails with 
error Vector aggregate not implemented: min for type: TIMESTAMP
 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical


Changes in HIVE-5760 to make explicit when timestamp and date can be vectorized 
as Long were accidentally too strict for min, max, count, etc.
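
The kind of type check involved can be sketched as below. This is an 
illustration only: ColumnTypeSketch and AggregateTypeCheck are hypothetical 
names, and the real lookup table in Hive is not reproduced here. The sketch 
assumes, as the description says, that timestamp and date values are carried 
as longs.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, simplified set of column types.
enum ColumnTypeSketch { LONG, DOUBLE, STRING, TIMESTAMP, DATE }

final class AggregateTypeCheck {
    private AggregateTypeCheck() {}

    // Types assumed to be stored in a long-based column vector.
    private static final Set<ColumnTypeSketch> LONG_FAMILY =
        EnumSet.of(ColumnTypeSketch.LONG, ColumnTypeSketch.TIMESTAMP, ColumnTypeSketch.DATE);

    // Too strict: only an exact LONG column is accepted, so min(timestamp)
    // would be reported as "not implemented".
    static boolean strictAcceptsLongAggregate(ColumnTypeSketch type) {
        return type == ColumnTypeSketch.LONG;
    }

    // Relaxed: any type in the long family can use the long-based aggregate.
    static boolean relaxedAcceptsLongAggregate(ColumnTypeSketch type) {
        return LONG_FAMILY.contains(type);
    }
}
```

Under the strict check a TIMESTAMP column is rejected; under the relaxed 
check it is accepted while STRING and DOUBLE are still rejected.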





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-11 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Attachment: HIVE-8052.01.patch

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-11 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: Patch Available  (was: Open)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Created] (HIVE-8063) Vectorization: different results between MR and Tez for aggregation functions (variance, var_pop, var_samp, std, stddev, etc)

2014-09-11 Thread Matt McCline (JIRA)
Matt McCline created HIVE-8063:
--

 Summary: Vectorization: different results between MR and Tez for 
aggregation functions (variance, var_pop, var_samp, std, stddev, etc)
 Key: HIVE-8063
 URL: https://issues.apache.org/jira/browse/HIVE-8063
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical



When running vectorized_timestamp_funcs.q with aggregation functions (listed in 
title), the results were 0 on Tez and non-zero on MR.  The aggregatesDefinition 
table in VectorizationContext.java currently disallows timestamp and date for 
those functions.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-11 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: In Progress  (was: Patch Available)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-11 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Status: Patch Available  (was: In Progress)

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Updated] (HIVE-8052) Vectorization: min() on TimeStamp datatype fails with error Vector aggregate not implemented: min for type: TIMESTAMP

2014-09-11 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-8052:
---
Attachment: HIVE-8052.02.patch

 Vectorization: min() on TimeStamp datatype fails with error Vector aggregate 
 not implemented: min for type: TIMESTAMP
 ---

 Key: HIVE-8052
 URL: https://issues.apache.org/jira/browse/HIVE-8052
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-8052.01.patch, HIVE-8052.02.patch


 Changes in HIVE-5760 to make explicit when timestamp and date can be 
 vectorized as Long were accidentally too strict for min, max, count, etc.





[jira] [Resolved] (HIVE-7537) Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized

2014-09-10 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-7537.

Resolution: Fixed

 Output vectorized GROUP BY with only primitive aggregate fields as columns so 
 downstream operators will be vectorized
 -

 Key: HIVE-7537
 URL: https://issues.apache.org/jira/browse/HIVE-7537
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline

 When under Tez engine, see if the VectorGroupByOperator aggregates are all 
 primitive (e.g. sum) and batch the output rows into a VectorizedRowBatch.  
 And, vectorize downstream operators.





[jira] [Updated] (HIVE-7537) Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized

2014-09-10 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7537:
---
Attachment: (was: HIVE-7537.1.patch)

 Output vectorized GROUP BY with only primitive aggregate fields as columns so 
 downstream operators will be vectorized
 -

 Key: HIVE-7537
 URL: https://issues.apache.org/jira/browse/HIVE-7537
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline

 When under Tez engine, see if the VectorGroupByOperator aggregates are all 
 primitive (e.g. sum) and batch the output rows into a VectorizedRowBatch.  
 And, vectorize downstream operators.





[jira] [Commented] (HIVE-7537) Output vectorized GROUP BY with only primitive aggregate fields as columns so downstream operators will be vectorized

2014-09-10 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129364#comment-14129364
 ] 

Matt McCline commented on HIVE-7537:


This work was done in HIVE-7405.

 Output vectorized GROUP BY with only primitive aggregate fields as columns so 
 downstream operators will be vectorized
 -

 Key: HIVE-7537
 URL: https://issues.apache.org/jira/browse/HIVE-7537
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline

 When under Tez engine, see if the VectorGroupByOperator aggregates are all 
 primitive (e.g. sum) and batch the output rows into a VectorizedRowBatch.  
 And, vectorize downstream operators.





[jira] [Commented] (HIVE-7442) ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate throws ClassCastException: ...LongColumnVector cannot be cast to ...DecimalColumnVector

2014-09-10 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129368#comment-14129368
 ] 

Matt McCline commented on HIVE-7442:


This problem was fixed in the CHAR / VARCHAR HIVE-5760 change.

 ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate throws 
 ClassCastException: ...LongColumnVector cannot be cast to 
 ...DecimalColumnVector
 --

 Key: HIVE-7442
 URL: https://issues.apache.org/jira/browse/HIVE-7442
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline

 Took decimal_join.q and converted it to read from ORC and turned on 
 vectorization:
 vector_decimal_join.q
 {code}
 SET hive.vectorized.execution.enabled=true;
 -- HIVE-5292 Join on decimal columns fails
 create table src_dec_staging (key decimal(3,0), value string);
 load data local inpath '../../data/files/kv1.txt' into table src_dec_staging;
 create table src_dec (key decimal(3,0), value string) stored as orc;
 insert overwrite table src_dec select * from src_dec_staging;
 explain select * from src_dec a join src_dec b on a.key=b.key+450;
 select * from src_dec a join src_dec b on a.key=b.key+450;
 {code}
 Stack trace:
 {code}
 java.lang.Exception: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row 
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row 
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
   at java.lang.Thread.run(Thread.java:695)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 10 more
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to 
 org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate(DecimalColAddDecimalScalar.java:60)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.FuncDecimalToLong.evaluate(FuncDecimalToLong.java:51)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.SelectColumnIsNotNull.evaluate(SelectColumnIsNotNull.java:45)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:91)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 11 more
 {code}





[jira] [Resolved] (HIVE-7442) ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate throws ClassCastException: ...LongColumnVector cannot be cast to ...DecimalColumnVector

2014-09-10 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-7442.

Resolution: Fixed

 ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate throws 
 ClassCastException: ...LongColumnVector cannot be cast to 
 ...DecimalColumnVector
 --

 Key: HIVE-7442
 URL: https://issues.apache.org/jira/browse/HIVE-7442
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline

 Took decimal_join.q and converted it to read from ORC and turned on 
 vectorization:
 vector_decimal_join.q
 {code}
 SET hive.vectorized.execution.enabled=true;
 -- HIVE-5292 Join on decimal columns fails
 create table src_dec_staging (key decimal(3,0), value string);
 load data local inpath '../../data/files/kv1.txt' into table src_dec_staging;
 create table src_dec (key decimal(3,0), value string) stored as orc;
 insert overwrite table src_dec select * from src_dec_staging;
 explain select * from src_dec a join src_dec b on a.key=b.key+450;
 select * from src_dec a join src_dec b on a.key=b.key+450;
 {code}
 Stack trace:
 {code}
 java.lang.Exception: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row 
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row 
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
   at java.lang.Thread.run(Thread.java:695)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 10 more
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to 
 org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColAddDecimalScalar.evaluate(DecimalColAddDecimalScalar.java:60)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.FuncDecimalToLong.evaluate(FuncDecimalToLong.java:51)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.SelectColumnIsNotNull.evaluate(SelectColumnIsNotNull.java:45)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:91)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 11 more
 {code}





[jira] [Commented] (HIVE-5701) vectorized groupby should work with vectorized reduce sink

2014-09-10 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129372#comment-14129372
 ] 

Matt McCline commented on HIVE-5701:


Fixed with HIVE-7405.

 vectorized groupby should work with vectorized reduce sink
 --

 Key: HIVE-5701
 URL: https://issues.apache.org/jira/browse/HIVE-5701
 Project: Hive
  Issue Type: Improvement
  Components: Vectorization
Reporter: Sergey Shelukhin
Assignee: Matt McCline

 As far as I understand right now vectorized group by works with regular 
 reduce sink. [~jnp] fyi





[jira] [Resolved] (HIVE-5701) vectorized groupby should work with vectorized reduce sink

2014-09-10 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-5701.

Resolution: Fixed

 vectorized groupby should work with vectorized reduce sink
 --

 Key: HIVE-5701
 URL: https://issues.apache.org/jira/browse/HIVE-5701
 Project: Hive
  Issue Type: Improvement
  Components: Vectorization
Reporter: Sergey Shelukhin
Assignee: Matt McCline

 As far as I understand right now vectorized group by works with regular 
 reduce sink. [~jnp] fyi





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-09 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: In Progress  (was: Patch Available)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.
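
 A rough sketch of why the one-key-per-batch assumption makes aggregation 
 cheap: no per-row key lookup is needed, only a tight loop over the value 
 column. LongBatchSketch and SingleKeySum are hypothetical, simplified 
 stand-ins; Hive's VectorizedRowBatch and VectorGroupByOperator carry more 
 state than shown here.

```java
// Hypothetical, simplified batch holding one long column.
final class LongBatchSketch {
    final long[] vector;     // column values
    final boolean[] isNull;  // per-row null flags
    final int size;          // number of rows in this batch

    LongBatchSketch(long[] vector, boolean[] isNull) {
        if (vector.length != isNull.length) {
            throw new IllegalArgumentException("vector and isNull must have the same length");
        }
        this.vector = vector;
        this.isNull = isNull;
        this.size = vector.length;
    }
}

// Sums values for the current key; every row of a batch belongs to that key.
final class SingleKeySum {
    private long sum;
    private boolean sawValue;

    // No hash-table probe per row: the whole batch feeds one accumulator.
    void aggregateBatch(LongBatchSketch batch) {
        for (int i = 0; i < batch.size; i++) {
            if (!batch.isNull[i]) {
                sum += batch.vector[i];
                sawValue = true;
            }
        }
    }

    // Called when the key changes: emit the result and reset for the next key.
    Long finishKey() {
        Long result = sawValue ? Long.valueOf(sum) : null;
        sum = 0L;
        sawValue = false;
        return result;
    }
}
```

 Several batches for the same key can be fed in sequence before finishKey 
 is called; the accumulator is then reset for the next key.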





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-09 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Attachment: HIVE-7405.996.patch

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch, HIVE-7405.996.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-09 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: Patch Available  (was: In Progress)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch, HIVE-7405.996.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-08 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: In Progress  (was: Patch Available)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-08 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Attachment: HIVE-7405.995.patch

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-08 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: Patch Available  (was: In Progress)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-06 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: Patch Available  (was: In Progress)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-05 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: In Progress  (was: Patch Available)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-05 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Attachment: HIVE-7405.994.patch

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: Open  (was: Patch Available)

Submitted more than 12 hours ago and no result?

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, HIVE-7405.99.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Attachment: HIVE-7405.991.patch

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-04 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: Patch Available  (was: Open)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.





[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---
Attachment: HIVE-5760.91.patch

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
 HIVE-5760.91.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.





[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---
Status: Patch Available  (was: In Progress)

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
 HIVE-5760.91.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.





[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---
Status: In Progress  (was: Patch Available)

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
 HIVE-5760.91.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.





[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---
Attachment: HIVE-5760.92.patch

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
 HIVE-5760.91.patch, HIVE-5760.92.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.





[jira] [Updated] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-5760:
---
Status: Patch Available  (was: In Progress)

 Add vectorized support for CHAR/VARCHAR data types
 --

 Key: HIVE-5760
 URL: https://issues.apache.org/jira/browse/HIVE-5760
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Matt McCline
 Attachments: HIVE-5760.1.patch, HIVE-5760.2.patch, HIVE-5760.3.patch, 
 HIVE-5760.4.patch, HIVE-5760.5.patch, HIVE-5760.7.patch, HIVE-5760.8.patch, 
 HIVE-5760.91.patch, HIVE-5760.92.patch


 Add support to allow queries referencing VARCHAR columns and expression 
 results to run efficiently in vectorized mode. This should re-use the code 
 for the STRING type to the extent possible and beneficial. Include unit tests 
 and end-to-end tests. Consider re-using or extending existing end-to-end 
 tests for vectorized string operations.





[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Status: In Progress  (was: Patch Available)

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-03 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---
Attachment: HIVE-7405.98.patch

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.




