[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

Status: Open  (was: Patch Available)

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}
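
The exception above is raised when an assigner is built for a DATE output column backed by a long vector. As a rough illustration only (this is not the attached patch), the sketch below assumes DATE values are held in the long vector as days since 1970-01-01. `DateLongVectorSketch`, `LongVector`, `assignDate` and `readDate` are hypothetical names, and `java.time` is used purely for brevity.

```java
import java.time.LocalDate;

/** Hypothetical sketch, not Hive code: a DATE column backed by a long[] vector. */
final class DateLongVectorSketch {
    /** Stand-in for a long column vector; the name and layout are assumptions. */
    static final class LongVector {
        final long[] vector;
        LongVector(int size) { vector = new long[size]; }
    }

    /** Stores a date at the given row as the number of days since 1970-01-01. */
    static void assignDate(LongVector col, int row, LocalDate value) {
        col.vector[row] = value.toEpochDay();
    }

    /** Reads the row back as a date. */
    static LocalDate readDate(LongVector col, int row) {
        return LocalDate.ofEpochDay(col.vector[row]);
    }

    public static void main(String[] args) {
        LongVector col = new LongVector(2);
        assignDate(col, 0, LocalDate.of(1970, 1, 2));
        assignDate(col, 1, LocalDate.of(2014, 3, 28));
        System.out.println(col.vector[0]);     // prints 1
        System.out.println(readDate(col, 1));  // prints 2014-03-28
    }
}
```

Under that assumption, a join that outputs a DATE column only needs an assigner that writes the day count into the long vector, rather than rejecting the DATE category outright.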



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

Attachment: HIVE-6662.2.patch

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

Status: Patch Available  (was: Open)

Submitting same patch again for pre-commit tests.

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql & hive.metastore.try.direct.sql.ddl

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951483#comment-13951483
 ] 

Jitendra Nath Pandey commented on HIVE-6188:


+1

 Document hive.metastore.try.direct.sql & hive.metastore.try.direct.sql.ddl
 --

 Key: HIVE-6188
 URL: https://issues.apache.org/jira/browse/HIVE-6188
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lefty Leverenz
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6188.patch


 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl 
 configuration properties need to be documented in hive-default.xml.template 
 and the wiki.





[jira] [Created] (HIVE-6776) Vectorization: Optimize timestamp to string-timestamp comparison using long comparisons.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6776:
--

 Summary: Vectorization: Optimize timestamp to string-timestamp 
comparison using long comparisons. 
 Key: HIVE-6776
 URL: https://issues.apache.org/jira/browse/HIVE-6776
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey


The timestamp to string-timestamp comparison currently (HIVE-6752) happens 
using string-string comparison because timestamp is cast to string. This can be 
optimized by casting string-timestamps to timestamps and using long comparisons 
instead.
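
The proposed optimization can be sketched roughly as follows: parse the string-timestamp constant once, then compare plain longs for every row. This is only an illustration under the assumption that timestamps are encoded as nanoseconds since the epoch. `TimestampCompareSketch` and its methods are hypothetical names, not existing Hive APIs.

```java
import java.sql.Timestamp;

/** Hypothetical sketch, not Hive code: compare timestamps as longs instead of strings. */
final class TimestampCompareSketch {
    /** Encodes a timestamp as nanoseconds since the epoch (an assumed long encoding). */
    static long toNanos(Timestamp ts) {
        // getTime() already contains the millisecond part, so keep whole seconds only
        // and take the full fractional second from getNanos().
        long seconds = Math.floorDiv(ts.getTime(), 1000L);
        return seconds * 1_000_000_000L + ts.getNanos();
    }

    /** Parses a literal of the form yyyy-mm-dd hh:mm:ss[.f...] once, up front. */
    static long parseToNanos(String literal) {
        return toNanos(Timestamp.valueOf(literal));
    }

    /** Writes the indexes of rows with column[i] < bound into selected; returns the count. */
    static int selectLess(long[] column, int n, long bound, int[] selected) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (column[i] < bound) {
                selected[count++] = i;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long bound = parseToNanos("2014-03-28 12:00:00");
        long[] column = {
            parseToNanos("2014-03-28 11:59:59.999"),
            parseToNanos("2014-03-28 12:00:00.001"),
        };
        int[] selected = new int[column.length];
        int n = selectLess(column, column.length, bound, selected);
        System.out.println(n + " row(s) selected, first index " + selected[0]);
    }
}
```

The string constant is converted a single time, so the per-row work is one long comparison with no string allocation or character-by-character comparison.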





[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951562#comment-13951562
 ] 

Jitendra Nath Pandey commented on HIVE-6752:


Thanks for the review, Eric. I have filed the following JIRAs:
HIVE-6776: To optimize timestamp comparisons.
HIVE-6777: Comments and readability in VectorizedHashKeyWrapper.

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Created] (HIVE-6777) VectorizedHashKeyWrapper: comments and readability.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6777:
--

 Summary: VectorizedHashKeyWrapper: comments and readability.
 Key: HIVE-6777
 URL: https://issues.apache.org/jira/browse/HIVE-6777
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey


We should add more comments in VectorizedHashKeyWrapper, particularly to explain 
the logic behind using offsets for different datatypes.
We should also consider helper functions for the offset calculation.
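
To illustrate what such helper functions might look like, here is a minimal sketch. It assumes a layout in which keys of different types live in per-type arrays and a single key index runs over the longs first, then the doubles, then the byte arrays. That layout, the class `KeyOffsetSketch`, and the helpers `doubleSlot` and `bytesSlot` are assumptions made for the example and are not taken from the Hive source.

```java
/** Hypothetical sketch: mixed-type keys in per-type arrays, addressed by one key index. */
final class KeyOffsetSketch {
    private final long[] longValues;
    private final double[] doubleValues;
    private final byte[][] byteValues;

    KeyOffsetSketch(int longCount, int doubleCount, int bytesCount) {
        longValues = new long[longCount];
        doubleValues = new double[doubleCount];
        byteValues = new byte[bytesCount][];
    }

    // Assumed layout: key indexes cover longs first, then doubles, then byte arrays.
    // The helpers keep the offset arithmetic in one documented place.
    private int doubleSlot(int keyIndex) {
        return keyIndex - longValues.length;
    }

    private int bytesSlot(int keyIndex) {
        return keyIndex - longValues.length - doubleValues.length;
    }

    void setLong(int keyIndex, long v) { longValues[keyIndex] = v; }
    void setDouble(int keyIndex, double v) { doubleValues[doubleSlot(keyIndex)] = v; }
    void setBytes(int keyIndex, byte[] v) { byteValues[bytesSlot(keyIndex)] = v; }

    long getLong(int keyIndex) { return longValues[keyIndex]; }
    double getDouble(int keyIndex) { return doubleValues[doubleSlot(keyIndex)]; }
    byte[] getBytes(int keyIndex) { return byteValues[bytesSlot(keyIndex)]; }

    public static void main(String[] args) {
        // Two long keys (indexes 0-1), one double key (index 2), one bytes key (index 3).
        KeyOffsetSketch key = new KeyOffsetSketch(2, 1, 1);
        key.setLong(1, 42L);
        key.setDouble(2, 1.5);
        key.setBytes(3, new byte[] {7});
        System.out.println(key.getLong(1) + " " + key.getDouble(2) + " " + key.getBytes(3)[0]);
    }
}
```

With the arithmetic isolated in two small methods, the getters and setters no longer repeat the subtraction, and a comment on each helper can explain the layout once.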





[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951570#comment-13951570
 ] 

Jitendra Nath Pandey commented on HIVE-6752:


[~rhbutani] This bug affects hive-0.13 and must be fixed for decimal/date to 
work correctly in vectorization; therefore the patch should be ported to 
branch-0.13 as well.

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951571#comment-13951571
 ] 

Jitendra Nath Pandey commented on HIVE-6752:


I ran the tests locally for the latest patch, and tests pass.

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
 Release Note: I have committed this to trunk and branch-0.13
   Status: Resolved  (was: Patch Available)

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951765#comment-13951765
 ] 

Jitendra Nath Pandey commented on HIVE-6662:


The failed test is unrelated to the patch. I will commit it shortly.
[~rhbutani] This should be committed to branch-0.13 as well; otherwise vector 
joins on DATE columns will not work in hive-0.13.

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

Affects Version/s: 0.13.0

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

Fix Version/s: (was: 0.13.0)

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

I have committed this to trunk and branch-0.13. Thanks to [~gopalv] !

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Attachment: HIVE-6752.1.patch

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Status: Patch Available  (was: Open)

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948981#comment-13948981
 ] 

Jitendra Nath Pandey commented on HIVE-6752:


Review board entry: https://reviews.apache.org/r/19718/

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Status: Patch Available  (was: Open)

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Status: Open  (was: Patch Available)

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Attachment: HIVE-6752.2.patch

Updated patch addressing the comments. Also updated the test to make it pass 
with hadoop-2, because hadoop-2 produces results in a different order.

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-27 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950169#comment-13950169
 ] 

Jitendra Nath Pandey commented on HIVE-6662:


+1.

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Commented] (HIVE-6060) Define API for RecordUpdater and UpdateReader

2014-03-26 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948233#comment-13948233
 ] 

Jitendra Nath Pandey commented on HIVE-6060:


+1 for the latest patch.

 Define API for RecordUpdater and UpdateReader
 -

 Key: HIVE-6060
 URL: https://issues.apache.org/jira/browse/HIVE-6060
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, 
 HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, 
 acid-io.patch, h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, 
 h-6060.patch


 We need to define some new APIs for how Hive interacts with the file formats 
 since it needs to be much richer than the current RecordReader and 
 RecordWriter.





[jira] [Commented] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-26 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948248#comment-13948248
 ] 

Jitendra Nath Pandey commented on HIVE-6708:


+1 for patch 4.

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6708-1.patch, HIVE-6708-3.patch, HIVE-6708-4.patch, 
 HIVE-6708.2.patch


 1. ConstantVectorExpression vector should be updated for bytecolumnvectors 
 and decimalColumnVectors. The current code changes the reference to the 
 vector which might be shared across multiple columns
 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc 
 exprDesc) has a minor bug as to when to constant fold the expression.
 The following code should replace the corresponding piece of code in the 
 trunk.
 ..
 GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
 if (gudf instanceof GenericUDFOPNegative || gudf instanceof 
 GenericUDFOPPositive
 || castExpressionUdfs.contains(gudf.getClass())
 ... 
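
The first point of the description concerns shared references. The sketch below is a generic Java illustration of that hazard and is not the Hive code or the attached patch: when several columns point at one constant array, a write through one column is visible in the others, whereas per-row copies keep them independent. `ConstantCopySketch`, `BytesColumn`, `fillByReference` and `fillByCopy` are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical sketch, not Hive code: referencing vs. copying a constant value. */
final class ConstantCopySketch {
    /** Stand-in for a bytes column: one byte[] per row. */
    static final class BytesColumn {
        final byte[][] rows;
        BytesColumn(int size) { rows = new byte[size][]; }
    }

    /** Risky: every row of every column filled this way aliases the same array. */
    static void fillByReference(BytesColumn col, byte[] constant) {
        Arrays.fill(col.rows, constant);
    }

    /** Safer: each row gets its own copy, so later writes cannot leak across columns. */
    static void fillByCopy(BytesColumn col, byte[] constant) {
        for (int i = 0; i < col.rows.length; i++) {
            col.rows[i] = Arrays.copyOf(constant, constant.length);
        }
    }

    public static void main(String[] args) {
        byte[] constant = {1, 2};
        BytesColumn first = new BytesColumn(1);
        BytesColumn second = new BytesColumn(1);

        fillByReference(first, constant);
        fillByReference(second, constant);
        first.rows[0][0] = 9;                   // write through the first column
        System.out.println(second.rows[0][0]);  // prints 9: the second column changed too

        fillByCopy(first, constant);
        fillByCopy(second, constant);
        first.rows[0][0] = 5;
        System.out.println(second.rows[0][0]);  // prints 9: unaffected by the write of 5
    }
}
```

Copying costs an allocation per row in this naive form; the point of the sketch is only the difference in aliasing behaviour.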





[jira] [Commented] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-26 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948262#comment-13948262
 ] 

Jitendra Nath Pandey commented on HIVE-6708:


[~rhbutani] This bug is causing some correctness issues, therefore this fix 
should also go to hive-0.13.

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6708-1.patch, HIVE-6708-3.patch, HIVE-6708-4.patch, 
 HIVE-6708.2.patch


 1. ConstantVectorExpression vector should be updated for bytecolumnvectors 
 and decimalColumnVectors. The current code changes the reference to the 
 vector which might be shared across multiple columns
 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc 
 exprDesc) has a minor bug as to when to constant fold the expression.
 The following code should replace the corresponding piece of code in the 
 trunk.
 ..
 GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
 if (gudf instanceof GenericUDFOPNegative || gudf instanceof 
 GenericUDFOPPositive
 || castExpressionUdfs.contains(gudf.getClass())
 ... 





[jira] [Updated] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-26 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6708:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I have committed this to trunk and branch-0.13. Thanks to [~hari.s]!

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6708-1.patch, HIVE-6708-3.patch, HIVE-6708-4.patch, 
 HIVE-6708.2.patch


 1. ConstantVectorExpression vector should be updated for bytecolumnvectors 
 and decimalColumnVectors. The current code changes the reference to the 
 vector which might be shared across multiple columns
 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc 
 exprDesc) has a minor bug as to when to constant fold the expression.
 The following code should replace the corresponding piece of code in the 
 trunk.
 ..
 GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
 if (gudf instanceof GenericUDFOPNegative || gudf instanceof 
 GenericUDFOPPositive
 || castExpressionUdfs.contains(gudf.getClass())
 ... 





[jira] [Created] (HIVE-6762) VectorGroupBy operator should produce vectorized row batch and let a terminal operator convert to row mode.

2014-03-26 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6762:
--

 Summary: VectorGroupBy operator should produce vectorized row 
batch and let a terminal operator convert to row mode.
 Key: HIVE-6762
 URL: https://issues.apache.org/jira/browse/HIVE-6762
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey


VectorGroupBy operator should produce vectorized row batch and let a terminal 
operator convert to row mode.
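
As a rough picture of the idea, the sketch below keeps data in a column-major batch and converts to rows only in a final step. It is an illustration under assumed types, not Hive's operator API: `BatchToRowsSketch`, `Batch` and `toRows` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Hive code: operators pass batches, only the last step emits rows. */
final class BatchToRowsSketch {
    /** Stand-in for a vectorized row batch: cols[c][r] holds column c of row r. */
    static final class Batch {
        final long[][] cols;
        final int size; // number of valid rows
        Batch(long[][] cols, int size) {
            this.cols = cols;
            this.size = size;
        }
    }

    /** Terminal step: converts one column-major batch into row-major records. */
    static List<long[]> toRows(Batch batch) {
        List<long[]> rows = new ArrayList<>(batch.size);
        for (int r = 0; r < batch.size; r++) {
            long[] row = new long[batch.cols.length];
            for (int c = 0; c < batch.cols.length; c++) {
                row[c] = batch.cols[c][r];
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Pretend this batch is the output of a vectorized group-by: (key, count).
        Batch out = new Batch(new long[][] {{10, 20}, {3, 7}}, 2);
        for (long[] row : toRows(out)) {
            System.out.println(row[0] + " -> " + row[1]);
        }
    }
}
```

Keeping the conversion in one terminal step means any operator placed between the group-by and the sink can stay in vectorized mode.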





[jira] [Commented] (HIVE-6349) Column name map is broken

2014-03-25 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946850#comment-13946850
 ] 

Jitendra Nath Pandey commented on HIVE-6349:


[~rhbutani] This is a major bug and should be fixed in hive-0.13 as well.

 Column name map is broken 
 --

 Key: HIVE-6349
 URL: https://issues.apache.org/jira/browse/HIVE-6349
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6349.1.patch


 The following query results in an exception at run time in vector mode.
 {code}
 explain select n_name from supplier_orc s join ( select n_name, n_nationkey 
 from nation_orc n join region_orc r on n.n_regionkey = r.r_regionkey and 
 r.r_name = 'XYZ') n1 on s.s_nationkey = n1.n_nationkey;
 {code}
 Here n_name is a string and all other fields are int.
 The stack trace:
 {code}
 java.lang.RuntimeException: Hive Runtime Error while closing operators
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector cannot be cast to 
 org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:116)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:280)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:133)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:246)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:253)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:574)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:585)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:234)
   ... 8 more
 {code}





[jira] [Updated] (HIVE-6349) Column name map is broken

2014-03-25 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6349:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

I have committed this to trunk and branch-0.13

 Column name map is broken 
 --

 Key: HIVE-6349
 URL: https://issues.apache.org/jira/browse/HIVE-6349
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6349.1.patch




[jira] [Created] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-25 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6752:
--

 Summary: Vectorized Between and IN expressions don't work with 
decimal, date types.
 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken

2014-03-24 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945418#comment-13945418
 ] 

Jitendra Nath Pandey commented on HIVE-5817:


bq. It seems like I cannot make a repro for select... even though there are 
name collisions, the mappings are correct. Perhaps when someone finds a bug we 
can solve it.
The HIVE-6349 patch includes a test that reproduces the problem for vectorized 
select. The patch also fixes the issue.

 column name to index mapping in VectorizationContext is broken
 --

 Key: HIVE-5817
 URL: https://issues.apache.org/jira/browse/HIVE-5817
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.13.0
Reporter: Sergey Shelukhin
Assignee: Remus Rusanu
Priority: Critical
 Fix For: 0.13.0

 Attachments: HIVE-5817-uniquecols.broken.patch, 
 HIVE-5817.00-broken.patch, HIVE-5817.4.patch, HIVE-5817.5.patch, 
 HIVE-5817.6.patch


 Columns coming from different operators may have the same internal names 
 (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN 
 b ON ... JOIN x ON ...;}}  (distilled from a more complex query), which runs 
 ok w/o vectorization. With vectorization, it will run ok for most ca, but for 
 some ca it will fail (or can probably return incorrect results). That is 
 because when building column-to-VRG-index map in VectorizationContext, 
 internal column name for ca that the first map join operator adds to the 
 mapping may be the same as internal name for cb that the 2nd one tries to 
 add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to 
 output stuff, it retrieves wrong index from the map by name, and then wrong 
 vector from VRG.
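The collision described above can be modelled with an ordinary map. This is a simplified sketch, not Hive's VectorizationContext: the class `ColumnMapCollisionSketch`, its methods, and the index values 3 and 7 are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a column-name-to-batch-index map; not Hive's actual class.
final class ColumnMapCollisionSketch {

    private final Map<String, Integer> columnIndex = new HashMap<>();

    // Mirrors the described behaviour: an already-registered name is not overwritten.
    void addIfAbsent(String internalName, int index) {
        columnIndex.putIfAbsent(internalName, index);
    }

    int lookup(String internalName) {
        return columnIndex.get(internalName);
    }

    public static void main(String[] args) {
        ColumnMapCollisionSketch ctx = new ColumnMapCollisionSketch();
        ctx.addIfAbsent("_col0", 3); // first map join registers ca at index 3
        ctx.addIfAbsent("_col0", 7); // second map join reuses the name for cb; index 7 is dropped
        // The second operator now resolves "_col0" to ca's index instead of cb's.
        System.out.println(ctx.lookup("_col0"));
    }
}
```

Keying the map by internal name alone is what makes the lookup ambiguous; giving each operator its own map is one way to avoid that.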





[jira] [Created] (HIVE-6731) Non-staged mapjoin optimization doesn't work with vectorization.

2014-03-24 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6731:
--

 Summary: Non-staged mapjoin optimization doesn't work with 
vectorization.
 Key: HIVE-6731
 URL: https://issues.apache.org/jira/browse/HIVE-6731
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey


This issue relates to HIVE-6144. The non-staged map join optimization, enabled 
with hive.auto.convert.join.use.nonstaged=true, doesn't take effect when 
vectorization is turned on. The query works, but hashtables are still created 
in local mode in a separate stage.





[jira] [Commented] (HIVE-6735) Make scalable dynamic partitioning work in vectorized mode

2014-03-24 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946085#comment-13946085
 ] 

Jitendra Nath Pandey commented on HIVE-6735:


+1 LGTM

 Make scalable dynamic partitioning work in vectorized mode
 --

 Key: HIVE-6735
 URL: https://issues.apache.org/jira/browse/HIVE-6735
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0, 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6735.1.patch


 HIVE-6455 added support for scalable dynamic partitioning. This is a subtask 
 to make HIVE-6455 work with vectorized operators.





[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-23 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6222:
---

Fix Version/s: (was: 0.14.0)
   0.13.0

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch, HIVE-6222.5.patch


 Row-mode GBY becomes a pass-through when not enough aggregation occurs on 
 the map side, relying on the shuffle+reduce side to do the work. Have VGBY do 
 the same.
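A rough sketch of the pass-through idea, assuming a simple rule: give up map-side grouping once the ratio of distinct keys to rows seen exceeds a threshold. The class name, the threshold, and the check interval are all hypothetical and are not Hive's actual configuration or code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical map-side aggregator that stops grouping when grouping is not
// reducing the row count. Threshold and check interval are illustrative only.
final class AdaptiveGroupBySketch {
    private final Map<String, Long> counts = new HashMap<>();
    private final double maxDistinctRatio;
    private final int checkInterval;
    private long rowsSeen = 0;
    private boolean passThrough = false;

    AdaptiveGroupBySketch(double maxDistinctRatio, int checkInterval) {
        this.maxDistinctRatio = maxDistinctRatio;
        this.checkInterval = checkInterval;
    }

    // Returns true if the row was aggregated here, false if the caller
    // should forward it unaggregated to the shuffle/reduce side.
    boolean process(String key) {
        if (passThrough) {
            return false;
        }
        rowsSeen++;
        counts.merge(key, 1L, Long::sum);
        boolean timeToCheck = rowsSeen % checkInterval == 0;
        if (timeToCheck && (double) counts.size() / rowsSeen > maxDistinctRatio) {
            passThrough = true; // a real operator would flush `counts` downstream here
        }
        return true;
    }

    boolean isPassThrough() {
        return passThrough;
    }

    public static void main(String[] args) {
        AdaptiveGroupBySketch gby = new AdaptiveGroupBySketch(0.5, 4);
        for (String key : new String[] {"a", "b", "c", "d", "e"}) {
            System.out.println(key + " aggregated: " + gby.process(key));
        }
    }
}
```

Checking only every `checkInterval` rows keeps the test cheap and avoids deciding on a handful of early rows.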





[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-23 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944520#comment-13944520
 ] 

Jitendra Nath Pandey commented on HIVE-6222:


I have committed this to branch-0.13 as well.

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch, HIVE-6222.5.patch




[jira] [Updated] (HIVE-6349) Column name map is broken

2014-03-23 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6349:
---

Attachment: HIVE-6349.1.patch

The attached patch makes VectorSelectOperator implement 
VectorizedRegionContext, so VectorSelectOperator also gives out a 
vectorization context with an updated column map. However, VectorSelectOperator 
doesn't create a new row batch, so it reuses the Output Column Manager from 
its parent's vectorization context. This matters because it then doesn't have 
to allocate scratch columns.

 Column name map is broken 
 --

 Key: HIVE-6349
 URL: https://issues.apache.org/jira/browse/HIVE-6349
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6349.1.patch




[jira] [Updated] (HIVE-6349) Column name map is broken

2014-03-23 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6349:
---

Status: Patch Available  (was: Open)

 Column name map is broken 
 --

 Key: HIVE-6349
 URL: https://issues.apache.org/jira/browse/HIVE-6349
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6349.1.patch




[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-22 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944142#comment-13944142
 ] 

Jitendra Nath Pandey commented on HIVE-6222:


[~rhbutani] With too many distinct keys, map-side grouping can become too slow 
in vectorized mode under memory pressure. This affects hive-0.13 too, so we 
should port the fix to branch-0.13 as well.

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Fix For: 0.14.0

 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch, HIVE-6222.5.patch




[jira] [Commented] (HIVE-6060) Define API for RecordUpdater and UpdateReader

2014-03-21 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943341#comment-13943341
 ] 

Jitendra Nath Pandey commented on HIVE-6060:


OrcInputFormat#getRecordReader must check for vectorized mode before returning 
any reader. This patch appears to have moved the check further down, which 
introduces a path where a non-vectorized record reader is returned in 
vectorized mode. That would cause the query to fail.
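The ordering concern can be illustrated with a toy selector. Everything here is hypothetical: the enum, the `otherCondition` flag and both methods are invented for the sketch and do not mirror the real OrcInputFormat code or the patch.

```java
// Hypothetical reader selection; the enum values and flags are illustrative
// and do not mirror the real OrcInputFormat code.
final class ReaderSelectionSketch {
    enum ReaderKind { VECTORIZED, ROW_SPECIAL, ROW_PLAIN }

    // Problematic order: an earlier branch can return a row-mode reader
    // before the vectorized-mode check is ever reached.
    static ReaderKind chooseCheckLate(boolean vectorized, boolean otherCondition) {
        if (otherCondition) {
            return ReaderKind.ROW_SPECIAL;
        }
        return vectorized ? ReaderKind.VECTORIZED : ReaderKind.ROW_PLAIN;
    }

    // Safer order: the vectorized-mode check comes before any other return.
    static ReaderKind chooseCheckFirst(boolean vectorized, boolean otherCondition) {
        if (vectorized) {
            return ReaderKind.VECTORIZED;
        }
        return otherCondition ? ReaderKind.ROW_SPECIAL : ReaderKind.ROW_PLAIN;
    }

    public static void main(String[] args) {
        // In vectorized mode with the other condition set, only the second
        // ordering hands back a vectorized reader.
        System.out.println(chooseCheckLate(true, true));
        System.out.println(chooseCheckFirst(true, true));
    }
}
```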

 Define API for RecordUpdater and UpdateReader
 -

 Key: HIVE-6060
 URL: https://issues.apache.org/jira/browse/HIVE-6060
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, 
 HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, acid-io.patch, 
 h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch


 We need to define some new APIs for how Hive interacts with the file formats 
 since it needs to be much richer than the current RecordReader and 
 RecordWriter.





[jira] [Commented] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-21 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943523#comment-13943523
 ] 

Jitendra Nath Pandey commented on HIVE-6708:


+1

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6708-1.patch, HIVE-6708.2.patch




[jira] [Updated] (HIVE-6701) Analyze table compute statistics for decimal columns.

2014-03-20 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6701:
---

Attachment: HIVE-6701.1.patch

This patch only adds decimal support in the udf GenericUDAFComputeStats.
The changes to metastore and thrift APIs are still needed.

 Analyze table compute statistics for decimal columns.
 -

 Key: HIVE-6701
 URL: https://issues.apache.org/jira/browse/HIVE-6701
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6701.1.patch


 Analyze table should compute statistics for decimal columns as well.





[jira] [Created] (HIVE-6701) Analyze table compute statistics for decimal columns.

2014-03-19 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6701:
--

 Summary: Analyze table compute statistics for decimal columns.
 Key: HIVE-6701
 URL: https://issues.apache.org/jira/browse/HIVE-6701
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


Analyze table should compute statistics for decimal columns as well.





[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-18 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939608#comment-13939608
 ] 

Jitendra Nath Pandey commented on HIVE-6639:


I have committed this to trunk.

[~rhbutani] This bug affects hive-0.13 and causes queries that have partition 
columns but no filters to fail. It should be fixed in branch-0.13 as well.


 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 The vectorized plan generation finds the list of partitioning columns from 
 the pruned-partition-list using the table scan operator. In some cases the 
 list comes back as null. TPCDS query 27 can reproduce this issue if the 
 store_sales table is partitioned on ss_store_sk. The exception stack trace is:
 {code}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116)
   ... 42 more
 {code}





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

I have committed this to trunk and branch-0.13.

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch




[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Affects Version/s: 0.13.0

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch




[jira] [Updated] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6518:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

I have committed this to trunk. Thanks to Gopal!

[~rhbutani] This is an important fix to vector group by because the aggregates 
must flush more aggressively in case of GC. Therefore, I intend to commit it to 
branch-0.13 as well.

 Add a GC canary to the VectorGroupByOperator to flush whenever a GC is 
 triggered
 

 Key: HIVE-6518
 URL: https://issues.apache.org/jira/browse/HIVE-6518
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, 
 HIVE-6518.2.patch, HIVE-6518.3.patch


 The current VectorGroupByOperator implementation flushes the in-memory hashes 
 when the maximum number of entries or the memory fraction is hit.
 This works for most cases, but there are some corner cases where we hit GC 
 overhead limits or heap size limits before either of those conditions is 
 reached, due to the rest of the pipeline.
 This patch adds a SoftReference as a GC canary. If the soft reference is 
 dead, then a full GC pass happened sometime in the near past and the 
 aggregation hashtables should be flushed immediately before another full GC 
 is triggered.
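The canary technique can be sketched in a few lines of plain Java. This is an illustration of the idea only, not the code from the patch: the class `GcCanarySketch` and its methods are hypothetical, and `simulateClear` exists purely so the behaviour can be shown without forcing a real collection.

```java
import java.lang.ref.SoftReference;

// Hypothetical GC canary: a softly reachable object whose clearing hints that
// the JVM recently collected under memory pressure.
final class GcCanarySketch {
    private SoftReference<Object> canary = new SoftReference<>(new Object());

    // True if the collector has cleared the canary since the last reset.
    boolean gcHappened() {
        return canary.get() == null;
    }

    // Re-arm the canary, typically right after flushing.
    void reset() {
        canary = new SoftReference<>(new Object());
    }

    // Demo/test hook: clear the reference the way the collector would.
    void simulateClear() {
        canary.clear();
    }

    public static void main(String[] args) {
        GcCanarySketch canary = new GcCanarySketch();
        canary.simulateClear(); // pretend a full GC cleared the soft reference
        if (canary.gcHappened()) {
            System.out.println("flush aggregation hash tables");
            canary.reset();
        }
    }
}
```

An operator would poll `gcHappened()` alongside its entry-count and memory-fraction checks. The JVM decides when soft references are cleared, so the signal is a heuristic and not an exact GC notification.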





[jira] [Commented] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938186#comment-13938186
 ] 

Jitendra Nath Pandey commented on HIVE-6518:


Committed to branch-0.13 as well. 

 Add a GC canary to the VectorGroupByOperator to flush whenever a GC is 
 triggered
 

 Key: HIVE-6518
 URL: https://issues.apache.org/jira/browse/HIVE-6518
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, 
 HIVE-6518.2.patch, HIVE-6518.3.patch




[jira] [Updated] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6518:
---

Fix Version/s: (was: 0.14.0)
   0.13.0

 Add a GC canary to the VectorGroupByOperator to flush whenever a GC is 
 triggered
 

 Key: HIVE-6518
 URL: https://issues.apache.org/jira/browse/HIVE-6518
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, 
 HIVE-6518.2.patch, HIVE-6518.3.patch


 The current VectorGroupByOperator implementation flushes the in-memory hashes 
 when the maximum entries or fraction of memory is hit.
 This works for most cases, but there are some corner cases where we hit GC 
 overhead limits or heap size limits before either of those conditions is 
 reached due to the rest of the pipeline.
 This patch adds a SoftReference as a GC canary. If the soft reference is 
 dead, then a full GC pass happened sometime in the recent past and the 
 aggregation hashtables should be flushed immediately before another full GC 
 is triggered.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938194#comment-13938194
 ] 

Jitendra Nath Pandey commented on HIVE-6680:


Ran tests locally. The only failures were show_create_table_serde.q and 
metadata_only_queries_with_filters.q, which are unrelated to this patch.

 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
 --

 Key: HIVE-6680
 URL: https://issues.apache.org/jira/browse/HIVE-6680
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch


 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
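
 The description is terse, so the following sketch illustrates what adjusting 
 the unscaled value means. It uses java.math.BigDecimal as a stand-in rather 
 than Hive's Decimal128, and the names ScaleDemo, relabelScale and rescale are 
 hypothetical. A decimal is stored as an unscaled integer plus a scale; 
 changing only the scale changes the numeric value, whereas a correct rescale 
 also adjusts the unscaled integer so that the value is preserved.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Illustration only; BigDecimal stands in for Decimal128. */
final class ScaleDemo {
    /** Incorrect: keeps the unscaled digits and merely relabels the scale. */
    static BigDecimal relabelScale(BigDecimal v, int newScale) {
        return new BigDecimal(v.unscaledValue(), newScale);
    }

    /** Correct: adjusts the unscaled value so the numeric value is preserved. */
    static BigDecimal rescale(BigDecimal v, int newScale) {
        return v.setScale(newScale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal v = new BigDecimal("1.5"); // unscaled 15, scale 1
        System.out.println("relabelled: " + relabelScale(v, 3));
        System.out.println("rescaled:   " + rescale(v, 3));
    }
}
```

 With 1.5 (unscaled value 15, scale 1) and a target scale of 3, relabelling 
 leaves the unscaled value at 15 and so denotes 0.015, while rescaling yields 
 an unscaled value of 1500 and still denotes 1.5.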



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938196#comment-13938196
 ] 

Jitendra Nath Pandey commented on HIVE-6664:


Ran tests locally. The only failures were show_create_table_serde.q and 
metadata_only_queries_with_filters.q, which are unrelated to this patch.

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales;
 The reason for the difference is that row mode converts each decimal value to 
 double up front before summing the values when computing variance, whereas 
 vector mode performs the local aggregate sum as decimal and converts it to double 
 only at flush.
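
 The effect of the conversion order can be reproduced outside Hive. The 
 sketch below is an illustration under assumptions, not Hive code: it uses 
 java.math.BigDecimal in place of Hive's decimal type, and SumOrderDemo, 
 sumAsDouble, sumAsDecimal and repeat are hypothetical names. One method 
 converts every value to double before adding (as row mode is described to 
 do); the other adds exactly as decimal and converts once at the end (as 
 vector mode is described to do).

```java
import java.math.BigDecimal;
import java.util.Arrays;

/** Illustration only; not taken from the HIVE-6664 patch. */
final class SumOrderDemo {
    /** Converts each value to double first, then sums in floating point. */
    static double sumAsDouble(BigDecimal[] values) {
        double sum = 0.0;
        for (BigDecimal v : values) {
            sum += v.doubleValue();
        }
        return sum;
    }

    /** Sums exactly as decimal and converts to double only once, at the end. */
    static double sumAsDecimal(BigDecimal[] values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);
        }
        return sum.doubleValue();
    }

    /** Builds an array holding n copies of the given decimal literal. */
    static BigDecimal[] repeat(String literal, int n) {
        BigDecimal[] values = new BigDecimal[n];
        Arrays.fill(values, new BigDecimal(literal));
        return values;
    }

    public static void main(String[] args) {
        BigDecimal[] values = repeat("0.1", 10);
        System.out.println("double first:  " + sumAsDouble(values));
        System.out.println("decimal first: " + sumAsDecimal(values));
    }
}
```

 For ten copies of 0.1 the decimal sum is exactly 1.0, while the 
 floating-point sum accumulates rounding error and is not equal to 1.0. Any 
 variance computed from such sums can therefore differ between the two modes.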



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938195#comment-13938195
 ] 

Jitendra Nath Pandey commented on HIVE-6639:


Ran tests locally. The only failures were show_create_table_serde.q and 
metadata_only_queries_with_filters.q, which are unrelated to this patch.

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 The vectorized plan generation obtains the list of partitioning columns from 
 the pruned-partition-list via the table scan operator. In some cases the list 
 is null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk. The exception stack trace is:
 {code}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.<init>(VectorMapJoinOperator.java:116)
   ... 42 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938193#comment-13938193
 ] 

Jitendra Nath Pandey commented on HIVE-6649:


Ran tests locally. The only failures were show_create_table_serde.q and 
metadata_only_queries_with_filters.q, which are unrelated to this patch.

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, 
 HIVE-6649.2.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938218#comment-13938218
 ] 

Jitendra Nath Pandey commented on HIVE-6649:


I have committed this to trunk.

[~rhbutani] This bug affects hive-0.13 and causes several vectorized queries on 
DATE columns to fail. It should be fixed in branch-0.13 as well.

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, 
 HIVE-6649.2.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938407#comment-13938407
 ] 

Jitendra Nath Pandey commented on HIVE-6664:


I have committed this to trunk.

[~rhbutani] This bug affects hive-0.13 and produces results that differ from 
row-mode execution. It should be fixed in branch-0.13 as well.


 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales;
 The reason for the difference is that row mode converts each decimal value to 
 double up front before summing the values when computing variance, whereas 
 vector mode performs the local aggregate sum as decimal and converts it to double 
 only at flush.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938411#comment-13938411
 ] 

Jitendra Nath Pandey commented on HIVE-6680:


Committed to trunk. It is a correctness bug affecting hive-0.13, so I 
will port it to branch-0.13 as well.

 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
 --

 Key: HIVE-6680
 URL: https://issues.apache.org/jira/browse/HIVE-6680
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch


 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, 
 HIVE-6649.2.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales;
 The reason for the difference is that row mode converts each decimal value to 
 double up front before summing the values when computing variance, whereas 
 vector mode performs the local aggregate sum as decimal and converts it to double 
 only at flush.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.

2014-03-17 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6680:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
 --

 Key: HIVE-6680
 URL: https://issues.apache.org/jira/browse/HIVE-6680
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch


 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6680:
--

 Summary: Decimal128#update(Decimal128 o, short scale) should 
adjust the unscaled value.
 Key: HIVE-6680
 URL: https://issues.apache.org/jira/browse/HIVE-6680
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6680:
---

Attachment: HIVE-6680.1.patch

Attached patch fixes the issue.

 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
 --

 Key: HIVE-6680
 URL: https://issues.apache.org/jira/browse/HIVE-6680
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6680.1.patch


 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Open  (was: Patch Available)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Attachment: HIVE-6649.2.patch

Uploading same patch to trigger pre-commit build.

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Patch Available  (was: Open)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Attachment: HIVE-6664.1.patch

Uploading same patch to trigger pre-commit build.

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.
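
The effect of conversion order described above can be reproduced outside Hive. The following is a minimal sketch using java.math.BigDecimal, not the Hive aggregation code; the class name, the method names, and the sample values are all hypothetical. It only compares the two ways of forming the sum and does not compute a variance.

```java
import java.math.BigDecimal;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; none of these names come from Hive.
final class DecimalSumOrder {

    // Row-mode style: convert each decimal to double first, then add doubles.
    static double sumAfterConvertingEach(List<BigDecimal> values) {
        double sum = 0.0;
        for (BigDecimal v : values) {
            sum += v.doubleValue();
        }
        return sum;
    }

    // Vector-mode style: add exactly as decimals, convert once at the end.
    static double sumThenConvertOnce(List<BigDecimal> values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);
        }
        return sum.doubleValue();
    }

    public static void main(String[] args) {
        // Ten copies of 0.1, which has no exact binary representation.
        List<BigDecimal> prices = Collections.nCopies(10, new BigDecimal("0.1"));
        double rowStyle = sumAfterConvertingEach(prices);
        double vectorStyle = sumThenConvertOnce(prices);
        System.out.println("convert each, then sum: " + rowStyle);
        System.out.println("sum, then convert once: " + vectorStyle);
        System.out.println("identical: " + (rowStyle == vectorStyle));
    }
}
```

Any aggregate derived from the sum can inherit this kind of rounding difference, so the two modes need to convert at the same point to return matching results.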





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Status: Open  (was: Patch Available)

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Status: Patch Available  (was: Open)

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Attachment: HIVE-6639.5.patch

Uploading same patch to trigger pre-commit build.

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch


 Vectorized plan generation obtains the list of partitioning columns from the
 pruned-partition-list via the table scan operator. In some cases that list
 is null. TPC-DS query 27 reproduces the issue when the store_sales table is
 partitioned on ss_store_sk.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Open  (was: Patch Available)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch


 Vectorized plan generation obtains the list of partitioning columns from the
 pruned-partition-list via the table scan operator. In some cases that list
 is null. TPC-DS query 27 reproduces the issue when the store_sales table is
 partitioned on ss_store_sk.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Attachment: HIVE-6639.6.patch

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 Vectorized plan generation obtains the list of partitioning columns from the
 pruned-partition-list via the table scan operator. In some cases that list
 is null. TPC-DS query 27 reproduces the issue when the store_sales table is
 partitioned on ss_store_sk.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Patch Available  (was: Open)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 Vectorized plan generation obtains the list of partitioning columns from the
 pruned-partition-list via the table scan operator. In some cases that list
 is null. TPC-DS query 27 reproduces the issue when the store_sales table is
 partitioned on ss_store_sk.





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Status: Open  (was: Patch Available)

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.





[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Patch Available  (was: Open)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, 
 HIVE-6649.2.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2))
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}





[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6680:
---

Status: Patch Available  (was: Open)

 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
 --

 Key: HIVE-6680
 URL: https://issues.apache.org/jira/browse/HIVE-6680
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch


 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
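
Why the unscaled value has to move with the scale can be shown with java.math.BigDecimal, which also stores a number as an unscaled integer plus a scale. This is an analogy only and not the Decimal128 implementation; the class and method names are hypothetical, and the HALF_UP rounding mode is an assumption made for the sketch.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustration with java.math.BigDecimal; this is not Hive's Decimal128 code.
final class RescaleSketch {

    // Problem pattern: reuse the unscaled digits unchanged under a new scale.
    static BigDecimal copyWithoutAdjusting(BigDecimal source, int newScale) {
        return new BigDecimal(source.unscaledValue(), newScale);
    }

    // Intended pattern: rescale the unscaled digits so the value is preserved.
    static BigDecimal copyAndAdjust(BigDecimal source, int newScale) {
        return source.setScale(newScale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal source = new BigDecimal("123.45"); // unscaled 12345, scale 2
        System.out.println(copyWithoutAdjusting(source, 4)); // 1.2345
        System.out.println(copyAndAdjust(source, 4));        // 123.4500
    }
}
```

Copying the digits without adjusting them changes the numeric value by a power of ten, which is the kind of error the issue title points at.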





[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6680:
---

Attachment: HIVE-6680.1.patch

 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
 --

 Key: HIVE-6680
 URL: https://issues.apache.org/jira/browse/HIVE-6680
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch


 Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Description: 
Vectorized plan generation obtains the list of partitioning columns from the
pruned-partition-list via the table scan operator. In some cases that list
is null. TPC-DS query 27 reproduces the issue when the store_sales table is
partitioned on ss_store_sk. The exception stack trace is:

{code}
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166)
  at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240)
  at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287)
  at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267)
  at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255)
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.<init>(VectorMapJoinOperator.java:116)
  ... 42 more

{code}


  was: Vectorized plan generation obtains the list of partitioning columns from
the pruned-partition-list via the table scan operator. In some cases that list
is null. TPC-DS query 27 reproduces the issue when the store_sales table is
partitioned on ss_store_sk.


 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 Vectorized plan generation obtains the list of partitioning columns from the
 pruned-partition-list via the table scan operator. In some cases that list
 is null. TPC-DS query 27 reproduces the issue when the store_sales table is
 partitioned on ss_store_sk. The exception stack trace is:
 {code}
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255)
   at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.<init>(VectorMapJoinOperator.java:116)
   ... 42 more
 {code}
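
One reading of the trace above, offered as an assumption rather than a confirmed diagnosis, is that a column-name-to-index lookup finds no entry for a partition column because the partition column list was null when the map was built. The sketch below reproduces that failure mode in isolation. Every name in it (ColumnIndexSketch, buildColumnMap, lookupUnchecked, lookupChecked) is hypothetical and none of it is taken from VectorizationContext.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is not Hive's VectorizationContext.
final class ColumnIndexSketch {

    // Assigns an index to every table column, then to every partition column.
    // A null partition list is treated as empty, so those columns get no index.
    static Map<String, Integer> buildColumnMap(List<String> tableColumns,
                                               List<String> partitionColumns) {
        Map<String, Integer> columnMap = new HashMap<String, Integer>();
        int next = 0;
        for (String name : tableColumns) {
            columnMap.put(name, next++);
        }
        List<String> partitions = (partitionColumns == null)
                ? Collections.<String>emptyList() : partitionColumns;
        for (String name : partitions) {
            columnMap.put(name, next++);
        }
        return columnMap;
    }

    // Unboxing the null returned for a missing entry throws NullPointerException.
    static int lookupUnchecked(Map<String, Integer> columnMap, String name) {
        return columnMap.get(name);
    }

    // Checked lookup: fails with a message naming the missing column.
    static int lookupChecked(Map<String, Integer> columnMap, String name) {
        Integer index = columnMap.get(name);
        if (index == null) {
            throw new IllegalStateException("No input column index for: " + name);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> tableColumns = Arrays.asList("ss_item_sk", "ss_sales_price");
        // Partition column list missing, as in the reported scenario.
        Map<String, Integer> columnMap = buildColumnMap(tableColumns, null);
        try {
            lookupUnchecked(columnMap, "ss_store_sk");
        } catch (NullPointerException e) {
            System.out.println("unchecked lookup: NullPointerException");
        }
        try {
            lookupChecked(columnMap, "ss_store_sk");
        } catch (IllegalStateException e) {
            System.out.println("checked lookup: " + e.getMessage());
        }
    }
}
```

The checked variant does not fix a missing partition list, but it turns a bare NullPointerException into an error that names the column.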





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Status: Patch Available  (was: Open)

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.





[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Open  (was: Patch Available)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, 
 HIVE-6649.2.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2))
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Attachment: HIVE-6664.1.patch

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.





[jira] [Created] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6664:
--

 Summary: Vectorized variance computation differs from row mode 
computation.
 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


The following query shows the difference:
select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price),
var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price),
stddev_samp(ss_sales_price) from store_sales

The difference arises because row mode converts each decimal value to double
up front before summing, whereas vector mode keeps the local aggregate sum as
a decimal and converts it to double only at flush.





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Attachment: HIVE-6664.1.patch

Attached patch fixes the issue.

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 The following query shows the difference:
 select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price),
 var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because row mode converts each decimal value to
 double up front before summing, whereas vector mode keeps the local
 aggregate sum as a decimal and converts it to double only at flush.





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Description: 
The following query shows the difference:
select var_samp(ss_sales_price), var_pop(ss_sales_price),
stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales

The difference arises because, when computing variance, row mode converts
each decimal value to double up front before summing, whereas vector mode
keeps the local aggregate sum as a decimal and converts it to double only
at flush.

  was:
The following query shows the difference:
select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price),
var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price),
stddev_samp(ss_sales_price) from store_sales

The difference arises because row mode converts each decimal value to double
up front before summing, whereas vector mode keeps the local aggregate sum as
a decimal and converts it to double only at flush.


 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.





[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6664:
---

Status: Patch Available  (was: Open)

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Open  (was: Patch Available)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch


 Vectorized plan generation obtains the list of partitioning columns from the
 pruned-partition-list via the table scan operator. In some cases that list
 is null.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Description: Vectorized plan generation obtains the list of partitioning
columns from the pruned-partition-list via the table scan operator. In some
cases that list is null. TPC-DS query 27 reproduces the issue when the
store_sales table is partitioned on ss_store_sk.  (was: Vectorized plan
generation obtains the list of partitioning columns from the
pruned-partition-list via the table scan operator. In some cases that list is
null.)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch


 Vectorized plan generation obtains the list of partitioning columns from the
 pruned-partition-list via the table scan operator. In some cases that list
 is null. TPC-DS query 27 reproduces the issue when the store_sales table is
 partitioned on ss_store_sk.





[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934793#comment-13934793
 ] 

Jitendra Nath Pandey commented on HIVE-6664:


Review board : https://reviews.apache.org/r/19216/

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 The following query shows the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price),
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales
 The difference arises because, when computing variance, row mode converts
 each decimal value to double up front before summing, whereas vector mode
 keeps the local aggregate sum as a decimal and converts it to double only
 at flush.





[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Open  (was: Patch Available)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch


 Query run with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}
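
As a reference point for checking a fix, here is a minimal Python sketch of what the date arithmetic in the query above should evaluate to. The helpers are hypothetical stand-ins, not Hive code. They assume Hive's documented semantics, where datediff(enddate, startdate) returns enddate minus startdate in days, and the sample date is arbitrary.

```python
from datetime import date, timedelta


def date_add(d, days):
    """Stand-in for Hive's date_add: the date `days` days after d."""
    return d + timedelta(days=days)


def date_sub(d, days):
    """Stand-in for Hive's date_sub: the date `days` days before d."""
    return d - timedelta(days=days)


def datediff(end, start):
    """Stand-in for Hive's datediff: number of days from start to end."""
    return (end - start).days


dt = date(2014, 3, 13)  # arbitrary sample value for the dt column

print(datediff(dt, date_add(dt, 2)))
print(datediff(dt, date_sub(dt, 2)))
print(datediff(date_add(dt, 2), date_sub(dt, 2)))
```

Under these assumptions the three datediff columns are -2, 2 and 4 for any non-null dt, so a vectorized run should match those values as well as the row-mode output.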





[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Attachment: HIVE-6649.2.patch

Review board: https://reviews.apache.org/r/19218/

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch




[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Patch Available  (was: Open)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch




[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Patch Available  (was: Open)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch




[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Attachment: HIVE-6639.5.patch

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch




[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934805#comment-13934805
 ] 

Jitendra Nath Pandey commented on HIVE-6639:


Review board: https://reviews.apache.org/r/19219/

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch




[jira] [Commented] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-14 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935401#comment-13935401
 ] 

Jitendra Nath Pandey commented on HIVE-6662:


Please use DateWritable#getDays; the date representation is the number of days 
since the epoch.
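
To illustrate the representation mentioned in this comment, here is a minimal Python sketch of a date stored as a count of days since the epoch. It is not Hive code. The helper names are hypothetical, and it assumes the epoch is 1970-01-01 (the Unix epoch).

```python
from datetime import date, timedelta

# Assumption: "epoch" means the Unix epoch, 1970-01-01.
EPOCH = date(1970, 1, 1)


def to_epoch_days(d):
    """Encode a calendar date as the number of days since the epoch."""
    return (d - EPOCH).days


def from_epoch_days(days):
    """Decode a days-since-epoch count back into a calendar date."""
    return EPOCH + timedelta(days=days)


days = to_epoch_days(date(2014, 3, 14))
print(days)                   # 16143
print(from_epoch_days(days))  # 2014-03-14
```

A value in this form fits in a long column, which is why the day count, rather than a date object, is what needs to be assigned into the vector.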

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-6662.1.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 Long vector column and primitive category DATE
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Created] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-13 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6649:
--

 Summary: Vectorization: some date expressions throw exception.
 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


Query run with hive.vectorized.execution.enabled=true:
{code}
select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
   datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
   datediff(date_add(dt, 2), date_sub(dt, 2))
from vectortab10korc limit 1;
{code}
fails with the following error:
{noformat}
Error: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
datediff(date_add(dt, 2), date_sub(dt, 2))
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
... 9 more
Caused by: java.lang.NullPointerException
at java.lang.String.checkBounds(String.java:400)
at java.lang.String.<init>(String.java:569)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
... 13 more
{noformat}





[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Attachment: HIVE-6649.1.patch

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch




[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Patch Available  (was: Open)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch




[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6649:
---

Status: Open  (was: Patch Available)

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch




[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Patch Available  (was: Open)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch


 Vectorization: Partition column names are not picked up, causing an NPE.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Open  (was: Patch Available)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch


 Vectorization: Partition column names are not picked up causing an NPE.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Status: Patch Available  (was: Open)

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch


 Vectorization: Partition column names are not picked up causing an NPE.





[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Attachment: HIVE-6639.3.patch

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch


 Vectorization: Partition column names are not picked up causing an NPE.




