[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6662:
---
    Status: Open (was: Patch Available)

Vector Join operations with DATE columns fail
---
    Key: HIVE-6662
    URL: https://issues.apache.org/jira/browse/HIVE-6662
    Project: Hive
    Issue Type: Bug
    Reporter: Gopal V
    Assignee: Gopal V
    Fix For: 0.13.0
    Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch

Trying to generate a DATE column as part of a JOIN's output throws an exception

{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE
    at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
    at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
{code}

--
This message was sent by Atlassian JIRA (v6.2#6252)
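The exception text suggests that the code building per-column assigners for a long-backed vector has no branch for the DATE category. The following is only a rough sketch of that kind of dispatch, not the actual VectorColumnAssignFactory code or the attached patch. All names in it (DateAssignSketch, Category, LongAssigner, buildAssigner) are hypothetical, and it assumes a DATE value can be kept in a long column as days since the epoch. It uses java.time.LocalDate purely to keep the example short.

```java
import java.time.LocalDate;

// Illustrative only: every name here is hypothetical, none of this is Hive code.
final class DateAssignSketch {

    enum Category { INT, LONG, DATE }

    // Writes one row value into a long-backed column.
    interface LongAssigner {
        void assign(long[] column, int row, Object value);
    }

    // Picks an assigner per category. Without the DATE branch, a DATE column
    // would fall through to the "incompatible" error in the default case.
    static LongAssigner buildAssigner(Category category) {
        switch (category) {
            case INT:
                return (column, row, value) -> column[row] = (Integer) value;
            case LONG:
                return (column, row, value) -> column[row] = (Long) value;
            case DATE:
                // Assumption: a date is stored as days since 1970-01-01.
                return (column, row, value) -> column[row] = ((LocalDate) value).toEpochDay();
            default:
                throw new IllegalArgumentException(
                    "Incompatible long column and category " + category);
        }
    }

    public static void main(String[] args) {
        long[] column = new long[1];
        LocalDate date = LocalDate.of(2014, 3, 28);
        buildAssigner(Category.DATE).assign(column, 0, date);
        // Reading the long back as a day count recovers the original date.
        System.out.println(LocalDate.ofEpochDay(column[0]));
    }
}
```

The point of the sketch is only that the long column itself is a workable container for dates once the assigner knows how to convert them.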
[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6662: --- Attachment: HIVE-6662.2.patch Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Fix For: 0.13.0 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6662: --- Status: Patch Available (was: Open) Submitting same patch again for pre-commit tests. Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Fix For: 0.13.0 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl
[ https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951483#comment-13951483 ] Jitendra Nath Pandey commented on HIVE-6188: +1 Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl -- Key: HIVE-6188 URL: https://issues.apache.org/jira/browse/HIVE-6188 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Lefty Leverenz Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6188.patch The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl configuration properties need to be documented in hive-default.xml.template and the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6776) Vectorization: Optimize timestamp to string-timestamp comparision using long comparisons.
Jitendra Nath Pandey created HIVE-6776: -- Summary: Vectorization: Optimize timestamp to string-timestamp comparision using long comparisons. Key: HIVE-6776 URL: https://issues.apache.org/jira/browse/HIVE-6776 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey The timestamp to string-timestamp comparison currently (HIVE-6752) happens using string-string comparison because timestamp is cast to string. This can be optimized by casting string-timestamps to timestamps and using long comparisons instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
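To make the proposed optimization concrete, here is a minimal sketch of comparing a timestamp column against a string literal by parsing the literal once and then comparing longs. It is not Hive code. The class and method names are hypothetical, and it assumes each timestamp is held as a single long of nanoseconds since the epoch.

```java
import java.sql.Timestamp;

// Illustrative only: these names are hypothetical and are not Hive classes.
final class TimestampCompareSketch {

    // Nanoseconds since the epoch for a java.sql.Timestamp.
    static long toEpochNanos(Timestamp ts) {
        // getTime() is in milliseconds; floorDiv keeps pre-epoch values correct.
        long wholeSeconds = Math.floorDiv(ts.getTime(), 1000L);
        // getNanos() is the full fractional second, so add it to whole seconds.
        return wholeSeconds * 1_000_000_000L + ts.getNanos();
    }

    // Parses the string literal once, outside the per-row loop.
    static long parseLiteral(String literal) {
        return toEpochNanos(Timestamp.valueOf(literal));
    }

    // Counts rows equal to the literal using only long comparisons.
    static int countEqual(long[] columnNanos, String literal) {
        long target = parseLiteral(literal);
        int matches = 0;
        for (long value : columnNanos) {
            if (value == target) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        long[] column = {
            parseLiteral("2014-03-28 10:00:00"),
            parseLiteral("2014-03-28 10:00:00.5")
        };
        // The literal below has a trailing ".0" that the column values lack.
        System.out.println(countEqual(column, "2014-03-28 10:00:00.0"));
    }
}
```

In this sketch the literal "2014-03-28 10:00:00.0" matches the first row even though the two strings differ character by character. Working on parsed values avoids that kind of formatting sensitivity and keeps the per-row work to one long comparison.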
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951562#comment-13951562 ] Jitendra Nath Pandey commented on HIVE-6752: Thanks for the review Eric, I have filed following jiras: HIVE-6776 : To optimize timestamp comparisons. HIVE-6777: Comments and readability in VectorizedHashKeyWrapper. Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6777) VectorizedHashKeyWrapper: comments and readability.
Jitendra Nath Pandey created HIVE-6777: -- Summary: VectorizedHashKeyWrapper: comments and readability. Key: HIVE-6777 URL: https://issues.apache.org/jira/browse/HIVE-6777 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey We should add more comments in VectorizedHashKeyWrapper particularly to explain the logic behind using offsets for different datatypes. Also consider helper functions for offset calculation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951570#comment-13951570 ] Jitendra Nath Pandey commented on HIVE-6752: [~rhbutani] This bug affects hive-0.13 and is important for decimal/date to work correctly in vectorization, therefore the patch should be ported to branch-0.13 as well. Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951571#comment-13951571 ] Jitendra Nath Pandey commented on HIVE-6752: I ran the tests locally for the latest patch, and tests pass. Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Resolution: Fixed Fix Version/s: 0.13.0 Release Note: I have committed this to trunk and branch-0.13 Status: Resolved (was: Patch Available) Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951765#comment-13951765 ] Jitendra Nath Pandey commented on HIVE-6662: The failed test has no relation to the patch. I will commit it shortly. [~rhbutani] This should be committed to branch-0.13 as well, otherwise vector join on DATE column in hive-0.13 will not work. Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Fix For: 0.13.0 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6662: --- Affects Version/s: 0.13.0 Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6662: --- Fix Version/s: (was: 0.13.0) Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6662: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) I have committed this to trunk and branch-0.13. Thanks to [~gopalv] ! Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Fix For: 0.13.0 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Attachment: HIVE-6752.1.patch Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Status: Patch Available (was: Open) Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948981#comment-13948981 ] Jitendra Nath Pandey commented on HIVE-6752: Review board entry: https://reviews.apache.org/r/19718/ Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Status: Patch Available (was: Open) Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Status: Open (was: Patch Available) Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6752:
---
    Attachment: HIVE-6752.2.patch

Updated patch addressing the comments. Also updated the test to make it pass with hadoop-2, because hadoop-2 produces results in a different order.

Vectorized Between and IN expressions don't work with decimal, date types.
---
    Key: HIVE-6752
    URL: https://issues.apache.org/jira/browse/HIVE-6752
    Project: Hive
    Issue Type: Bug
    Affects Versions: 0.13.0
    Reporter: Jitendra Nath Pandey
    Assignee: Jitendra Nath Pandey
    Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch

Vectorized Between and IN expressions don't work with decimal, date types.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950169#comment-13950169 ] Jitendra Nath Pandey commented on HIVE-6662: +1. Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Fix For: 0.13.0 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6060) Define API for RecordUpdater and UpdateReader
[ https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948233#comment-13948233 ] Jitendra Nath Pandey commented on HIVE-6060: +1 for the latest patch. Define API for RecordUpdater and UpdateReader - Key: HIVE-6060 URL: https://issues.apache.org/jira/browse/HIVE-6060 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, acid-io.patch, h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch We need to define some new APIs for how Hive interacts with the file formats since it needs to be much richer than the current RecordReader and RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them
[ https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948248#comment-13948248 ]

Jitendra Nath Pandey commented on HIVE-6708:

+1 for patch 4.

ConstantVectorExpression should create copies of data objects rather than referencing them
---
    Key: HIVE-6708
    URL: https://issues.apache.org/jira/browse/HIVE-6708
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Attachments: HIVE-6708-1.patch, HIVE-6708-3.patch, HIVE-6708-4.patch, HIVE-6708.2.patch

1. ConstantVectorExpression vector should be updated for bytecolumnvectors and decimalColumnVectors. The current code changes the reference to the vector which might be shared across multiple columns
2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc exprDesc) has a minor bug as to when to constant fold the expression.

The following code should replace the corresponding piece of code in the trunk.

    ..
    GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
    if (gudf instanceof GenericUDFOPNegative || gudf instanceof GenericUDFOPPositive
        || castExpressionUdfs.contains(gudf.getClass())
    ...

--
This message was sent by Atlassian JIRA (v6.2#6252)
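Point 1 above is an aliasing hazard: if a column stores a reference to a shared object, a later change to that object is visible through every column holding it. The sketch below shows the general hazard and the copy-based alternative. It is not the Hive patch. TinyBytesColumn is a hypothetical stand-in for a bytes column vector, and its method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: TinyBytesColumn is a hypothetical stand-in, not a Hive class.
final class ConstantCopySketch {

    static final class TinyBytesColumn {
        final byte[][] rows = new byte[1][];

        // Stores the caller's array itself, so later changes to it show up here.
        void setByReference(byte[] value) {
            rows[0] = value;
        }

        // Stores a private copy, isolating the column from the caller's array.
        void setByCopy(byte[] value) {
            rows[0] = Arrays.copyOf(value, value.length);
        }
    }

    // Decodes the single row of a column for printing.
    static String text(TinyBytesColumn column) {
        return new String(column.rows[0], StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] constant = "ab".getBytes(StandardCharsets.US_ASCII);

        TinyBytesColumn shared = new TinyBytesColumn();
        shared.setByReference(constant);

        TinyBytesColumn copied = new TinyBytesColumn();
        copied.setByCopy(constant);

        // Something else reuses the buffer after the columns were filled.
        constant[0] = 'z';

        System.out.println(text(shared)); // sees the mutation
        System.out.println(text(copied)); // still holds the original bytes
    }
}
```

Copying costs an allocation per constant, but it removes the dependency on who else holds the original buffer.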
[jira] [Commented] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them
[ https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948262#comment-13948262 ] Jitendra Nath Pandey commented on HIVE-6708: [~rhbutani] This bug is causing some correctness issues, therefore this fix should also go to hive-0.13. ConstantVectorExpression should create copies of data objects rather than referencing them -- Key: HIVE-6708 URL: https://issues.apache.org/jira/browse/HIVE-6708 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-6708-1.patch, HIVE-6708-3.patch, HIVE-6708-4.patch, HIVE-6708.2.patch 1. ConstantVectorExpression vector should be updated for bytecolumnvectors and decimalColumnVectors. The current code changes the reference to the vector which might be shared across multiple columns 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc exprDesc) has a minor bug as to when to constant fold the expression. The following code should replace the corresponding piece of code in the trunk. .. GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF(); if (gudf instanceof GenericUDFOPNegative || gudf instanceof GenericUDFOPPositive || castExpressionUdfs.contains(gudf.getClass()) ... -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them
[ https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6708: --- Resolution: Fixed Status: Resolved (was: Patch Available) I have committed this to trunk and branch-0.13. Thanks to [~hari.s]! ConstantVectorExpression should create copies of data objects rather than referencing them -- Key: HIVE-6708 URL: https://issues.apache.org/jira/browse/HIVE-6708 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.13.0 Attachments: HIVE-6708-1.patch, HIVE-6708-3.patch, HIVE-6708-4.patch, HIVE-6708.2.patch 1. ConstantVectorExpression vector should be updated for bytecolumnvectors and decimalColumnVectors. The current code changes the reference to the vector which might be shared across multiple columns 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc exprDesc) has a minor bug as to when to constant fold the expression. The following code should replace the corresponding piece of code in the trunk. .. GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF(); if (gudf instanceof GenericUDFOPNegative || gudf instanceof GenericUDFOPPositive || castExpressionUdfs.contains(gudf.getClass()) ... -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6762) VectorGroupBy operator should produce vectorized row batch and let a terminal operator convert to row mode.
Jitendra Nath Pandey created HIVE-6762: -- Summary: VectorGroupBy operator should produce vectorized row batch and let a terminal operator convert to row mode. Key: HIVE-6762 URL: https://issues.apache.org/jira/browse/HIVE-6762 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey VectorGroupBy operator should produce vectorized row batch and let a terminal operator convert to row mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6349) Column name map is broken
[ https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946850#comment-13946850 ]

Jitendra Nath Pandey commented on HIVE-6349:

[~rhbutani] This is a major bug and should be fixed in hive-0.13 as well.

Column name map is broken
---
    Key: HIVE-6349
    URL: https://issues.apache.org/jira/browse/HIVE-6349
    Project: Hive
    Issue Type: Sub-task
    Reporter: Jitendra Nath Pandey
    Assignee: Jitendra Nath Pandey
    Attachments: HIVE-6349.1.patch

Following query results in exception at run time in vector mode.

{code}
explain select n_name
from supplier_orc s
join (
  select n_name, n_nationkey
  from nation_orc n
  join region_orc r
    on n.n_regionkey = r.r_regionkey and r.r_name = 'XYZ'
) n1
  on s.s_nationkey = n1.n_nationkey;
{code}

Here n_name is a string and all other fields are int.

The stack trace:

{code}
java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
    at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:116)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:280)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:133)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:246)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:253)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:574)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:585)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:234)
    ... 8 more
{code}

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6349) Column name map is broken
[ https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6349: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) I have committed this to trunk and branch-0.13 Column name map is broken -- Key: HIVE-6349 URL: https://issues.apache.org/jira/browse/HIVE-6349 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6349.1.patch Following query results in exception at run time in vector mode. {code} explain select n_name from supplier_orc s join ( select n_name, n_nationkey from nation_orc n join region_orc r on n.n_regionkey = r.r_regionkey and r.r_name = 'XYZ') n1 on s.s_nationkey = n1.n_nationkey; {code} Here n_name is a string and all other fields are int. The stack trace: {code} java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:116) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:280) at 
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:133) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:246) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:253) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:574) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:585) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:234) ... 8 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
Jitendra Nath Pandey created HIVE-6752: -- Summary: Vectorized Between and IN expressions don't work with decimal, date types. Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5817) column name to index mapping in VectorizationContext is broken
[ https://issues.apache.org/jira/browse/HIVE-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945418#comment-13945418 ] Jitendra Nath Pandey commented on HIVE-5817: bq. It seems like I cannot make a repro for select... even though there are name collisions, the mappings are correct. Perhaps when someone finds a bug we can solve it . HIVE-6349 patch includes a test that reproduces the problem for vectorized select. The patch also fixes the issue. column name to index mapping in VectorizationContext is broken -- Key: HIVE-5817 URL: https://issues.apache.org/jira/browse/HIVE-5817 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.13.0 Reporter: Sergey Shelukhin Assignee: Remus Rusanu Priority: Critical Fix For: 0.13.0 Attachments: HIVE-5817-uniquecols.broken.patch, HIVE-5817.00-broken.patch, HIVE-5817.4.patch, HIVE-5817.5.patch, HIVE-5817.6.patch Columns coming from different operators may have the same internal names (_colNN). There exists a query in the form {{select b.cb, a.ca from a JOIN b ON ... JOIN x ON ...;}} (distilled from a more complex query), which runs ok w/o vectorization. With vectorization, it will run ok for most ca, but for some ca it will fail (or can probably return incorrect results). That is because when building column-to-VRG-index map in VectorizationContext, internal column name for ca that the first map join operator adds to the mapping may be the same as internal name for cb that the 2nd one tries to add. 2nd VMJ doesn't add it (see code in ctor), and when it's time for it to output stuff, it retrieves wrong index from the map by name, and then wrong vector from VRG. -- This message was sent by Atlassian JIRA (v6.2#6252)
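To make the collision described above concrete, here is a minimal sketch. It is not the actual VectorizationContext code: the class ColumnMapSketch, the helper addIfAbsent, and the indexes 3 and 7 are hypothetical, and the per-operator map is only one possible remedy. A shared name-to-index map that never overwrites an existing name hands the second map join the first join's index.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of the name collision; not the actual VectorizationContext code. */
final class ColumnMapSketch {
    /** Registers a column unless the name is already mapped, mirroring the reported behavior. */
    static void addIfAbsent(Map<String, Integer> columnMap, String internalName, int batchIndex) {
        columnMap.putIfAbsent(internalName, batchIndex);
    }

    public static void main(String[] args) {
        // One map shared by both map joins (assumed setup).
        Map<String, Integer> shared = new HashMap<>();
        addIfAbsent(shared, "_col0", 3); // first map join registers a.ca
        addIfAbsent(shared, "_col0", 7); // second map join registers b.cb, silently skipped
        System.out.println("shared map resolves _col0 to " + shared.get("_col0"));

        // A fresh map scoped to the second operator keeps its own index.
        Map<String, Integer> scoped = new HashMap<>();
        addIfAbsent(scoped, "_col0", 7);
        System.out.println("scoped map resolves _col0 to " + scoped.get("_col0"));
    }
}
```

With the shared map the second operator reads the vector at the first operator's index, which matches the wrong-vector symptom in the description.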
[jira] [Created] (HIVE-6731) Non-staged mapjoin optimization doesn't work with vectorization.
Jitendra Nath Pandey created HIVE-6731: -- Summary: Non-staged mapjoin optimization doesn't work with vectorization. Key: HIVE-6731 URL: https://issues.apache.org/jira/browse/HIVE-6731 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey This issue relates to HIVE-6144. The non-staged map join optimization with hive.auto.convert.join.use.nonstaged=true, doesn't work with vectorization turned on. The query works but hashtables are still created in local mode in a separate stage. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6735) Make scalable dynamic partitioning work in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13946085#comment-13946085 ] Jitendra Nath Pandey commented on HIVE-6735: +1 LGTM Make scalable dynamic partitioning work in vectorized mode -- Key: HIVE-6735 URL: https://issues.apache.org/jira/browse/HIVE-6735 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0, 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6735.1.patch HIVE-6455 added support for scalable dynamic partitioning. This is subtask to make HIVE-6455 work with vectorized operators. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6222: --- Fix Version/s: (was: 0.14.0) 0.13.0 Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch, HIVE-6222.5.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944520#comment-13944520 ] Jitendra Nath Pandey commented on HIVE-6222: I have committed this to branch-0.13 as well. Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch, HIVE-6222.5.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6349) Column name map is broken
[ https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6349: --- Attachment: HIVE-6349.1.patch The attached patch makes VectorSelectOperator implement VectorizedRegionContext. Therefore, VectorSelectOperator also gives out a vectorization context with an updated column map. However, VectorSelectOperator doesn't create a new row batch, so it re-uses the same Output Column Manager from its parent's vectorization context. This is important because it then doesn't have to allocate scratch columns. Column name map is broken -- Key: HIVE-6349 URL: https://issues.apache.org/jira/browse/HIVE-6349 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6349.1.patch Following query results in exception at run time in vector mode. {code} explain select n_name from supplier_orc s join ( select n_name, n_nationkey from nation_orc n join region_orc r on n.n_regionkey = r.r_regionkey and r.r_name = 'XYZ') n1 on s.s_nationkey = n1.n_nationkey; {code} Here n_name is a string and all other fields are int.
The stack trace: {code} java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:116) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:280) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:133) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:246) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:253) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:574) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:585) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:234) ... 8 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6349) Column name map is broken
[ https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6349: --- Status: Patch Available (was: Open) Column name map is broken -- Key: HIVE-6349 URL: https://issues.apache.org/jira/browse/HIVE-6349 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6349.1.patch Following query results in exception at run time in vector mode. {code} explain select n_name from supplier_orc s join ( select n_name, n_nationkey from nation_orc n join region_orc r on n.n_regionkey = r.r_regionkey and r.r_name = 'XYZ') n1 on s.s_nationkey = n1.n_nationkey; {code} Here n_name is a string and all other fields are int. The stack trace: {code} java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:116) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:280) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:133) at 
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:246) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:253) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:574) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:585) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:234) ... 8 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944142#comment-13944142 ] Jitendra Nath Pandey commented on HIVE-6222: [~rhbutani] For too many distinct keys, map side grouping can become too slow in vectorized mode under memory pressure. This affects hive-0.13 as well, therefore we should port it to branch-0.13 as well. Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Fix For: 0.14.0 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch, HIVE-6222.5.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6060) Define API for RecordUpdater and UpdateReader
[ https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943341#comment-13943341 ] Jitendra Nath Pandey commented on HIVE-6060: OrcInputFormat#getRecordReader must check for vectorized mode before returning any reader. It seems this patch has moved the check down which introduces a scenario where non-vectorized record reader will be returned in vectorized mode, which would cause the query to fail. Define API for RecordUpdater and UpdateReader - Key: HIVE-6060 URL: https://issues.apache.org/jira/browse/HIVE-6060 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, acid-io.patch, h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch We need to define some new APIs for how Hive interacts with the file formats since it needs to be much richer than the current RecordReader and RecordWriter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them
[ https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943523#comment-13943523 ] Jitendra Nath Pandey commented on HIVE-6708: +1 ConstantVectorExpression should create copies of data objects rather than referencing them -- Key: HIVE-6708 URL: https://issues.apache.org/jira/browse/HIVE-6708 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-6708-1.patch, HIVE-6708.2.patch 1. ConstantVectorExpression vector should be updated for bytecolumnvectors and decimalColumnVectors. The current code changes the reference to the vector which might be shared across multiple columns 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc exprDesc) has a minor bug as to when to constant fold the expression. The following code should replace the corresponding piece of code in the trunk. .. GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF(); if (gudf instanceof GenericUDFOPNegative || gudf instanceof GenericUDFOPPositive || castExpressionUdfs.contains(gudf.getClass()) ... -- This message was sent by Atlassian JIRA (v6.2#6252)
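Point 1 above comes down to storing a reference versus storing a copy. The sketch below shows the difference with a plain byte array. BytesSlot, setRef and setCopy are hypothetical names for illustration and are not part of Hive's BytesColumnVector API.

```java
import java.util.Arrays;

/** Hypothetical stand-in for one slot of a bytes column; not Hive's BytesColumnVector. */
final class BytesSlot {
    private byte[] data;

    /** Stores the caller's array itself, so later changes to it are visible here. */
    void setRef(byte[] source) {
        data = source;
    }

    /** Stores a private copy, so later changes to the caller's array are not visible here. */
    void setCopy(byte[] source) {
        data = Arrays.copyOf(source, source.length);
    }

    byte first() {
        return data[0];
    }
}

final class BytesSlotDemo {
    public static void main(String[] args) {
        byte[] constant = {1, 2};
        BytesSlot byReference = new BytesSlot();
        BytesSlot byCopy = new BytesSlot();
        byReference.setRef(constant);
        byCopy.setCopy(constant);

        constant[0] = 9; // some other column reuses and overwrites the shared buffer

        System.out.println("by reference: " + byReference.first());
        System.out.println("by copy: " + byCopy.first());
    }
}
```

The slot that holds a copy is unaffected when the shared buffer changes, which is the property the issue title asks for.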
[jira] [Updated] (HIVE-6701) Analyze table compute statistics for decimal columns.
[ https://issues.apache.org/jira/browse/HIVE-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6701: --- Attachment: HIVE-6701.1.patch This patch only adds decimal support in the udf GenericUDAFComputeStats. The changes to metastore and thrift APIs are still needed. Analyze table compute statistics for decimal columns. - Key: HIVE-6701 URL: https://issues.apache.org/jira/browse/HIVE-6701 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6701.1.patch Analyze table should compute statistics for decimal columns as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6701) Analyze table compute statistics for decimal columns.
Jitendra Nath Pandey created HIVE-6701: -- Summary: Analyze table compute statistics for decimal columns. Key: HIVE-6701 URL: https://issues.apache.org/jira/browse/HIVE-6701 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Analyze table should compute statistics for decimal columns as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939608#comment-13939608 ] Jitendra Nath Pandey commented on HIVE-6639: I have committed this to trunk. [~rhbutani] This bug affects hive-0.13 and fails queries having partitioned columns but no filters. This should be fixed in branch-0.13 as well. Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) I have committed this to trunk and branch-0.13. Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Affects Version/s: 0.13.0 Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered
[ https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6518: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) I have committed this to trunk. Thanks to Gopal! [~rhbutani] This is an important fix to vector group by because the aggregates must flush more aggressively in case of GC. Therefore, I intend to commit it to branch-0.13 as well. Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered Key: HIVE-6518 URL: https://issues.apache.org/jira/browse/HIVE-6518 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, HIVE-6518.2.patch, HIVE-6518.3.patch The current VectorGroupByOperator implementation flushes the in-memory hashes when the maximum entries or fraction of memory is hit. This works for most cases, but there are some corner cases where we hit GC overhead limits or heap size limits before either of those conditions is reached due to the rest of the pipeline. This patch adds a SoftReference as a GC canary. If the soft reference is dead, then a full GC pass happened sometime in the near past and the aggregation hashtables should be flushed immediately before another full GC is triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
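For readers unfamiliar with the canary idea, a minimal sketch follows. It is not the code from the attached patches: GcCanary and its methods are hypothetical names, and simulateCollection exists only so the behavior can be shown without forcing a real full GC. The operator holds a SoftReference to a throwaway object and treats a cleared reference as the signal to flush.

```java
import java.lang.ref.SoftReference;

/** Hypothetical GC canary; an illustration only, not Hive's VectorGroupByOperator code. */
final class GcCanary {
    private SoftReference<Object> canary = new SoftReference<>(new Object());

    /** True once the softly reachable referent has been cleared. */
    boolean gcDetected() {
        return canary.get() == null;
    }

    /** Re-arms the canary after the caller has flushed its hash tables. */
    void reset() {
        canary = new SoftReference<>(new Object());
    }

    /** Demo hook: clears the reference the way a collector under memory pressure would. */
    void simulateCollection() {
        canary.clear();
    }
}

final class GcCanaryDemo {
    public static void main(String[] args) {
        GcCanary canary = new GcCanary();
        int flushes = 0;
        for (int batch = 0; batch < 3; batch++) {
            if (batch == 1) {
                canary.simulateCollection(); // stand-in for a real full GC
            }
            if (canary.gcDetected()) {
                flushes++; // a real operator would flush its aggregation hash tables here
                canary.reset();
            }
        }
        System.out.println("flushes = " + flushes);
    }
}
```

The check is a single null test per batch, so it costs very little next to the entry-count and memory-fraction checks it supplements.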
[jira] [Commented] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered
[ https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938186#comment-13938186 ] Jitendra Nath Pandey commented on HIVE-6518: Committed to branch-0.13 as well. Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered Key: HIVE-6518 URL: https://issues.apache.org/jira/browse/HIVE-6518 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, HIVE-6518.2.patch, HIVE-6518.3.patch The current VectorGroupByOperator implementation flushes the in-memory hashes when the maximum entries or fraction of memory is hit. This works for most cases, but there are some corner cases where we hit GC overhead limits or heap size limits before either of those conditions is reached due to the rest of the pipeline. This patch adds a SoftReference as a GC canary. If the soft reference is dead, then a full GC pass happened sometime in the near past and the aggregation hashtables should be flushed immediately before another full GC is triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered
[ https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6518: --- Fix Version/s: (was: 0.14.0) 0.13.0 Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered Key: HIVE-6518 URL: https://issues.apache.org/jira/browse/HIVE-6518 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, HIVE-6518.2.patch, HIVE-6518.3.patch The current VectorGroupByOperator implementation flushes the in-memory hashes when the maximum entries or fraction of memory is hit. This works for most cases, but there are some corner cases where we hit GC overhead limits or heap size limits before either of those conditions is reached due to the rest of the pipeline. This patch adds a SoftReference as a GC canary. If the soft reference is dead, then a full GC pass happened sometime in the near past and the aggregation hashtables should be flushed immediately before another full GC is triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938194#comment-13938194 ] Jitendra Nath Pandey commented on HIVE-6680: Ran tests locally. Only failures were show_create_table_serde.q and metadata_only_queries_with_filters.q which are unrelated to this patch. Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
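As an illustration of what adjusting the unscaled value means, the sketch below uses java.math.BigDecimal as a stand-in, since it exposes the same unscaled-value-plus-scale representation. It does not use Decimal128, and the relabelOnly helper is only an assumed picture of what goes wrong when the scale changes but the digits do not.

```java
import java.math.BigDecimal;

/** Hypothetical illustration with BigDecimal; Decimal128 itself is not used here. */
final class ScaleSketch {
    /** Changes the scale and rescales the unscaled value, so the numeric value is preserved. */
    static BigDecimal rescale(BigDecimal value, int newScale) {
        // Throws ArithmeticException if the new scale would require rounding.
        return value.setScale(newScale);
    }

    /** Keeps the unscaled digits and only relabels the scale, which changes the numeric value. */
    static BigDecimal relabelOnly(BigDecimal value, int newScale) {
        return new BigDecimal(value.unscaledValue(), newScale);
    }

    public static void main(String[] args) {
        BigDecimal original = new BigDecimal("1.5"); // unscaled value 15, scale 1
        System.out.println("rescaled: " + rescale(original, 3));
        System.out.println("relabeled only: " + relabelOnly(original, 3));
    }
}
```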
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938196#comment-13938196 ] Jitendra Nath Pandey commented on HIVE-6664: Ran tests locally. Only failures were show_create_table_serde.q and metadata_only_queries_with_filters.q which are unrelated to this patch. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
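The effect described above can be reproduced outside Hive. The following sketch is a simplification with hypothetical names (SumOrderSketch, sumAsDouble, sumAsDecimal) and uses java.math.BigDecimal instead of Hive's decimal types. It sums the same values once by converting each to double upfront, as row mode does, and once by summing as decimal and converting at the end, as vector mode does.

```java
import java.math.BigDecimal;
import java.util.Arrays;

/** Hypothetical illustration of the two summation orders; not Hive's aggregation code. */
final class SumOrderSketch {
    /** Row-mode style: convert each decimal to double first, then accumulate. */
    static double sumAsDouble(BigDecimal[] values) {
        double sum = 0.0;
        for (BigDecimal v : values) {
            sum += v.doubleValue();
        }
        return sum;
    }

    /** Vector-mode style: accumulate exactly as decimal, convert to double once at flush. */
    static double sumAsDecimal(BigDecimal[] values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);
        }
        return sum.doubleValue();
    }

    public static void main(String[] args) {
        BigDecimal[] prices = new BigDecimal[10];
        Arrays.fill(prices, new BigDecimal("0.1"));
        System.out.println("double upfront: " + sumAsDouble(prices));
        System.out.println("decimal then double: " + sumAsDecimal(prices));
    }
}
```

The two sums differ in the last bits, and variance formulas that subtract nearly equal quantities can amplify such differences.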
[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938195#comment-13938195 ] Jitendra Nath Pandey commented on HIVE-6639: Ran tests locally. Only failures were show_create_table_serde.q and metadata_only_queries_with_filters.q which are unrelated to this patch. Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938193#comment-13938193 ] Jitendra Nath Pandey commented on HIVE-6649: Ran tests locally. Only failures were show_create_table_serde.q and metadata_only_queries_with_filters.q which are unrelated to this patch. Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
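Editorial illustration of the innermost frames of the trace above: the NullPointerException is raised while a String constructor checks its bounds, which is what happens when a String is built from a null byte array. The sketch below reproduces only that Java behaviour. The class NullBytesSketch and the guard decodeOrNull are hypothetical; they are not the HIVE-6649 patch, and the exact constructor overload used by VectorUDFDateDiffColCol.setDays is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not Hive code and not the committed fix.
final class NullBytesSketch {

    // Returns true if building a String from a null byte[] raises
    // NullPointerException, as in the String frames of the trace.
    static boolean constructorRejectsNull() {
        byte[] bytes = null;
        try {
            new String(bytes, 0, 0);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // One possible guard (an assumption): treat a missing buffer as a
    // null value instead of decoding it.
    static String decodeOrNull(byte[] bytes, int start, int length) {
        if (bytes == null) {
            return null;
        }
        return new String(bytes, start, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("null buffer throws NPE: " + constructorRejectsNull());
        byte[] date = "2014-03-17".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeOrNull(date, 0, date.length));
        System.out.println(decodeOrNull(null, 0, 0));
    }
}
```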
[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938218#comment-13938218 ] Jitendra Nath Pandey commented on HIVE-6649: I have committed this to trunk. [~rhbutani] This bug affects hive-0.13 and fails several vectorized queries on DATE. This should be fixed in branch-0.13 as well. Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, HIVE-6649.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938407#comment-13938407 ] Jitendra Nath Pandey commented on HIVE-6664: I have committed this to trunk. [~rhbutani] This bug affects hive-0.13 and causes different results than row-mode execution. This should be fixed in branch-0.13 as well. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938411#comment-13938411 ] Jitendra Nath Pandey commented on HIVE-6680: Committed to trunk. It is a correctness bug affecting hive-0.13, therefore I will port it to hive-13 branch as well. Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, HIVE-6649.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6680: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
Jitendra Nath Pandey created HIVE-6680: -- Summary: Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
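Editorial illustration of the HIVE-6680 summary above: a decimal is an unscaled integer plus a scale, so changing the scale without also adjusting the unscaled digits changes the number. The sketch uses java.math.BigDecimal rather than Hive's Decimal128, and the names ScaleUpdateSketch, relabelScaleOnly and rescale, as well as the HALF_UP rounding mode, are assumptions made for the example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch; BigDecimal stands in for Hive's Decimal128.
final class ScaleUpdateSketch {

    // Problem case: keep the unscaled digits and only change the scale.
    // 12345 at scale 2 is 123.45, but 12345 at scale 4 is 1.2345.
    static BigDecimal relabelScaleOnly(BigDecimal src, int newScale) {
        return new BigDecimal(src.unscaledValue(), newScale);
    }

    // Value-preserving case: adjust the unscaled digits to match the new
    // scale. 123.45 at scale 4 has unscaled value 1234500.
    static BigDecimal rescale(BigDecimal src, int newScale) {
        return src.setScale(newScale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal price = new BigDecimal("123.45");
        System.out.println("relabel only : " + relabelScaleOnly(price, 4));
        System.out.println("rescaled     : " + rescale(price, 4));
    }
}
```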
[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6680: --- Attachment: HIVE-6680.1.patch Attached patch fixes the issue. Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Status: Open (was: Patch Available) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Attachment: HIVE-6649.2.patch Uploading same patch to trigger pre-commit build. Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Status: Patch Available (was: Open) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Attachment: HIVE-6664.1.patch Uploading same patch to trigger pre-commit build. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Status: Open (was: Patch Available) Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Status: Patch Available (was: Open) Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Attachment: HIVE-6639.5.patch Uploading same patch to trigger pre-commit build. Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Status: Open (was: Patch Available) Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Attachment: HIVE-6639.6.patch Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Status: Patch Available (was: Open) Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Status: Open (was: Patch Available) Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Status: Patch Available (was: Open) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
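The innermost frames of the trace above show the NullPointerException being raised inside the java.lang.String constructor, which is what happens when a String is built from a null byte array. The sketch below reproduces only that JDK behaviour and shows one possible guard. It is not the code from the attached patches; the class and method names are hypothetical and chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

final class NullBytesToStringSketch {

    // Hypothetical helper: decode a byte slice, treating a missing buffer as a null value
    // instead of letting the String constructor throw.
    static String decodeOrNull(byte[] bytes, int start, int length) {
        if (bytes == null) {
            return null;
        }
        return new String(bytes, start, length, StandardCharsets.UTF_8);
    }

    // Returns true if constructing a String from a null byte array throws a NullPointerException.
    static boolean rawConstructorThrowsNpe() {
        try {
            new String((byte[]) null, 0, 0);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rawConstructorThrowsNpe());
        System.out.println(decodeOrNull(null, 0, 0));
        byte[] date = "2014-03-14".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeOrNull(date, 0, date.length));
    }
}
```

Whether a null buffer should be treated as a NULL value or as a bug further upstream depends on the expression being evaluated; the guard here only illustrates the first option.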
[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6680: --- Status: Patch Available (was: Open) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
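To illustrate what "adjust the unscaled value" means when a decimal is copied at a different scale, here is a minimal sketch using java.math.BigDecimal as a stand-in. It does not use Hive's Decimal128 class, and the class and method names are hypothetical. It contrasts a correct rescale with merely relabeling the scale, which changes the numeric value.

```java
import java.math.BigDecimal;

final class RescaleSketch {

    // Correct rescale: the unscaled value is multiplied so the numeric value is preserved.
    // Only widening (newScale >= current scale) is shown, so no rounding is needed.
    static BigDecimal rescale(BigDecimal value, int newScale) {
        return value.setScale(newScale);
    }

    // Incorrect "relabel": keeps the old unscaled digits and only swaps the scale.
    static BigDecimal relabel(BigDecimal value, int newScale) {
        return new BigDecimal(value.unscaledValue(), newScale);
    }

    public static void main(String[] args) {
        BigDecimal source = new BigDecimal("1.5");      // unscaled value 15, scale 1
        BigDecimal rescaled = rescale(source, 3);       // unscaled value 1500, scale 3
        BigDecimal relabeled = relabel(source, 3);      // unscaled value 15, scale 3

        System.out.println(rescaled.unscaledValue() + " @ scale " + rescaled.scale() + " = " + rescaled);
        System.out.println(relabeled.unscaledValue() + " @ scale " + relabeled.scale() + " = " + relabeled);
    }
}
```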
[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6680: --- Attachment: HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Description: The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} was:The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. 
TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Status: Patch Available (was: Open) Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Status: Open (was: Patch Available) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Attachment: HIVE-6664.1.patch Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6664) Vectorized variance computation differs from row mode computation.
Jitendra Nath Pandey created HIVE-6664: -- Summary: Vectorized variance computation differs from row mode computation. Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Following query can show the difference: select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price), var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
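The mechanism described above (converting each decimal to double before summing, versus summing as decimal and converting once at the end) can be illustrated with a small, self-contained sketch. It uses java.math.BigDecimal rather than Hive's decimal types, the class and method names are hypothetical, and it is only meant to show why the two orders of operations can disagree in the last bits.

```java
import java.math.BigDecimal;

final class SumOrderSketch {

    // "Row mode"-style: convert each decimal to double first, then accumulate in double.
    static double sumAsDouble(BigDecimal[] values) {
        double sum = 0.0;
        for (BigDecimal v : values) {
            sum += v.doubleValue();
        }
        return sum;
    }

    // "Vector mode"-style: accumulate exactly as decimal, convert to double once at the end.
    static double sumAsDecimalThenConvert(BigDecimal[] values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            sum = sum.add(v);
        }
        return sum.doubleValue();
    }

    public static void main(String[] args) {
        // Ten copies of 0.1: exact in decimal, inexact in binary floating point.
        BigDecimal[] prices = new BigDecimal[10];
        java.util.Arrays.fill(prices, new BigDecimal("0.1"));

        double upfront = sumAsDouble(prices);
        double atFlush = sumAsDecimalThenConvert(prices);

        System.out.println("double upfront:   " + upfront);
        System.out.println("decimal at flush: " + atFlush);
        System.out.println("equal: " + (upfront == atFlush));
    }
}
```

Any statistic derived from such a sum inherits the discrepancy, so two otherwise equivalent execution paths can report slightly different results unless both use the same conversion order.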
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Attachment: HIVE-6664.1.patch Attached patch fixes the issue. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch Following query can show the difference: select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price), var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Description: Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. was: Following query can show the difference: select count(ss_sales_price), sum(ss_sales_price), avg(ss_sales_price), var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Status: Patch Available (was: Open) Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Status: Open (was: Patch Available) Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Description: The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. (was: The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. ) Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934793#comment-13934793 ] Jitendra Nath Pandey commented on HIVE-6664: Review board: https://reviews.apache.org/r/19216/ Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Status: Open (was: Patch Available) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Attachment: HIVE-6649.2.patch Review board: https://reviews.apache.org/r/19218/ Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Status: Patch Available (was: Open) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Status: Patch Available (was: Open) Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Attachment: HIVE-6639.5.patch Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch The vectorized plan generation finds the list of partitioning columns from the pruned-partition-list using the table scan operator. In some cases the list comes back as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk.
[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934805#comment-13934805 ] Jitendra Nath Pandey commented on HIVE-6639: Review board: https://reviews.apache.org/r/19219/ Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch The vectorized plan generation finds the list of partitioning columns from the pruned-partition-list using the table scan operator. In some cases the list comes back as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk.
[jira] [Commented] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935401#comment-13935401 ] Jitendra Nath Pandey commented on HIVE-6662: Please use DateWritable#getDays; the date representation is the number of days since the epoch. Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-6662.1.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code}
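Editorial sketch of the "days since epoch" representation mentioned in the comment above. It does not use Hive's DateWritable; java.time.LocalDate is a stand-in chosen so the example runs without Hive on the classpath, and the class and method names (EpochDaysSketch, toEpochDays, fromEpochDays) are hypothetical. It only illustrates why a DATE value can travel through a long-typed vector column as a plain day count.

```java
import java.time.LocalDate;

// Hypothetical illustration only: not Hive code and not the HIVE-6662 patch.
final class EpochDaysSketch {

    /** Encodes an ISO-8601 date (yyyy-MM-dd) as whole days since 1970-01-01. */
    static long toEpochDays(String isoDate) {
        return LocalDate.parse(isoDate).toEpochDay();
    }

    /** Decodes a day count back into an ISO-8601 date string. */
    static String fromEpochDays(long days) {
        return LocalDate.ofEpochDay(days).toString();
    }

    public static void main(String[] args) {
        // Round-trip one date through its long encoding.
        long days = toEpochDays("2014-03-14");
        System.out.println(days + " -> " + fromEpochDays(days));
    }
}
```

Because the encoding is a single integer per row, date arithmetic such as adding or subtracting days reduces to integer addition on the encoded value.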
[jira] [Created] (HIVE-6649) Vectorization: some date expressions throw exception.
Jitendra Nath Pandey created HIVE-6649: -- Summary: Vectorization: some date expressions throw exception. Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat}
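Editorial sketch of the failure mode the innermost frames point to: a NullPointerException raised inside a String constructor, which is what happens when the byte array handed to it is null. The thread does not say how the patch addresses this, so the guard below is only one plausible reading; NullBytesSketch, decodeOrNull and nullBytesThrowNpe are hypothetical names and none of this is Hive code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not Hive code and not the HIVE-6649 patch.
final class NullBytesSketch {

    /** Decodes a slice of bytes as UTF-8, or returns null when the entry is absent. */
    static String decodeOrNull(byte[] bytes, int start, int length) {
        if (bytes == null) {
            // Assumption: a missing entry should be treated as a NULL value
            // rather than being passed on to the String constructor.
            return null;
        }
        return new String(bytes, start, length, StandardCharsets.UTF_8);
    }

    /** Returns true if building a String from a null byte array throws NullPointerException. */
    static boolean nullBytesThrowNpe() {
        try {
            new String((byte[]) null, 0, 0, StandardCharsets.UTF_8);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded constructor throws NPE: " + nullBytesThrowNpe());
        System.out.println("guarded decode of null entry: " + decodeOrNull(null, 0, 0));
    }
}
```

The guard makes the null case explicit at the point of conversion, so a batch that contains unset entries yields NULL results for those rows instead of aborting the whole map task.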
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Attachment: HIVE-6649.1.patch Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat}
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Status: Patch Available (was: Open) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat}
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Status: Open (was: Patch Available) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat}
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Status: Patch Available (was: Open) Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch Vectorization: Partition column names are not picked up causing an NPE.
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Status: Open (was: Patch Available) Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch Vectorization: Partition column names are not picked up causing an NPE.
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Status: Patch Available (was: Open) Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch Vectorization: Partition column names are not picked up causing an NPE.
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Attachment: HIVE-6639.3.patch Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch Vectorization: Partition column names are not picked up causing an NPE.