[jira] [Work logged] (HIVE-25874) Slow filter evaluation of nest struct fields in vectorized executions

2022-02-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25874?focusedWorklogId=730991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-730991
 ]

ASF GitHub Bot logged work on HIVE-25874:
-

Author: ASF GitHub Bot
Created on: 22/Feb/22 14:49
Start Date: 22/Feb/22 14:49
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #2952:
URL: https://github.com/apache/hive/pull/2952


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 730991)
Time Spent: 20m  (was: 10m)

> Slow filter evaluation of nest struct fields in vectorized executions
> -
>
> Key: HIVE-25874
> URL: https://issues.apache.org/jira/browse/HIVE-25874
> Project: Hive
> Issue Type: Improvement
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Time is spent resizing vectors, either around
> [this line in ColumnVector.java|https://github.com/apache/hive/blob/200c0bf1feb259f4d95bf065a2ab38fe684383da/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java#L252]
> or in some other "ensureSize" method. To reproduce:
> {code:java}
> create table t as
> select
> named_struct('id',13,'str','string','nest',named_struct('id',12,'str','string','arr',array('value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value')))
> s;
> -- go up to 1M rows; each insert below multiplies the row count by 10
> insert into table t select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t;
> insert into table t select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t;
> insert into table t select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t;
> insert into table t select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t;
> insert into table t select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t union all select * from t union all select * from t union all 
> select * from t;
> -- insert into table t select * from t union all select * from t union all 
> -- select * from t union all select * from t union all select * from t union all 
> -- select * from t union all select * from t union all select * from t union all 
> -- select * from t;
> set hive.fetch.task.conversion=none;
> select count(1) from t;
> --explain
> select s.id from t
> where s.nest.id > 0;
> {code}
> Interestingly, the issue is not present:
> * for a query that does not look into the nested struct
> * when the struct containing the array is at the top level
> {code}
> select count(1) from t;
> --explain
> select s.id from t
> where s
> -- .nest
> .id > 0;
> {code}
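
The resizing cost described in the issue can be illustrated with a small, self-contained Java sketch. It is not Hive's code and does not reproduce the change made in the pull request: `ToyVector`, `simulate`, and the `geometric` flag are hypothetical names invented here, only loosely modeled on the idea of an "ensureSize" method. The sketch assumes a child vector that receives a fixed number of elements per row, and compares growing the backing array to exactly the requested size against doubling it.

```java
import java.util.Arrays;

public class ResizeCostSketch {

    /** Toy growable vector (hypothetical); NOT Hive's ColumnVector. */
    static final class ToyVector {
        long[] values = new long[0];
        int allocations = 0;      // how many times a new backing array was allocated
        long elementsCopied = 0;  // total elements copied across all resizes

        /** Grow the backing array so it can hold at least `size` elements. */
        void ensureSize(int size, boolean geometric) {
            if (size <= values.length) {
                return; // already large enough
            }
            // Exact-fit grows to the requested size; geometric at least doubles.
            int newLength = geometric ? Math.max(size, values.length * 2) : size;
            elementsCopied += values.length;
            values = Arrays.copyOf(values, newLength);
            allocations++;
        }
    }

    /** Simulate `rows` rows, each appending `perRow` elements to one vector. */
    static ToyVector simulate(int rows, int perRow, boolean geometric) {
        ToyVector v = new ToyVector();
        int used = 0;
        for (int r = 0; r < rows; r++) {
            v.ensureSize(used + perRow, geometric);
            used += perRow;
        }
        return v;
    }

    public static void main(String[] args) {
        ToyVector exact = simulate(1024, 32, false);
        ToyVector doubling = simulate(1024, 32, true);
        System.out.println("exact-fit: allocations=" + exact.allocations
            + " copied=" + exact.elementsCopied);
        System.out.println("doubling:  allocations=" + doubling.allocations
            + " copied=" + doubling.elementsCopied);
    }
}
```

With 1024 rows of 32 elements each, the exact-fit policy reallocates on every row (1024 allocations), while doubling needs only 11. Whether the slowdown reported here comes from this pattern or from another source of repeated resizing is not stated in the issue, so treat the sketch as an illustration of the general cost only.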



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25874) Slow filter evaluation of nest struct fields in vectorized executions

2022-01-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25874?focusedWorklogId=710456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-710456
 ]

ASF GitHub Bot logged work on HIVE-25874:
-

Author: ASF GitHub Bot
Created on: 18/Jan/22 12:31
Start Date: 18/Jan/22 12:31
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #2952:
URL: https://github.com/apache/hive/pull/2952


   
   


Issue Time Tracking
---

Worklog Id: (was: 710456)
Remaining Estimate: 0h
Time Spent: 10m

> Slow filter evaluation of nest struct fields in vectorized executions
> -
>
> Key: HIVE-25874
> URL: https://issues.apache.org/jira/browse/HIVE-25874
> Project: Hive
> Issue Type: Improvement
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h


