[ https://issues.apache.org/jira/browse/HIVE-25874?focusedWorklogId=730991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-730991 ]
ASF GitHub Bot logged work on HIVE-25874: ----------------------------------------- Author: ASF GitHub Bot Created on: 22/Feb/22 14:49 Start Date: 22/Feb/22 14:49 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #2952: URL: https://github.com/apache/hive/pull/2952 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking ------------------- Worklog Id: (was: 730991) Time Spent: 20m (was: 10m) > Slow filter evaluation of nest struct fields in vectorized executions > --------------------------------------------------------------------- > > Key: HIVE-25874 > URL: https://issues.apache.org/jira/browse/HIVE-25874 > Project: Hive > Issue Type: Improvement > Reporter: Zoltan Haindrich > Assignee: Zoltan Haindrich > Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > time is spent at resizing vectors around > [here|https://github.com/apache/hive/blob/200c0bf1feb259f4d95bf065a2ab38fe684383da/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java#L252] > or in some other "ensureSize" method > {code:java} > create table t as > select > named_struct('id',13,'str','string','nest',named_struct('id',12,'str','string','arr',array('value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value'))) > s; > -- go up to 1M rows > insert into table t select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t; > insert into table t select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t; > insert into table t select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t; > insert into table t select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t; > insert into table t select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t; > -- insert into table t select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t union all select * from t union all select * from t union all > select * from t; > set hive.fetch.task.conversion=none; > select count(1) from t; > --explain > select s > .id from t > where > s > .nest > .id > 0; > {code} > interestingly; the issue is not present: > * for a query not looking into the nested struct > * and in case the struct with the array is at the top level > {code} > select count(1) from t; > --explain > select s > .id from t > where > s > -- .nest > .id > 0; > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)