[ https://issues.apache.org/jira/browse/HIVE-26572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alessandro Solimando updated HIVE-26572:
----------------------------------------
    Labels: pull-request-available  (was: )

> Support constant expressions in vectorization
> ---------------------------------------------
>
>                 Key: HIVE-26572
>                 URL: https://issues.apache.org/jira/browse/HIVE-26572
>             Project: Hive
>          Issue Type: Improvement
>          Components: Vectorization
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Alessandro Solimando
>            Assignee: Alessandro Solimando
>            Priority: Major
>              Labels: pull-request-available
>
> At the moment, we cannot vectorize aggregate expressions that have constant
> parameters in addition to the aggregation column (it's forbidden
> [here|https://github.com/apache/hive/blob/c19d56ec7429bfcfad92b62ac335dbf8177dab24/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java#L4531]).
> One compelling example of how this could help is [PR
> 1824|https://github.com/apache/hive/pull/1824], linked to HIVE-24510, where
> _compute_bit_vector_ had to be split into _compute_bit_vector_hll_ and
> _compute_bit_vector_fm_ when the HLL implementation was added, whereas
> _compute_bit_vector($col, ['HLL'|'FM'])_ could have been used.
> Another example is _VectorUDAFBloomFilterMerge_, which receives an extra constant
> parameter controlling the number of threads for merging tasks. At the moment
> this parameter is "injected" when trying to find an appropriate constructor
> (see
> [VectorGroupByOperator.java#L1224-L1244|https://github.com/apache/hive/blob/c19d56ec7429bfcfad92b62ac335dbf8177dab24/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java#L1224-L1244]).
> This ad-hoc approach is not scalable and would make the code hard to read and
> maintain if more UDAFs require constant parameters.
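
To illustrate the contrast with per-class "injection", here is a minimal sketch of a generic constructor lookup that treats constant parameters as ordinary constructor arguments following the input column. Everything in it is hypothetical: `ConstantArgsSketch`, `SumAggregate`, `MergeAggregate` and `instantiate` are illustration-only names, not Hive classes or Hive's API, and this is not necessarily how the linked pull request addresses the issue.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch; none of these types exist in Hive.
final class ConstantArgsSketch {

    // Stand-in for a vectorized aggregate that only needs the input column.
    static final class SumAggregate {
        final int inputColumn;

        public SumAggregate(int inputColumn) {
            this.inputColumn = inputColumn;
        }
    }

    // Stand-in for an aggregate that also takes a constant (e.g. a thread count).
    static final class MergeAggregate {
        final int inputColumn;
        final int numThreads;

        public MergeAggregate(int inputColumn, int numThreads) {
            this.inputColumn = inputColumn;
            this.numThreads = numThreads;
        }
    }

    // Map the primitive types used in this sketch to their wrapper classes.
    private static Class<?> box(Class<?> type) {
        if (type == int.class) return Integer.class;
        if (type == long.class) return Long.class;
        if (type == double.class) return Double.class;
        if (type == boolean.class) return Boolean.class;
        return type;
    }

    // True if every argument is non-null and an instance of the matching parameter type.
    private static boolean accepts(Class<?>[] paramTypes, Object[] args) {
        if (paramTypes.length != args.length) return false;
        for (int i = 0; i < args.length; i++) {
            if (args[i] == null || !box(paramTypes[i]).isInstance(args[i])) return false;
        }
        return true;
    }

    // Find a public constructor taking (inputColumn, constants...) and invoke it.
    // No aggregate class is special-cased: the constants come from the call site.
    static Object instantiate(Class<?> aggClass, int inputColumn, Object... constants) {
        Object[] args = new Object[constants.length + 1];
        args[0] = inputColumn;
        System.arraycopy(constants, 0, args, 1, constants.length);
        for (Constructor<?> ctor : aggClass.getConstructors()) {
            if (accepts(ctor.getParameterTypes(), args)) {
                try {
                    return ctor.newInstance(args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Cannot instantiate " + aggClass.getName(), e);
                }
            }
        }
        throw new IllegalArgumentException(
            "No constructor of " + aggClass.getName() + " accepts " + args.length + " argument(s)");
    }

    public static void main(String[] unused) {
        MergeAggregate merge = (MergeAggregate) instantiate(MergeAggregate.class, 0, 4);
        System.out.println("numThreads = " + merge.numThreads);
    }
}
```

With a lookup of this shape, an aggregate that needs an extra constant only has to expose a matching constructor; a call whose constants match no constructor fails with an `IllegalArgumentException`, which a caller could treat as "not vectorizable".
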
> In addition, we are probably missing vectorization opportunities where no such
> ad-hoc treatment has been added but an appropriate UDAF constructor is available or
> could easily be added (data sketches UDAFs, although not yet vectorized, are a
> good target).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)