[
https://issues.apache.org/jira/browse/HIVE-26243?focusedWorklogId=776318&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776318
]
ASF GitHub Bot logged work on HIVE-26243:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 31/May/22 13:31
Start Date: 31/May/22 13:31
Worklog Time Spent: 10m
Work Description: kgyrtkirk commented on PR #3317:
URL: https://github.com/apache/hive/pull/3317#issuecomment-1142141281
PR-1824 has nothing to do with datasketches; I don't know how you followed
its conventions, but you might end up in trouble...because DS also has an HLL
implementation...wouldn't that conflict with the existing one?
note: PR-1824 named the file `VectorUDAFComputeBitVector.txt` and internally
named the method `compute_bit_vector_hll` ; I think the class name should have
contained the `Hll` keyword
I think that the file name `VectorUDAFComputeKLL.txt` is not connected at
all to the `ds_kll_sketch` function it's about to vectorize...and as such it's a
bit confusing....
The current implementation doesn't really look forward: I think we have 20
*sketch* functions from datasketches already exposed inside Hive which could
be vectorized; I think they are behind the same API...so just vectorizing
the KLL one without looking ahead and taking "ideas" from the old hll
codepath doesn't seem the best idea to me...
```
grep ^ds_ ql/src/test/results/clientpositive/llap/show_functions.q.out | grep _sketch$
```
no need to do everything in 1 patch - but this is pretty much just
copy-pasting the existing hll txtfile with kll substituted here and there...so
should we do that 20 times?
> For instance, you seem to be suggesting to remove all helper classes/methods etc
I don't think those changes in the *metastore* are necessary for
vectorizing this function?
HIVE-26221 makes changes but has no real end-user accessible value, and
as such I don't think it's ready.
Issue Time Tracking
-------------------
Worklog Id: (was: 776318)
Time Spent: 0.5h (was: 20m)
> Add vectorized implementation of the 'ds_kll_sketch' UDAF
> ---------------------------------------------------------
>
> Key: HIVE-26243
> URL: https://issues.apache.org/jira/browse/HIVE-26243
> Project: Hive
> Issue Type: Improvement
> Components: UDF, Vectorization
> Affects Versions: 4.0.0-alpha-2
> Reporter: Alessandro Solimando
> Assignee: Alessandro Solimando
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> The _ds_kll_sketch_ UDAF does not have a vectorized implementation at the
> moment; the present ticket aims at bridging this gap.
> This is particularly important because vectorization has an "all or nothing"
> approach, so if this function is used alongside vectorized functions,
> they won't be able to benefit from vectorized execution.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)