Fucun Chu has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17372 )

Change subject: IMPALA-10687: Implement ds_cpc_union() function
......................................................................

IMPALA-10687: Implement ds_cpc_union() function

This function receives a set of serialized Apache DataSketches CPC
sketches produced by ds_cpc_sketch() and merges them into a single
sketch.

An example usage is to create a sketch for each partition of a table
and write these sketches to a separate table. Depending on which
partitions the user is interested in, the relevant sketches can then
be unioned together to get an estimate. E.g.:
  SELECT
      ds_cpc_estimate(ds_cpc_union(sketch_col))
  FROM sketch_tbl
  WHERE partition_col=1 OR partition_col=5;
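
For illustration, sketch_tbl could be populated along the following
lines. This is only a sketch: the source table src_tbl and its column
col are hypothetical names, and storing the serialized sketch in a
STRING column is an assumption about the table layout.
  -- src_tbl and col are hypothetical; one sketch is built per partition.
  CREATE TABLE sketch_tbl (partition_col INT, sketch_col STRING)
  STORED AS PARQUET;

  INSERT INTO sketch_tbl
  SELECT partition_col, ds_cpc_sketch(col)
  FROM src_tbl
  GROUP BY partition_col;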

Testing:
  - Apart from the automated tests I added to this patch I also
    tested ds_cpc_union() on a bigger dataset to check that
    serialization, deserialization and merging steps work well. I
    took TPCH25.lineitem, created a number of sketches with grouping
    by l_shipdate and called ds_cpc_union() on those sketches.
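
    A manual check of this kind could look roughly like the queries
    below. The sketched column (l_orderkey) and the scratch table name
    (shipdate_sketches) are assumptions for illustration; they are not
    stated in this change.
      -- Assumed column and table names; one sketch per l_shipdate.
      CREATE TABLE shipdate_sketches STORED AS PARQUET AS
      SELECT l_shipdate, ds_cpc_sketch(l_orderkey) AS sketch_col
      FROM TPCH25.lineitem
      GROUP BY l_shipdate;

      -- Merge all sketches, then compare with the exact distinct count.
      SELECT ds_cpc_estimate(ds_cpc_union(sketch_col))
      FROM shipdate_sketches;
      SELECT count(DISTINCT l_orderkey) FROM TPCH25.lineitem;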

Change-Id: Ib94b45ae79efcc11adc077dd9df9b9868ae82cb6
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M testdata/data/README
A testdata/data/cpc_sketches_from_impala.parquet
M testdata/workloads/functional-query/queries/QueryTest/datasketches-cpc.test
M tests/query_test/test_datasketches.py
7 files changed, 169 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/17372/3
--
To view, visit http://gerrit.cloudera.org:8080/17372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib94b45ae79efcc11adc077dd9df9b9868ae82cb6
Gerrit-Change-Number: 17372
Gerrit-PatchSet: 3
Gerrit-Owner: Fucun Chu <chufu...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
