Alexander created KYLIN-4805:
--------------------------------
Summary: Hive Global Dictionary - COUNT_DISTINCT(HLL) with multiple parameters
Key: KYLIN-4805
URL: https://issues.apache.org/jira/browse/KYLIN-4805
Project: Kylin
Issue Type: Bug
Affects Versions: v3.1.1
Reporter: Alexander
I am trying to use a COUNT_DISTINCT(HLL) measure with multiple parameters together with the Hive global dictionary.
In the code I found BaseCuboidBuilder.checkHiveGlobalDictionaryColumn:
{code:java}
for (MeasureDesc measure : measureDescList) {
    if (measure.getFunction().getExpression().equalsIgnoreCase(FunctionDesc.FUNC_COUNT_DISTINCT)) {
        FunctionDesc functionDesc = measure.getFunction();
        TblColRef colRef = functionDesc.getParameter().getColRefs().get(0);
        if (mrDictColumnSet.contains(JoinedFlatTable.colName(colRef, true))) {
            functionDesc.setMrDict(true);
            logger.info("Enable hive global dictionary for {}", colRef);
            measure.setFunction(functionDesc);
        }
    }
}
{code}
As far as I can see, only BITMAP with a single parameter is supported here: the check looks at the first parameter column only (getColRefs().get(0)).
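To make the observation concrete, here is a minimal, self-contained sketch of the lookup pattern in the snippet above. It does not use Kylin classes: `Measure`, `enabledByFirstColumn`, and `uncoveredColumns` are hypothetical names introduced only for illustration, and the literal "COUNT_DISTINCT" is assumed to correspond to FunctionDesc.FUNC_COUNT_DISTINCT. It shows that a lookup on the first parameter column alone says nothing about the remaining parameter columns of the measure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MrDictColumnCheck {

    /** Hypothetical stand-in for a measure; this is NOT Kylin's MeasureDesc. */
    public static final class Measure {
        final String expression;
        final List<String> columns; // flat-table column names of the parameters

        public Measure(String expression, List<String> columns) {
            this.expression = expression;
            this.columns = columns;
        }
    }

    /** Mirrors the quoted check: only the first parameter column is looked up. */
    public static boolean enabledByFirstColumn(Measure m, Set<String> mrDictColumns) {
        return m.expression.equalsIgnoreCase("COUNT_DISTINCT")
                && mrDictColumns.contains(m.columns.get(0));
    }

    /** Returns the parameter columns that are not in the dictionary column set. */
    public static List<String> uncoveredColumns(Measure m, Set<String> mrDictColumns) {
        List<String> missing = new ArrayList<>();
        for (String col : m.columns) {
            if (!mrDictColumns.contains(col)) {
                missing.add(col);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Same setup as in this report: one configured column, two measure parameters.
        Set<String> mrDictColumns = new HashSet<>(Arrays.asList("KYLIN_SALES_SELLER_ID"));
        Measure sellerCntHll = new Measure("COUNT_DISTINCT",
                Arrays.asList("KYLIN_SALES_SELLER_ID", "KYLIN_SALES_BUYER_ID"));

        System.out.println("enabled by first column: "
                + enabledByFirstColumn(sellerCntHll, mrDictColumns));
        System.out.println("columns not covered: "
                + uncoveredColumns(sellerCntHll, mrDictColumns));
    }
}
```

With the configuration from this report, the first-column lookup succeeds while KYLIN_SALES_BUYER_ID is never examined. Whether the actual build handles that second column correctly is not something this sketch can show.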
Measure created:
||Name||Expression||Parameters||Return Type||
|SELLER_CNT_HLL|COUNT_DISTINCT|Value: *KYLIN_SALES.SELLER_ID*, Type: *column*; Value: *KYLIN_SALES.BUYER_ID*, Type: *column*|hllc(10)|
Configuration:
kylin.dictionary.mr-hive.columns=KYLIN_SALES_SELLER_ID
Cube build: OK
SQL: SELECT COUNT(DISTINCT seller_id, buyer_id) AS DIST_SELLER FROM kylin_sales;
Hive:
{code:java}
+--------------+
| dist_seller |
+--------------+
| 9998 |
+--------------+
1 row selected (22.947 seconds)
{code}
Kylin:
{code:java}
Results (1)
DIST_SELLER
9983
{code}
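For context when comparing the two results, the sketch below computes the relative difference between the Hive and Kylin counts and sets it against the textbook HyperLogLog standard error of about 1.04/sqrt(2^p). It assumes that the 10 in hllc(10) is the HLL precision p (2^10 registers); that reading, and the class and method names, are assumptions for illustration and are not taken from Kylin code.

```java
public class HllErrorCheck {

    /** Relative difference of an estimate against an exact count. */
    public static double relativeDifference(long exact, long estimate) {
        return Math.abs((double) (exact - estimate)) / exact;
    }

    /** Textbook HyperLogLog standard error for precision p (2^p registers). */
    public static double hllStandardError(int precision) {
        return 1.04 / Math.sqrt((double) (1L << precision));
    }

    public static void main(String[] args) {
        long hiveCount = 9998;  // Hive result from this report
        long kylinCount = 9983; // Kylin result from this report
        int precision = 10;     // assumption: hllc(10) means p = 10

        double diff = relativeDifference(hiveCount, kylinCount);
        double stdErr = hllStandardError(precision);

        System.out.printf("relative difference: %.4f%n", diff);
        System.out.printf("HLL standard error:  %.4f%n", stdErr);
        System.out.println("within one standard error: " + (diff <= stdErr));
    }
}
```

A difference inside the expected HLL error does not by itself show that the multi-column case is handled correctly; it only means that this single query cannot separate an approximation effect from a dictionary problem.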
--
This message was sent by Atlassian Jira
(v8.3.4#803005)