Alexander created KYLIN-4805:
--------------------------------

             Summary: Hive Global Dictionary - COUNT_DISTINCT(HLL) with multiple parameters
                 Key: KYLIN-4805
                 URL: https://issues.apache.org/jira/browse/KYLIN-4805
             Project: Kylin
          Issue Type: Bug
    Affects Versions: v3.1.1
            Reporter: Alexander


I am trying to use COUNT_DISTINCT(HLL) with multiple parameters together with the Hive global dictionary.

In the code I see BaseCuboidBuilder.checkHiveGlobalDictionaryColumn:
{code:java}
        for (MeasureDesc measure : measureDescList) {
            if (measure.getFunction().getExpression().equalsIgnoreCase(FunctionDesc.FUNC_COUNT_DISTINCT)) {
                FunctionDesc functionDesc = measure.getFunction();
                TblColRef colRef = functionDesc.getParameter().getColRefs().get(0);
                if (mrDictColumnSet.contains(JoinedFlatTable.colName(colRef, true))) {
                    functionDesc.setMrDict(true);
                    logger.info("Enable hive global dictionary for {}", colRef);
                    measure.setFunction(functionDesc);
                }
            }
        }
{code}
As far as I can see, only BITMAP with a single parameter is supported here: the check looks only at the first column, getColRefs().get(0).
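
To illustrate the first-column check quoted above, here is a minimal, self-contained sketch. It does not use Kylin classes: plain strings stand in for the flattened column names, and both method names are hypothetical. `firstColumnOnly` makes the same decision as the quoted loop (with an added guard for an empty list), while `singleColumnOnly` is just one possible stricter guard, not a confirmed fix.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MrDictCheckSketch {

    // Same decision as the quoted loop: only the first column of the
    // measure is tested against the configured dictionary columns.
    static boolean firstColumnOnly(List<String> measureColumns, Set<String> mrDictColumns) {
        return !measureColumns.isEmpty() && mrDictColumns.contains(measureColumns.get(0));
    }

    // Hypothetical stricter guard (an assumption, not Kylin code):
    // enable the dictionary only for single-column measures.
    static boolean singleColumnOnly(List<String> measureColumns, Set<String> mrDictColumns) {
        return measureColumns.size() == 1 && mrDictColumns.contains(measureColumns.get(0));
    }

    public static void main(String[] args) {
        // Value of kylin.dictionary.mr-hive.columns from this report.
        Set<String> mrDictColumns = new HashSet<>(Arrays.asList("KYLIN_SALES_SELLER_ID"));
        // Columns of the two-parameter measure from this report.
        List<String> sellerCntHll = Arrays.asList("KYLIN_SALES_SELLER_ID", "KYLIN_SALES_BUYER_ID");

        System.out.println(firstColumnOnly(sellerCntHll, mrDictColumns));  // prints "true"
        System.out.println(singleColumnOnly(sellerCntHll, mrDictColumns)); // prints "false"
    }
}
```

With the setup from this report, the first-column check enables the dictionary flag for the two-column measure even though BUYER_ID is not a configured dictionary column.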

Measure created:

|SELLER_CNT_HLL|COUNT_DISTINCT|Value: *KYLIN_SALES.SELLER_ID*, Type: *column*; Value: *KYLIN_SALES.BUYER_ID*, Type: *column*|hllc(10)|
 
Configuration: *kylin.dictionary.mr-hive.columns*=KYLIN_SALES_SELLER_ID
 
Cube build: OK
SQL: SELECT COUNT(DISTINCT seller_id, buyer_id) AS DIST_SELLER FROM kylin_sales;
Hive: 
{code:java}
+--------------+
| dist_seller  |
+--------------+
| 9998         |
+--------------+
1 row selected (22.947 seconds)
{code}
Kylin:
{code:java}
Results (1)
DIST_SELLER
9983
{code}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
