[ 
https://issues.apache.org/jira/browse/KYLIN-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070771#comment-15070771
 ] 

liyang commented on KYLIN-1122:
-------------------------------

Hi Xiaoyu, FYI, Li Dong is extending the API to allow a MeasureType to 
indicate that it should apply only to the base cuboid. He is in his hometown 
preparing for his wedding right now and will post any progress here. :-)

As to your question about the case where a record could contain a huge number 
of raw rows, say 1 million: the first concern is storage, and a dictionary can 
help there. We can build a dictionary on the raw column so that storage holds 
IDs instead of raw values. This should allow storing 3+ million values in one 
HBase cell (assuming the default maximum HBase cell size of 10 MB and 3 bytes 
per dictionary ID). TopN leverages the dictionary in exactly this way. Take a look.
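
To make the arithmetic above concrete, here is a minimal Python sketch. It is 
not Kylin code: the helper names (build_dictionary, encode, decode, 
max_values_per_cell) are hypothetical, and the 10 MB cell limit and fixed 
3-byte ID width are simply the assumptions stated above.

```python
# Illustrative sketch only; names and constants are assumptions, not Kylin APIs.
MAX_CELL_BYTES = 10 * 1024 * 1024  # assumed 10 MB maximum HBase cell size
ID_WIDTH_BYTES = 3                 # assumed fixed-width dictionary ID


def build_dictionary(values):
    """Map each distinct raw value to a small integer ID (in sorted order)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def encode(values, dictionary):
    """Pack the dictionary IDs of `values` into fixed-width big-endian bytes.

    int.to_bytes raises OverflowError if an ID does not fit in ID_WIDTH_BYTES,
    i.e. if the column has more than 2**24 distinct values.
    """
    return b"".join(dictionary[v].to_bytes(ID_WIDTH_BYTES, "big") for v in values)


def decode(cell, dictionary):
    """Recover the raw values, in order, from a packed cell."""
    reverse = {i: v for v, i in dictionary.items()}
    return [
        reverse[int.from_bytes(cell[off:off + ID_WIDTH_BYTES], "big")]
        for off in range(0, len(cell), ID_WIDTH_BYTES)
    ]


def max_values_per_cell():
    """How many fixed-width IDs fit in one cell under the assumptions above."""
    return MAX_CELL_BYTES // ID_WIDTH_BYTES


if __name__ == "__main__":
    raw = ["alice", "bob", "alice", "carol"]
    d = build_dictionary(raw)
    cell = encode(raw, d)
    print(len(cell), decode(cell, d), max_values_per_cell())
```

Each raw value costs 3 bytes in the cell regardless of its original length, 
which is where the "3+ million values" estimate comes from.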

As to a query returning a huge number of rows, well, that's by design. 
The raw values are stored because the user wants to query them, so let all of 
them be returned. Users should know what they are doing before they do it.

> Kylin support detail data query from fact table
> -----------------------------------------------
>
>                 Key: KYLIN-1122
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1122
>             Project: Kylin
>          Issue Type: New Feature
>          Components: Query Engine
>    Affects Versions: v1.2
>            Reporter: Xiaoyu Wang
>            Assignee: liyang
>             Fix For: v2.0, v1.3
>
>         Attachments: 
> 0001-KYLIN-1122-Kylin-support-detail-data-query-from-fact(2.x-staging).patch, 
> 0001-KYLIN-1122-Kylin-support-detail-data-query-from-fact(update-v2-1.x-staging).patch
>
>
> Kylin currently cannot return correct detail rows from the fact table for a query like:
> select column1,column2,column3 from fact_table
> KYLIN-1075 added the "SUM" function to the measure column if one is defined,
> but only numeric column types are supported.
> I changed some code to address this:
> Add a "VALUE" measure function whose output has the same value and data type 
> as its input.
> If you want to query detail data from the fact table, the following is
> *required*:
> 1. Configure every column that is not a dimension as a "VALUE" or "SUM" 
> measure. (A column with no measure function configured returns NULL.)
> 2. The source table must have a unique-value column, configured as a 
> dimension.
> If you have a better solution, please comment here.



