[ 
https://issues.apache.org/jira/browse/KYLIN-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15759986#comment-15759986
 ] 

fengYu commented on KYLIN-2286:
-------------------------------

I don't think it is user-friendly to refresh the lookup tables of all segments 
whenever a cube segment is created. Our solution is to add a cube-level property 
that controls whether this is enabled. If it is enabled, we use a snapshots 
attribute in cubeInstance (which we add) instead of the one in cubeSegment. 
Every time the dictionary is built, it reads the new snapshot from the Hive 
table and the old snapshot from HBase. We merge them by PK to ensure the 
snapshot table only grows, and use the merged lookup table as the input for 
rebuilding the new snapshot table.
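
A minimal sketch of the merge-by-PK step, to make the intent concrete. It is 
not the Kylin code: the function name, the row format (plain dicts), and the 
rule that the new Hive value wins on a PK conflict are assumptions for 
illustration. The sample rows also assume, hypothetically, that id 1 is 
missing from the newest Hive table.

```python
def merge_snapshots(old_rows, new_rows, pk="id"):
    """Merge two lookup-table snapshots by primary key.

    Keys found only in the old snapshot are kept, so the merged table
    never loses a key. Keys found in both take the new row.
    """
    merged = {row[pk]: row for row in old_rows}        # start from the old snapshot
    merged.update({row[pk]: row for row in new_rows})  # new rows overwrite by PK
    return [merged[key] for key in sorted(merged)]


# Old snapshot (as stored in HBase) and new snapshot (as read from Hive).
old_snapshot = [
    {"id": 1, "name": "A"},
    {"id": 2, "name": "B"},
    {"id": 3, "name": "C"},
]
new_snapshot = [
    {"id": 2, "name": "D"},
    {"id": 3, "name": "E"},
]

global_snapshot = merge_snapshots(old_snapshot, new_snapshot)
for row in global_snapshot:
    print(row["id"], row["name"])
```

With these rows, id 1 survives from the old snapshot, and ids 2 and 3 take 
their new names.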

Maybe you can help us review the process. Thanks a lot.

> global snapshot table for one cube 
> -----------------------------------
>
>                 Key: KYLIN-2286
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2286
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: fengYu
>            Assignee: fengYu
>
> In the current version, Kylin builds a snapshot table for each segment, and 
> the snapshots of different segments in the same cube are isolated from each 
> other, even though some segments share the same snapshot table storage.
> In some scenarios we need a global snapshot table for one cube. For example, 
> suppose we have a cube with a snapshot table whose PK is ID. On the first 
> day, the table looks like:
> id name
> 1   A
> 2   B
> 3   C
> The query 'select name, count(1) from fact join dimension group by name' 
> returns:
> A xx
> B xx
> C xx
> On the next day (a new segment), the lookup table has been modified and looks like:
> id name
> 1   A
> 2   D
> 3   E
> The same query now returns:
> A xx
> B xx
> C xx
> D xx
> E xx
> However, B and D share the same ID, as do C and E, and we need only the 
> newest result. So a global snapshot table that is shared by all segments and 
> always holds the newest values is needed.
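
The example in the description can be reproduced with a small simulation. 
This is a simplified model, not Kylin's query engine: each segment is a list 
of fact rows holding a dimension id, and the fact rows themselves are 
hypothetical (one row per id per day). Names are resolved either through 
each segment's own snapshot or through one shared, newest snapshot.

```python
from collections import Counter

# Lookup table contents on day 1 and day 2, as given in the description.
day1_lookup = {1: "A", 2: "B", 3: "C"}
day2_lookup = {1: "A", 2: "D", 3: "E"}

# Hypothetical fact rows (dimension ids) for each day's segment.
day1_facts = [1, 2, 3]
day2_facts = [1, 2, 3]


def count_by_name(segments):
    """segments: list of (fact_ids, lookup) pairs. Returns {name: count}."""
    counts = Counter()
    for fact_ids, lookup in segments:
        for dim_id in fact_ids:
            counts[lookup[dim_id]] += 1
    return dict(counts)


# Per-segment snapshots: each segment resolves names with its own table.
per_segment = count_by_name([(day1_facts, day1_lookup), (day2_facts, day2_lookup)])

# Global snapshot: every segment resolves names with the newest table.
global_view = count_by_name([(day1_facts, day2_lookup), (day2_facts, day2_lookup)])

print(sorted(per_segment))  # names seen with per-segment snapshots
print(sorted(global_view))  # names seen with one global snapshot
```

With per-segment snapshots the result contains both the old and the new 
names for ids 2 and 3. With one global snapshot only the newest names remain.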



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
