[ https://issues.apache.org/jira/browse/KYLIN-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382355#comment-15382355 ]
hongbin ma commented on KYLIN-1855:
-----------------------------------

Hi yanghong, correct me if I'm wrong.

Suppose we have a fact table F and two lookup tables, X and Y. Suppose F has 10000 rows, "F inner join X" produces 9000 rows, and "F inner join X inner join Y" produces 8000 rows.

Now we have a model M whose join relationship is "F inner join X inner join Y". Next we have a cube C that contains only F and X. When we build cube C, the Hive SQL currently contains "F inner join X inner join Y", which tries to keep all cubes in M on the same "star schema". So the "real fact table" for cube C will contain 8000 rows.

Now comes the issue: when a user queries "select count(*) from F inner join X", it returns 8000, which is wrong from a DBMS perspective, since that join on its own produces 9000 rows. A smarter user will query "select count(*) from F inner join X inner join Y"; however, Kylin will throw "no realization found" because the CubeCapabilityChecker currently sees that cube C does not include table Y. That's the issue.

If we want to enforce that all cubes under the same model have a uniform star schema, i.e. each cube fully respects all the inner joins, we should relax the CubeCapabilityChecker. Otherwise, if we allow some cubes to see a "real fact table" of 8000 rows and some to see one of 9000 rows, then we need to merge yanghong's patch.

To me, the first makes more sense.

> Should exclude those joins in whose related lookup tables no dimensions are
> used in cube
> ----------------------------------------------------------------------------------------
>
>                 Key: KYLIN-1855
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1855
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: Zhong Yanghong
>            Assignee: Zhong Yanghong
>         Attachments: exclude_unused_joins.patch
>
>
> A cube is based on a model in which a star schema is defined. In some cases,
> the cube utilizes only a few lookup tables rather than all. In this case,
> when creating the sql for the flat table, those lookup tables should not be
> included. Otherwise, it will confuse users when they query. If users query
> according to the definition of the flat table, a "no realization" error will
> occur due to the lack of the related join.
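
The row counts in the comment above can be reproduced with a small script. This is only an illustrative sketch: it uses an in-memory SQLite database rather than Hive or Kylin, and the column names (x_id, y_id) and the key distribution are assumptions chosen so that the three counts come out to 10000, 9000 and 8000. It shows why a flat table built from "F inner join X inner join Y" cannot give the right answer for a query that joins only F and X.

```python
import sqlite3


def build_demo(conn):
    """Create F, X and Y so that the joins drop rows as in the comment.

    Column names and key values are hypothetical, picked for illustration.
    """
    cur = conn.cursor()
    cur.execute("CREATE TABLE F (id INTEGER PRIMARY KEY, x_id INTEGER, y_id INTEGER)")
    cur.execute("CREATE TABLE X (x_id INTEGER PRIMARY KEY)")
    cur.execute("CREATE TABLE Y (y_id INTEGER PRIMARY KEY)")
    # 10000 fact rows; both foreign keys cycle through 0..9.
    cur.executemany(
        "INSERT INTO F VALUES (?, ?, ?)",
        [(i, i % 10, i % 10) for i in range(10000)],
    )
    # X lacks key 9, so the join with X drops 1000 fact rows.
    cur.executemany("INSERT INTO X VALUES (?)", [(k,) for k in range(9)])
    # Y lacks keys 8 and 9, so the join with Y drops another 1000 rows.
    cur.executemany("INSERT INTO Y VALUES (?)", [(k,) for k in range(8)])
    conn.commit()


def count(conn, sql):
    """Run a COUNT(*) query and return the single integer result."""
    return conn.execute(sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
build_demo(conn)

f_rows = count(conn, "SELECT COUNT(*) FROM F")
fx_rows = count(conn, "SELECT COUNT(*) FROM F INNER JOIN X ON F.x_id = X.x_id")
fxy_rows = count(
    conn,
    "SELECT COUNT(*) FROM F "
    "INNER JOIN X ON F.x_id = X.x_id "
    "INNER JOIN Y ON F.y_id = Y.y_id",
)

print(f_rows, fx_rows, fxy_rows)
```

A cube whose flat table is built from the three-way join can only ever see the rows counted by fxy_rows, so a query over F and X alone that is answered from that cube is short by fx_rows - fxy_rows rows.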