[ 
https://issues.apache.org/jira/browse/KYLIN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252078#comment-15252078
 ] 

Richard Calaba commented on KYLIN-1313:
---------------------------------------

Although this issue has just been resolved, I would like to understand how 
deriving from the fact table itself is handled when there are multiple 
candidates for resolution. Does it at least produce an error? Fact data 
(transactions) is not always clean, so multiple candidates can occur.

An example:

POS data:

 StoreID, ItemID, Item Name, Quantity, Price

    1, 1, Coca Cola, 1, 5
    2, 1, Coke, 2, 10

Though the PLU (ItemID) is correct and unique, it maps to 2 different names 
(depending on the store that produces the descriptive name).

If you have such a situation, then the derived dimension as described above is 
not correct. Of course you can use the field Item_Name as a normal dimension to 
produce correct query results. But if you allow such functionality, it would be 
beneficial for the cube data load process to check for a unique mapping between 
ItemID and ItemName.
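
As a rough illustration of the kind of check meant here, the following Python 
sketch scans rows and reports every host value that maps to more than one 
derived value. It is not Kylin code: the function name 
find_ambiguous_mappings, the dict-per-row layout and the column name ItemName 
are assumptions made only for this example.

```python
from collections import defaultdict


def find_ambiguous_mappings(rows, host_col, derived_col):
    """Return {host value: sorted derived values} for every host value
    that maps to more than one distinct derived value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[host_col]].add(row[derived_col])
    return {host: sorted(values) for host, values in seen.items() if len(values) > 1}


# The two POS rows from the example above (hypothetical in-memory layout).
pos_rows = [
    {"StoreID": 1, "ItemID": 1, "ItemName": "Coca Cola", "Quantity": 1, "Price": 5},
    {"StoreID": 2, "ItemID": 1, "ItemName": "Coke", "Quantity": 2, "Price": 10},
]

conflicts = find_ambiguous_mappings(pos_rows, "ItemID", "ItemName")
print(conflicts)  # {1: ['Coca Cola', 'Coke']}
```

A load process could fail, or at least log a warning, whenever the returned 
dict is non-empty, instead of silently picking one of the candidates.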




> Enable deriving dimensions on non PK/FK
> ---------------------------------------
>
>                 Key: KYLIN-1313
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1313
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: hongbin ma
>            Assignee: hongbin ma
>             Fix For: v1.5.2
>
>
> Currently a derived column has to be a column on the lookup table, and the 
> derived host column has to be PK/FK (this is also a problem when the lookup 
> table grows very large). Sometimes columns on the fact table exhibit a 
> deriving relationship too. Here's an example fact table:
> (dt date, seller_id bigint, seller_name varchar(100) , item_id bigint, 
> item_url varchar(1000), count decimal, price decimal)
> seller_name is uniquely determined by each seller_id, and item_url is 
> uniquely determined by each item_id. Users do not expect to filter on 
> columns like seller_name or item_url; they just want to retrieve them when 
> they group or filter on other dimensions like seller_id, item_id or even 
> dt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
