[ 
https://issues.apache.org/jira/browse/OFBIZ-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032859#comment-17032859
 ] 

Jacopo Cappellato commented on OFBIZ-10953:
-------------------------------------------

I know that in the third edition of the book the author introduced the 
possibility of an exception for the Date dimension:
{quote}A dimension table is designed with one column serving as a unique 
primary key. The primary key cannot be the operational system's natural key 
[...] These dimension surrogate keys are simple integers, assigned in sequence, 
starting with the value 1, every time a new key is needed. The date dimension 
is exempt from the surrogate key rule; this highly predictable and stable 
dimension can use a more meaningful primary key [...]. 
{quote}
However it is made clear that this exception should not be a shortcut to 
provide meaning to date fields in fact tables and instead can be introduced:
{quote}*To facilitate partitioning*, the primary key of a date dimension can be 
more meaningful, such as an integer representing YYYYMMDD, instead of 
sequentially-assigned surrogate key.
{quote}
With that said, the golden rule is always the same:
{quote}Every join between dimension and fact tables in the data warehouse 
should be based on meaningless integer surrogate keys. You should avoid using 
the natural operational production codes. None of the data warehouse keys 
should be smart, where you can tell something about the row just by looking at 
the key.
{quote}
Pierre wrote:
{quote}The latter means that in a production infrastructure the using company 
is penalised (performance and cost-wise) with an additional query (for the 
*origCurrencyDimId*) to the currency dimension to retrieve the underlying 
explanation/meaning (EUR). And similarly for the *quantityUomDimId* and other 
generically defined measurements (e.g. in the examples the  *invoiceDateDimId*).
{quote}
No additional query is required to fetch, for example, date information; 
instead the star schema should include an additional join to link the fact 
table to the Date dimension table. This would not increase the number of 
records produces (which is upper limited by the rows and in fact table): star 
schemas are designed like this to perform best with single queries with several 
joins (from one fact to many dimensions).

In my opinion, rather than relaxing the "surrogate key" rule for dimension 
tables by introducing exceptions to it, it would be better to improve the 
implementation of surrogate keys in our BI component: in fact they are not 
currently implemented as integer sequential numbers and are instead strings. 
Switching to sequential integers would greatly improve performance and shrink 
the size of many tables in the OLAP database. When the BI component was 
created, it was decided to use strings simply because it was a prototype (not 
intended for production), and since there is no support for a db independent 
way to generate integer sequences in OFBiz, we ended up using the available 
sequence generator utility for strings.

 

 

 

> have CurrencyDimension have a dimensionId that is based on the natural key
> --------------------------------------------------------------------------
>
>                 Key: OFBIZ-10953
>                 URL: https://issues.apache.org/jira/browse/OFBIZ-10953
>             Project: OFBiz
>          Issue Type: Improvement
>          Components: bi
>    Affects Versions: Trunk, Release Branch 17.12, Release Branch 18.12
>            Reporter: Pierre Smits
>            Assignee: Pierre Smits
>            Priority: Major
>              Labels: CurrencyDimension, birt, currency, dimension, dwh
>         Attachments: OFBIZ-10953-BI.patch
>
>
> Currently the record sequencer (delegator.getNextSeqId) is used to determine 
> the dimensionId for the CurrencyDimension. This is unnecessary as the uomId 
> from the UOM table can be used for currency.
> It also makes it easier to set the foreign-key in fact tables by generating 
> it based on the date provided, than by retrieving the dimensionId based on a 
> retrieval through the getDimensionIdFromNaturalKey service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to