[Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?
(Meta: Sorry for not replying in-thread -- I'm posting this from another email account.) > I believe you said: > Article has_many Topics > Article has_many Regions > > Is this correct? If it is then you might have something like this: > > ... > article_facts > * topic_id > * region_id > * hit (always 1) > > With this structure you'd be able to aggregate your facts by the > various attributes in both topic and dimension. Is this starting to > look like what you are trying to do or am I still missing something > here? I think you're still missing something. If I do the above, I'm stuck with one of two alternatives: 1. I have to choose a "primary" topic and region for each article, and discard the others, or 2. For each article, I have m*n article_facts, where m is the number of regions and n is the number of topics. The former leads to artificially low numbers (where an article is not counted for all but one of its topics/regions). The latter leads to artificially high numbers (where an article is n-tuply counted for every region, and m-tuply counted for every topic). I suppose I could fudge the numbers a bit, by creating a "weighting" column which 1/sqrt(m*n) (inverse of geometric mean, so the over/undercounting would be right "on average", assuming the number of regions and number of topics are independent variables), but then I'd feel dirty. Perhaps I scared you by referencing an MSDN technet article. The other article (http://www.dbmsmag.com/9808d05.html) was written by Ralph Kimball. :) Looking at the source code more, it looks like it's hard-coded to recognize HierarchicalBridges, and there's no real support for pluggable bridging. (Please correct me if I'm wrong.) If I decide to follow the approach laid out in the above article, I'd have to modify the source code to whatever Aggregate I'm using. Does this sound correct? Feasible? Stupid? I'd rather use ActiveWarehouse if I can, because it looks like there's a lot of knowledge here I could benefit from, and a lot of boilerplate I can save, but I suppose lacking a solution to the above, I'll just start hard coding some aggregate tables. Not that I blame the authors -- it's no fault of yours if none of your data has multivalued dimensions. If there's another way I can granulate the data that gets rid of them but still lets me slice/count the way I'd like, I'm open to that, as well. ___ Activewarehouse-discuss mailing list Activewarehouse-discuss@rubyforge.org http://rubyforge.org/mailman/listinfo/activewarehouse-discuss
Re: [Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?
I believe you said: Article has_many Topics Article has_many Regions Is this correct? If it is then you might have something like this: topic_dimension * id * name * [other attributes] region_dimension * id * name * [other attributes] article_facts * topic_id * region_id * hit (always 1) With this structure you'd be able to aggregate your facts by the various attributes in both topic and dimension. Is this starting to look like what you are trying to do or am I still missing something here? V/r Anthony On Jan 15, 2008 8:41 AM, Devin Mullins <[EMAIL PROTECTED]> wrote: > Anthony, thanks for responding so promptly. > > On Tue, Jan 15, 2008 at 08:03:18AM -0500, Anthony Eden wrote: > > Regions and topics should probably be dimensions and article a fact. > > What I can't figure out is what the measurements are in this (aside > > from counts perhaps) since there is very little to go on. > Yes, counts is it, at least for now. Maybe aggregate hits some day. > Assuming the idea of three fact tables is out, then, the question > remains: How do I implement multivalued dimensions in ActiveWarehouse? > > > you want to minimize or > > completely remove joins to large tables, hence your dimensions should > > be small. > Well, that's a problem. The bridge table would be huge -- more rows than > the fact table. (Though, granted, only two columns -- it's essentially a > :through table.) But saying "no" to the customer isn't really an option > I'd like to take right now... > > Thanks, > Devin > ___ Activewarehouse-discuss mailing list Activewarehouse-discuss@rubyforge.org http://rubyforge.org/mailman/listinfo/activewarehouse-discuss
Re: [Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?
Anthony, thanks for responding so promptly. On Tue, Jan 15, 2008 at 08:03:18AM -0500, Anthony Eden wrote: > Regions and topics should probably be dimensions and article a fact. > What I can't figure out is what the measurements are in this (aside > from counts perhaps) since there is very little to go on. Yes, counts is it, at least for now. Maybe aggregate hits some day. Assuming the idea of three fact tables is out, then, the question remains: How do I implement multivalued dimensions in ActiveWarehouse? > you want to minimize or > completely remove joins to large tables, hence your dimensions should > be small. Well, that's a problem. The bridge table would be huge -- more rows than the fact table. (Though, granted, only two columns -- it's essentially a :through table.) But saying "no" to the customer isn't really an option I'd like to take right now... Thanks, Devin ___ Activewarehouse-discuss mailing list Activewarehouse-discuss@rubyforge.org http://rubyforge.org/mailman/listinfo/activewarehouse-discuss
Re: [Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?
Regions and topics should probably be dimensions and article a fact. What I can't figure out is what the measurements are in this (aside from counts perhaps) since there is very little to go on. Keep in mind that you are developing a database structure that is designed specifically for analytical queries and thus you want to minimize or completely remove joins to large tables, hence your dimensions should be small. V/r Anthony On Jan 15, 2008 1:13 AM, Devin Mullins <[EMAIL PROTECTED]> wrote: > Hi all, > > I'm brand new (as in, this morning) to AW and OLAP, and have come up > against a wall pretty quick. I'm working on a newspaper site, where an > Article has_many :regions and has_many :topics. I see from pages like > http://www.dbmsmag.com/9808d05.html and > http://technet.microsoft.com/en-us/library/ms345139.aspx that the star > schema can be extended with bridges/helpers/snowflakery to support > multivalued dimensions, but I have questions: > 1. A couple of places seemed to argue that the bridge tables defeated > the performance advantages of the star schema, as bridge tables > would have even more rows than fact tables. Agree/disagree? > 2. I suppose I could just make three fact tables, Article, > ArticleRegion, and ArticleTopic, but that's a bunch of added > complexity, and I lose the ability to slice by both topic and region > at once (or by topic grouping, without double-counting). > Agree/disagree? > > And most importantly: > 3. How would I implement such a beast in ActiveWarehouse? I got as far > as `script/generate bridge`. :P Is there cube support for this type > of thing? > > Thanks, > Devin > ___ > Activewarehouse-discuss mailing list > Activewarehouse-discuss@rubyforge.org > http://rubyforge.org/mailman/listinfo/activewarehouse-discuss > ___ Activewarehouse-discuss mailing list Activewarehouse-discuss@rubyforge.org http://rubyforge.org/mailman/listinfo/activewarehouse-discuss
[Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?
Hi all, I'm brand new (as in, this morning) to AW and OLAP, and have come up against a wall pretty quick. I'm working on a newspaper site, where an Article has_many :regions and has_many :topics. I see from pages like http://www.dbmsmag.com/9808d05.html and http://technet.microsoft.com/en-us/library/ms345139.aspx that the star schema can be extended with bridges/helpers/snowflakery to support multivalued dimensions, but I have questions: 1. A couple of places seemed to argue that the bridge tables defeated the performance advantages of the star schema, as bridge tables would have even more rows than fact tables. Agree/disagree? 2. I suppose I could just make three fact tables, Article, ArticleRegion, and ArticleTopic, but that's a bunch of added complexity, and I lose the ability to slice by both topic and region at once (or by topic grouping, without double-counting). Agree/disagree? And most importantly: 3. How would I implement such a beast in ActiveWarehouse? I got as far as `script/generate bridge`. :P Is there cube support for this type of thing? Thanks, Devin ___ Activewarehouse-discuss mailing list Activewarehouse-discuss@rubyforge.org http://rubyforge.org/mailman/listinfo/activewarehouse-discuss