[Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?

2008-01-15 Thread Devin Mullins
(Meta: Sorry for not replying in-thread -- I'm posting this from
another email account.)

> I believe you said:
> Article has_many Topics
> Article has_many Regions
>
> Is this correct? If it is then you might have something like this:
>
> ...
> article_facts
> * topic_id
> * region_id
> * hit (always 1)
>
> With this structure you'd be able to aggregate your facts by the
> various attributes in both topic and dimension. Is this starting to
> look like what you are trying to do or am I still missing something
> here?

I think you're still missing something. If I do the above, I'm stuck
with one of two alternatives:
1. I have to choose a "primary" topic and region for each article, and
discard the others, or
2. For each article, I have m*n article_facts, where m is the number
of regions and n is the number of topics.

The former leads to artificially low numbers (where an article is not
counted for all but one of its topics/regions). The latter leads to
artificially high numbers (where an article is n-tuply counted for
every region, and m-tuply counted for every topic).

I suppose I could fudge the numbers a bit, by creating a
"weighting" column which 1/sqrt(m*n) (inverse of geometric mean, so
the over/undercounting would be right "on average", assuming the
number of regions and number of topics are independent variables), but
then I'd feel dirty.

Perhaps I scared you by referencing an MSDN technet article. The other
article (http://www.dbmsmag.com/9808d05.html) was written by Ralph
Kimball. :)

Looking at the source code more, it looks like it's hard-coded to
recognize HierarchicalBridges, and there's no real support for
pluggable bridging. (Please correct me if I'm wrong.) If I decide to
follow the approach laid out in the above article, I'd have to modify
the source code to whatever Aggregate I'm using. Does this sound
correct? Feasible? Stupid?

I'd rather use ActiveWarehouse if I can, because it looks like there's
a lot of knowledge here I could benefit from, and a lot of boilerplate
I can save, but I suppose lacking a solution to the above, I'll just
start hard coding some aggregate tables. Not that I blame the authors
-- it's no fault of yours if none of your data has multivalued
dimensions. If there's another way I can granulate the data that gets
rid of them but still lets me slice/count the way I'd like, I'm open
to that, as well.
___
Activewarehouse-discuss mailing list
Activewarehouse-discuss@rubyforge.org
http://rubyforge.org/mailman/listinfo/activewarehouse-discuss


Re: [Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?

2008-01-15 Thread Anthony Eden
I believe you said:

Article has_many Topics
Article has_many Regions

Is this correct? If it is then you might have something like this:

topic_dimension
* id
* name
* [other attributes]

region_dimension
* id
* name
* [other attributes]

article_facts
* topic_id
* region_id
* hit (always 1)

With this structure you'd be able to aggregate your facts by the
various attributes in both topic and dimension. Is this starting to
look like what you are trying to do or am I still missing something
here?

V/r
Anthony

On Jan 15, 2008 8:41 AM, Devin Mullins <[EMAIL PROTECTED]> wrote:
> Anthony, thanks for responding so promptly.
>
> On Tue, Jan 15, 2008 at 08:03:18AM -0500, Anthony Eden wrote:
> > Regions and topics should probably be dimensions and article a fact.
> > What I can't figure out is what the measurements are in this (aside
> > from counts perhaps) since there is very little to go on.
> Yes, counts is it, at least for now. Maybe aggregate hits some day.
> Assuming the idea of three fact tables is out, then, the question
> remains: How do I implement multivalued dimensions in ActiveWarehouse?
>
> > you want to minimize or
> > completely remove joins to large tables, hence your dimensions should
> > be small.
> Well, that's a problem. The bridge table would be huge -- more rows than
> the fact table. (Though, granted, only two columns -- it's essentially a
> :through table.) But saying "no" to the customer isn't really an option
> I'd like to take right now...
>
> Thanks,
> Devin
>
___
Activewarehouse-discuss mailing list
Activewarehouse-discuss@rubyforge.org
http://rubyforge.org/mailman/listinfo/activewarehouse-discuss


Re: [Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?

2008-01-15 Thread Devin Mullins
Anthony, thanks for responding so promptly.

On Tue, Jan 15, 2008 at 08:03:18AM -0500, Anthony Eden wrote:
> Regions and topics should probably be dimensions and article a fact.
> What I can't figure out is what the measurements are in this (aside
> from counts perhaps) since there is very little to go on.
Yes, counts is it, at least for now. Maybe aggregate hits some day.
Assuming the idea of three fact tables is out, then, the question
remains: How do I implement multivalued dimensions in ActiveWarehouse?

> you want to minimize or
> completely remove joins to large tables, hence your dimensions should
> be small.
Well, that's a problem. The bridge table would be huge -- more rows than
the fact table. (Though, granted, only two columns -- it's essentially a
:through table.) But saying "no" to the customer isn't really an option
I'd like to take right now...

Thanks,
Devin
___
Activewarehouse-discuss mailing list
Activewarehouse-discuss@rubyforge.org
http://rubyforge.org/mailman/listinfo/activewarehouse-discuss


Re: [Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?

2008-01-15 Thread Anthony Eden
Regions and topics should probably be dimensions and article a fact.
What I can't figure out is what the measurements are in this (aside
from counts perhaps) since there is very little to go on. Keep in mind
that you are developing a database structure that is designed
specifically for analytical queries and thus you want to minimize or
completely remove joins to large tables, hence your dimensions should
be small.

V/r
Anthony

On Jan 15, 2008 1:13 AM, Devin Mullins <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I'm brand new (as in, this morning) to AW and OLAP, and have come up
> against a wall pretty quick. I'm working on a newspaper site, where an
> Article has_many :regions and has_many :topics. I see from pages like
> http://www.dbmsmag.com/9808d05.html and
> http://technet.microsoft.com/en-us/library/ms345139.aspx that the star
> schema can be extended with bridges/helpers/snowflakery to support
> multivalued dimensions, but I have questions:
>  1. A couple of places seemed to argue that the bridge tables defeated
> the performance advantages of the star schema, as bridge tables
> would have even more rows than fact tables. Agree/disagree?
>  2. I suppose I could just make three fact tables, Article,
> ArticleRegion, and ArticleTopic, but that's a bunch of added
> complexity, and I lose the ability to slice by both topic and region
> at once (or by topic grouping, without double-counting).
> Agree/disagree?
>
> And most importantly:
>  3. How would I implement such a beast in ActiveWarehouse? I got as far
> as `script/generate bridge`. :P Is there cube support for this type
> of thing?
>
> Thanks,
> Devin
> ___
> Activewarehouse-discuss mailing list
> Activewarehouse-discuss@rubyforge.org
> http://rubyforge.org/mailman/listinfo/activewarehouse-discuss
>
___
Activewarehouse-discuss mailing list
Activewarehouse-discuss@rubyforge.org
http://rubyforge.org/mailman/listinfo/activewarehouse-discuss


[Activewarehouse-discuss] Fact/Dimension Many-to-Many Relationship (i.e. multivalued dimensions)?

2008-01-14 Thread Devin Mullins
Hi all,

I'm brand new (as in, this morning) to AW and OLAP, and have come up
against a wall pretty quick. I'm working on a newspaper site, where an
Article has_many :regions and has_many :topics. I see from pages like
http://www.dbmsmag.com/9808d05.html and
http://technet.microsoft.com/en-us/library/ms345139.aspx that the star
schema can be extended with bridges/helpers/snowflakery to support
multivalued dimensions, but I have questions:
 1. A couple of places seemed to argue that the bridge tables defeated
the performance advantages of the star schema, as bridge tables
would have even more rows than fact tables. Agree/disagree?
 2. I suppose I could just make three fact tables, Article,
ArticleRegion, and ArticleTopic, but that's a bunch of added
complexity, and I lose the ability to slice by both topic and region
at once (or by topic grouping, without double-counting).
Agree/disagree?

And most importantly:
 3. How would I implement such a beast in ActiveWarehouse? I got as far
as `script/generate bridge`. :P Is there cube support for this type
of thing?

Thanks,
Devin
___
Activewarehouse-discuss mailing list
Activewarehouse-discuss@rubyforge.org
http://rubyforge.org/mailman/listinfo/activewarehouse-discuss