Whats the best way to manage 'generic two-way relationships'?

2003-01-17 Thread Hoffman, Geoffrey
I couldn't think of a better term for what I want to do...

I am building a CMS for news stories. Often news stories are related to each
other, or need to be grouped by content. I need a simple way to manage
storyId relationships - in both directions. 

Thinking about the business logic to implement this is a bit confusing,
because if you say story f is related to story b, and b is already related
to c,d,e, then do I need aditional records for f-c, f-d, f-e? It gets worse
if story f is related to g,h,i, and k is related to l,m,n and then you
relate f to k... then you'd have to insert additional 9 (or would it be 12)
records for the cross-product of all combinations of all the relationships.
Or would you? And I can't think of how you'd handle this at the DB level
(I'd have to loop in PHP and have 9 inserts?)

My first thought was:

CREATE TABLE storyRelationship(
  relIdINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  storyId1 INT UNSIGNED NOT NULL INDEX,
  storyId2 INT UNSIGNED NOT NULL INDEX
);

so, for a given target story ID I would need a query that has something
like...
... WHERE storyId1 = targetId OR storyId2 = targetId


Another way I thought of has its own problems -

CREATE TABLE storyRelationship (
  relId INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  storyIdList TEXT 
};

of which a record would look something like:
relId 5
storyIdList '46,59,89,94,213'

but then I would need to do a lot of string parsing to get related stories,
and I have to believe that a query with a 'LIKE %id%' clause is going to be
a lot slower than one with an OR looking on two indexed rows.

Any suggestions?



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: Whats the best way to manage 'generic two-way relationships'?

2003-01-17 Thread Brian Lindner
Geoffrey,

Friday, January 17, 2003, 12:01:20 PM, you wrote:
 I couldn't think of a better term for what I want to do...

 I am building a CMS for news stories. Often news stories are related to each
 other, or need to be grouped by content. I need a simple way to manage
 storyId relationships - in both directions. 

Their is no easy answer..

 Thinking about the business logic to implement this is a bit confusing,
 because if you say story f is related to story b, and b is already related
 to c,d,e, then do I need aditional records for f-c, f-d, f-e?

This would depend on your business needs... if f relates to b..  do
you assume c, d, e  relations to f are equal to b's??  if so.. then
inserting the extra rows could be valid.  Or you could look at it as
f relates to b.. b relates to c,d,eand NOT add the inserts then
you could have Levels of relation (degrees of separation) and have a
script that looks throught the heirarchy.  You would have to watch for
infinate looks as you could have an  e relates to f row somewhere
which would go on forever.


 It gets worse if story f is related to g,h,i, and k is related to l,m,n and then you
 relate f to k... then you'd have to insert additional 9 (or would it be 12)
 records for the cross-product of all combinations of all the relationships.
 Or would you? And I can't think of how you'd handle this at the DB level
 (I'd have to loop in PHP and have 9 inserts?)

Having the extra inserts in the DB can be a pain.. BUT at runtime it
will be faster as all the combinations are made already.. and you done need to
parse the tree heirarchy of relations   (f - b,  b - c,d,e)
but again.. all depends on how you want to treat things

* extra rows: will speed up runtime usage.. but extra storage and
inserts
* parse heirarchy (recursive code): slower at runtime.. but you would
not have to worry about maintaining the entire heirarchy in the db
table

think of it like this.. 9+ extra inserts when a story is saved.. OR a
bunch of different SQL selects at runtime EVERY TIME the story gets
views (if you were displaying related story info that is)

// to get related articles without storing values
select stroyid2 from storyRelationship where storyid1 = f

then have to loop throught that resultset for each storyid2 to get
what it relates to.. then loop through those,, and so on (recursively)


Personally, i like the extra storage of the rows.. sql is easier
(selects) and faster at runtime usually

 CREATE TABLE storyRelationship(
   relIdINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
   storyId1 INT UNSIGNED NOT NULL INDEX,
   storyId2 INT UNSIGNED NOT NULL INDEX
 );

an example:

I would add a   separation_level INT UNSIGNED NOT NULL INDEX to the
table.  This would allow you to query and get everything that relates
to f.. Directly.. or at any separation level.  Then you build all the
relationships like And yes, to build it you would have to do some
recursive calls possible.. but at least only once for each save (not
for every view)

in your example (f - b,k  b - c,d,e , k -  l,m)

relid  stroyid1 storyid2  separation_level
1  fb 0
2  bc 0
3  bd 0
4  be 0
5  fc 1
6  fd 1
7  fe 1
8  kl 0
9  km 0
10 fk 0
11 fl 2
12 fm 2


to get direct relations to f
Select storyid2 from storyRelationship where storyid1 = f and
separation_level = 0

or remove the separateion_level = 0 clause.. and it will return all
related stories reguardless of how.
I would use the separation_level to at least Rank how important a
story is related to f..  just order by seperation_level so direct relations
are first..

The only thing you loose by having all the compinations in the table..
is HOW the relation was made... no way to tell that the reason f
relates to c,d,e was because b related to them... If you use the code
heirarchy parser, you could maintain that kind of info.. but you
probably done need it i think (depends on what your business needs
are)

Brian


 so, for a given target story ID I would need a query that has something
 like...
 ... WHERE storyId1 = targetId OR storyId2 = targetId


 Another way I thought of has its own problems -

 CREATE TABLE storyRelationship (
   relId INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
   storyIdList TEXT 
 };

 of which a record would look something like:
 relId 5
 storyIdList '46,59,89,94,213'

 but then I would need to do a lot of string parsing to get related stories,
 and I have to believe that a query with a 'LIKE %id%' clause is going to be
 a lot slower than one with an OR looking on two indexed rows.

 Any suggestions?



 -
 Before posting, please check:
http://www.mysql.com/manual.php   (the 

Re[2]: Whats the best way to manage 'generic two-way relationships'?

2003-01-17 Thread Brian Lindner
oops small mistake in my prev post..

 I meant to say  (f - b  b- c,d,e, Kk- l,m) in my example
 NOT  f-b,k  b-c,d,e

lol.. got a little confused with all those insert examples ;)

Friday, January 17, 2003, 2:01:22 PM, you wrote:
 Geoffrey,

 Friday, January 17, 2003, 12:01:20 PM, you wrote:
 I couldn't think of a better term for what I want to do...

 I am building a CMS for news stories. Often news stories are related to each
 other, or need to be grouped by content. I need a simple way to manage
 storyId relationships - in both directions. 

 Their is no easy answer..

 Thinking about the business logic to implement this is a bit confusing,
 because if you say story f is related to story b, and b is already related
 to c,d,e, then do I need aditional records for f-c, f-d, f-e?

 This would depend on your business needs... if f relates to b..  do
 you assume c, d, e  relations to f are equal to b's??  if so.. then
 inserting the extra rows could be valid.  Or you could look at it as
 f relates to b.. b relates to c,d,eand NOT add the inserts then
 you could have Levels of relation (degrees of separation) and have a
 script that looks throught the heirarchy.  You would have to watch for
 infinate looks as you could have an  e relates to f row somewhere
 which would go on forever.


 It gets worse if story f is related to g,h,i, and k is related to l,m,n and then you
 relate f to k... then you'd have to insert additional 9 (or would it be 12)
 records for the cross-product of all combinations of all the relationships.
 Or would you? And I can't think of how you'd handle this at the DB level
 (I'd have to loop in PHP and have 9 inserts?)

 Having the extra inserts in the DB can be a pain.. BUT at runtime it
 will be faster as all the combinations are made already.. and you done need to
 parse the tree heirarchy of relations   (f - b,  b - c,d,e)
 but again.. all depends on how you want to treat things

 * extra rows: will speed up runtime usage.. but extra storage and
 inserts
 * parse heirarchy (recursive code): slower at runtime.. but you would
 not have to worry about maintaining the entire heirarchy in the db
 table

 think of it like this.. 9+ extra inserts when a story is saved.. OR a
 bunch of different SQL selects at runtime EVERY TIME the story gets
 views (if you were displaying related story info that is)

 // to get related articles without storing values
 select stroyid2 from storyRelationship where storyid1 = f

 then have to loop throught that resultset for each storyid2 to get
 what it relates to.. then loop through those,, and so on (recursively)


 Personally, i like the extra storage of the rows.. sql is easier
 (selects) and faster at runtime usually

 CREATE TABLE storyRelationship(
   relIdINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
   storyId1 INT UNSIGNED NOT NULL INDEX,
   storyId2 INT UNSIGNED NOT NULL INDEX
 );

 an example:

 I would add a   separation_level INT UNSIGNED NOT NULL INDEX to the
 table.  This would allow you to query and get everything that relates
 to f.. Directly.. or at any separation level.  Then you build all the
 relationships like And yes, to build it you would have to do some
 recursive calls possible.. but at least only once for each save (not
 for every view)

 in your example (f - b,k  b - c,d,ek -  l,m)

FIX!!: I meant to say  (f - b   b-  c,d,e, K k- l,m)


 relid  stroyid1 storyid2  separation_level
 1  fb 0
 2  bc 0
 3  bd 0
 4  be 0
 5  fc 1
 6  fd 1
 7  fe 1
 8  kl 0
 9  km 0
 10 fk 0
 11 fl 2
 12 fm 2




 to get direct relations to f
 Select storyid2 from storyRelationship where storyid1 = f and
 separation_level = 0

 or remove the separateion_level = 0 clause.. and it will return all
 related stories reguardless of how.
 I would use the separation_level to at least Rank how important a
 story is related to f..  just order by seperation_level so direct relations
 are first..

 The only thing you loose by having all the compinations in the table..
 is HOW the relation was made... no way to tell that the reason f
 relates to c,d,e was because b related to them... If you use the code
 heirarchy parser, you could maintain that kind of info.. but you
 probably done need it i think (depends on what your business needs
 are)

 Brian


 so, for a given target story ID I would need a query that has something
 like...
 ... WHERE storyId1 = targetId OR storyId2 = targetId


 Another way I thought of has its own problems -

 CREATE TABLE storyRelationship (
   relId INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
   storyIdList TEXT 
 };

 of which a record would look something like:
 relId 5
 storyIdList '46,59,89,94,213'

 but then I would 

Re: Whats the best way to manage 'generic two-way relationships'?

2003-01-17 Thread Gelu Gogancea
Hi,
- Original Message -
From: Hoffman, Geoffrey [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, January 17, 2003 7:01 PM
Subject: Whats the best way to manage 'generic two-way relationships'?


 I couldn't think of a better term for what I want to do...

 I am building a CMS for news stories. Often news stories are related to
each
 other, or need to be grouped by content. I need a simple way to manage
 storyId relationships - in both directions.

 Thinking about the business logic to implement this is a bit confusing,
 because if you say story f is related to story b, and b is already related
 to c,d,e, then do I need aditional records for f-c, f-d, f-e? It gets
worse
 if story f is related to g,h,i, and k is related to l,m,n and then you
 relate f to k... then you'd have to insert additional 9 (or would it be
12)
 records for the cross-product of all combinations of all the
relationships.
 Or would you? And I can't think of how you'd handle this at the DB level
 (I'd have to loop in PHP and have 9 inserts?)

 My first thought was:

 CREATE TABLE storyRelationship(
   relIdINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
   storyId1 INT UNSIGNED NOT NULL INDEX,
   storyId2 INT UNSIGNED NOT NULL INDEX

In my opinion, is not too well(i don't wish to said it's a big mistake).In
this way you will not be able to have data flexibility.The cardinality of
your related stories will be 2 for ever.
I think you can solve your situation with 5 tables.
1 table for data attribute(category,etc).CATEGORY
2 table for stories - i think in this way because you have 2 entities of
relations : TO_RELATION and FROM_RELATION
2 index tables : one table for every relation table which can contain
interchangeable  ID's of stories:IDX_TO_RELATION which that contain ID's
from FROM_RELATION and IDX_FROM_RELATION which that contain ID's from
TO_RELATION
A little schema (please maximize):


FROM_RELATION(FIELDS:
IDFROM,IDCAT..) IDX_FROM_RELATION(FIELDS: IDFROM,IDRELATED)

CATEGORY(FIELDS: IDCAT...)

TO_RELATION(FIELDS:
IDTO,IDCAT,.)--IDX_TO_RELATION(FIELDS: IDTO,IDRELATED)

relation schema for to_relation can be : CATEGORY.IDCAT-TO_RELATION.IDCAT

TO_RELATION-IDX_TO_RELATION.IDTO

IDX_TO_RELATION.IDRELATED-FROM_RELATIONS.IDFROM


..and for from_relation can be: CATEGORY.IDCAT-FROM_RELATION.IDCAT

FROM_RELATION.IDFROM-IDX_FROM_RELATION.IDFROM

IDX_FROM_RELATION.IDRELATED-TO_RELATION.IDTO

...it's very untested what i drawing here but i think it's a good start.
I hope it help

Best regards,

Gelu

 );

 so, for a given target story ID I would need a query that has something
 like...
 ... WHERE storyId1 = targetId OR storyId2 = targetId


 Another way I thought of has its own problems -

 CREATE TABLE storyRelationship (
   relId INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
   storyIdList TEXT
 };

 of which a record would look something like:
 relId 5
 storyIdList '46,59,89,94,213'

 but then I would need to do a lot of string parsing to get related
stories,
 and I have to believe that a query with a 'LIKE %id%' clause is going to
be
 a lot slower than one with an OR looking on two indexed rows.

 Any suggestions?



 -
 Before posting, please check:
http://www.mysql.com/manual.php   (the manual)
http://lists.mysql.com/   (the list archive)

 To request this thread, e-mail [EMAIL PROTECTED]
 To unsubscribe, e-mail [EMAIL PROTECTED]
 Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php