Re: a graph theory problem

Benjamin Pflugmann Sat, 27 Apr 2002 12:34:54 -0700

Hello.

On Fri, Apr 26, 2002 at 01:43:21AM +0200, [EMAIL PROTECTED] wrote:
> Hello folks,
> 
> here is a pretty original (I think) problem for your minds! :-)


Well, to be true, not really. :-)

> I'd like to store a graph in a MySQL database. By graph, I mean the
> graph theory meaning (i.e., "a set of connections between pairs of
> nodes, which (the connections) may have a direction and/or weight"),
> not the meaning "plot of a value over time" or "graphical
> representation of a function value's dependence on the function
> argument".
> 
> To give an example, say you have a table of products and you want to
> store, for each product, which other products it is compatible with.
> When querying, you would like to retrieve a table where you have
> products in the row headings _and_ in the column headings, with
> either 1's or 0's (for YES or NO) in the cells. So for example, you
> would have printers, toner cartridges, mainboards and processors as
> products. There would be two distinct compatibility matrices:
> printers against toners, and mainboards against processors.

Oh. I thought you had a more complex problem. Because you don't seem
interested in more than one node and it's connections at a time, it is
simply a many-to-many problem.

[...]
> For storing, there seem to be about three possibilities:
> 1) A table with as many columns as there are rows. I don't know what
> is the limit on the number of columns, but as there might be _lots_
> of rows, this doesn't seem like a very good idea.
> 2) A table with a SET column (i.e. "node int, connections set") - this
> might work for lots of rows but not lots of connections from one
> node; also, it doesn't store the "weights" of the connections.
> 3) A table with the columns: (node1 int, node2 int, connection int).
> This is probably the most general way, but also the most space-
> inefficient, I would guess.

The answer from database theory is 3). You describe a many-to-many
relationship and this is represented this way usually.

2) is a replacement for 3), which contradicts theory, and is only
reasonable, if the set is small and doesn't change (which wouldn't be
true in your case), in which case it simplifies the relationships and
the queries.

I think I would never use 1). For any new product you would have to
change the table layout! 


I would start with 3), i.e. a many-to-many table like this (from head,
not tested):

CREATE TABLE compatible
  item INT NOT NULL,
  with INT NOT NULL,
  PRIMARY KEY (item1,item2)
);

This presumes, there is a row only, when you know the items are
compatible. Also, all items are expected to be listed in "item",
i.e. you will have (A,B) and (B,A) in the table if A and B are
compatible.

That is partly redundant and brings a risk for inconsitency, but also
is the more flexible approach (having only (A,B) makes sense in some
cases) and makes the queries simpler.

> For retrieving, AFAIK, SQL doesn't provide any way to transform rows
> into columns, right?

Not that I know of.  

> Which means that in case 3), I would have to do many queries and
> transform the results in a procedural language, right?

Why many queries? 

You are right, you have to reformat this in the application side, but
this is the usual thing to do anyhow.

To retrieve the whole matrix you would do

SELECT i.id, i.name, w.name
FROM   item i, compatible c, item w
WHERE  i.id = c.item AND
       w.id = c.with
ORDER BY i.name

and then loop thorugh the result. A new row of your matrix starts when
i.id changes (you could do without i.id and use i.name, but the way
above also handles the case where two items have the same name for
some reason).

> I would like to pick any product and fetch the whole compatibility
> matrix that contains it.

If you want which products are compatible to an PIII
you would do:

SELECT w.name
FROM   item i, compatible c, item w
WHERE  i.name = 'Pentium III' AND
       c.item = i.id AND
       w.id = c.with;

If you want the matrix for printers and toners this would be

SELECT i.id, i.name, w.name
FROM   item i, compatible c, item w
WHERE  i.type = 'printer' AND
       w.type = 'toner' AND
       i.id = c.item AND
       w.id = c.with
ORDER BY i.name, w.name

Of course, it would be reasonable to have an index on (type, id) of
item. :)

> So, that's it. Any ideas, comments, thoughts, questions? Am I trying
> to solve something that is already solved? Does MySQL have "SELECT
> MATRIX"? :-)))

Well, I don't say that this will be the most reasonable approach to
your problem (which you stated a bit general), but a reaonsable start
and the usual approach.

Bye,

        Benjamin.

-- 
[EMAIL PROTECTED]

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: a graph theory problem

Reply via email to