Hobbs, Richard wrote:
> Hello,
>
> I have two tables - one containing messages, and another containing links
> between messages and other messages in a tree structure (much like a
> threaded mailing list archiving thing). A single message can have
> multiple "parents" though, meaning the links table can have several
> entries for a single message.
> For example:
>
> ----------------
> child    parent
> 10        5
> 10        7
> 11        5
> 12        7
> 13        5
> 13        7
> ----------------
>
> I would like to display all messages which match both 5 and 7 in terms of
> the parent_id, meaning messages 10 and 13 would be displayed.
>
> I have used the following query:
>
> ------------------------------------------------------------
> SELECT DISTINCT message.username,message.content
> FROM message,links WHERE links.child_id = message.id AND (
> links.parent_id = 5 OR links.parent_id = 7
> );
> ------------------------------------------------------------
>
> NOTE: Without the word DISTINCT, if this query finds a message that
> matches both 5 AND 7, it will display the message twice. I have obviously
> used DISTINCT as an easy way to get around this problem.
>
> However, this query displays the message if it matches 5 OR 7. I only
> want it to be displayed if it matches 5 AND 7.
>
> However, if i change the word "OR" to "AND", it displays no message at
> all!
>
> I presume this is because it finds two instances of each message, neither
> of which match both 5 and 7 (i.e. the first instance matches 5, but not
> 7, and the second instance matches 7, but not 5).
>
> Does anyone know how i can get around this problem?
>
> Thanks in advance to anyone who can help! :-)
>
> Richard.


Peter Brawley wrote:
> Richard
>
> >I would like to display all messages which match both 5 and 7 in terms
> >of the parent_id, meaning messages 10 and 13 would be displayed.
>
> SELECT f1.child
> FROM foo AS f1
> INNER JOIN foo AS f2 USING(child)
> WHERE  f1.parent=5 AND f2.parent=7;
>
> PB

Hobbs, Richard wrote:
> Hello,
>
> Perfect :-)
>
> Thank you,
> Hobbs.

Peter's solution is a self-join.  Here it is translated to your tables:

  SELECT message.username, message.content
  FROM message
  JOIN links l1 ON l1.child_id = message.id
  JOIN links l2 ON l2.child_id = message.id
  WHERE l1.parent_id = 5
    AND l2.parent_id = 7;

This works fine.  For completeness, I'll point out an alternate solution.

Your original query, before you added DISTINCT, produced two rows for each message you wanted, and 1 row for each message that had one, but not both, of the desired parents. That is, number of rows per message equals number of matching criteria. We can use this difference to select only the rows you want:

  SELECT message.username, message.content
  FROM message
  JOIN links ON links.child_id = message.id
  WHERE links.parent_id IN (5, 7)
  GROUP BY message.id
  HAVING COUNT(*) = 2;

This replaces a JOIN with a GROUP BY, which may be faster. You might want to test both ways to see which works better for your data.

If you will ever need messages with more than 2 specified parents, I think you'll find the second method works better. The self-join method requires an additional JOIN and an additional WHERE condition for each requirement. For example, here's the self join for 3 criteria:

  SELECT message.username, message.content
  FROM message
  JOIN links l1 ON l1.child_id = message.id
  JOIN links l2 ON l2.child_id = message.id
  JOIN links l3 ON l3.child_id = message.id
  WHERE l1.parent_id = 5
    AND l2.parent_id = 7
    AND l3.parent_id = 8;

In contrast, the GROUP BY solution changes very little:

  SELECT message.username, message.content
  FROM message
  JOIN links ON links.child_id = message.id
  WHERE links.parent_id IN (5, 7, 8)
  GROUP BY message.id
  HAVING COUNT(*) = 3;

The extra criteria are added to the IN list, and the HAVING clause is changed to look for COUNT(*) = number_of_criteria. In this case, notice that we have replaced two JOINs with one GROUP BY.

(Note: For the GROUP BY versions, I am assuming there is a unique value of message.username and message.content for each message.id, and I'm using a mysql extension <http://dev.mysql.com/doc/mysql/en/group-by-hidden-fields.html>. If the assumption is wrong, or you want portability, change the GROUP BY clause to "GROUP BY message.username, message.content".)

Michael

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to