I'm not sure whether the following will meet your needs.  Have you 
considered

SELECT title FROM Title WHERE NOT EXISTS (SELECT * FROM Keyword, 
TitleKeyword WHERE Keyword.kw='A' AND Keyword.id=TitleKeyword.keyword_id 
AND TitleKeyword.title_id=Title.id)

Regards,
Mike Spreitzer
SMTP: mspre...@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr



James Fryer <j...@invocrown.com> 
09/24/09 06:42 AM

To
mysql@lists.mysql.com
cc

Subject
Excluding records that don't match condition






I have a database of Titles (books, magazines, etc.) with a M:M relation 
to Keywords.

I would like to be able to generate queries for the condition "Return 
titles matching X with keywords NOT matching A".

This seems quite hard to do. Here is a minimal table structure:

CREATE TABLE Title
     (
     id int(10) unsigned NOT NULL,
     title CHAR(2)
     );
CREATE TABLE Keyword
     (
     id int(10) unsigned NOT NULL,
     kw CHAR(1)
     );
CREATE TABLE TitleKeyword
     (
     title_id int(10) unsigned NOT NULL,
     keyword_id int(10) unsigned NOT NULL
     );

# X1: A, B
# X2: B, C
# X3: C
# Y1: A, B
# Y2: B, C
# Y3: C
INSERT INTO Title (id, title) VALUES (1, 'X1'), (2, 'X2'), (3, 'X3'), 
(4, 'Y1'), (5, 'Y2'), (6, 'Y3');
INSERT INTO Keyword (id, kw) VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO TitleKeyword VALUES (1, 1), (1, 2), (2, 2), (2, 3), (3, 3), 
(4, 1), (4, 2), (5, 2), (5, 3), (6, 3);

Naively I tried this query:

SELECT DISTINCT
     title
FROM
     Title
JOIN
     TitleKeyword ON Title.id=title_id
JOIN
     Keyword ON Keyword.id=keyword_id
WHERE
     title LIKE 'X%'
     AND kw <> 'A'
;

but this includes X1 because it matches B as well as A. I only want X2, 
X3 returned.

This works:

SELECT
     title
FROM
     Title
WHERE title LIKE 'X%'
     AND Title.id NOT IN(
     SELECT
         title_id
     FROM
         TitleKeyword
     JOIN
         Keyword ON Keyword.id=keyword_id
     WHERE
         kw = 'A'
     )
;

However, this uses subselects which I have always found slow, and there 
may be many keywords (thousands), and I believe IN() is not recommended 
for large lists.

So my question is, can this query be rewritten to use JOINs? It seems 
hard to me because all the keywords need to be examined to eliminate the 
title. On the other hand it must be a common requirement so there may be 
something I have overlooked.

Many thanks in advance,

James
-- 
James Fryer  /  j...@invocrown.com  /  j...@cix.co.uk

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=mspre...@us.ibm.com




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to