Ken,

   The list is locked at the moment, so this reply may take awhile to reach 
you, so sorry if this question has been answered already (or I'm too late).

   Anyway, here's how I'd format it (assuming I read the question right, that 
you want distinct institutions that have listing from July 1st 2011 on and have 
never been listed before then):

   SELECT distinct institution FROM renewals WHERE date >= '2011-07-01' AND 
institution NOT IN (SELECT distinct institution FROM renewals WHERE date < 
'2011-07-01');


David K. Uspal
Technology Development Specialist
Falvey Memorial Library
Phone: 610-519-8954
Email: [email protected]





-----Original Message-----
From: Code for Libraries [mailto:[email protected]] On Behalf Of Ken 
Irwin
Sent: Wednesday, September 28, 2011 12:41 PM
To: [email protected]
Subject: [CODE4LIB] mysql subquery response time

Hi all,

I've not done much with MySQL subqueries, and I'm trying right now with what I 
find to be surprising results. I wonder if someone can help me understand.

I have a pile of data that with columns for "institution" and "date". 
Institution gets repeated a lot, with many different dates. I want to select 
all the institutions that *only* have dates after July 1 and don't appear in 
the table before that. My solution was to do a first query for all the 
institutions that DO have dates before July 1
SELECT distinct institution FROM `renewals` WHERE snap_date < '2011-07-01'

And then to do a SELECT query on all the institutions:
SELECT distinct institution from renewals

And then try to do a NOT IN subquery subtracting the smaller query from the 
larger one:

SELECT distinct institution from renewals
WHERE institution not in
(SELECT distinct institution FROM `renewals` WHERE snap_date < '2011-07-01')

...only it doesn't seem to work. Or rather, the query has been running for 
several minutes and never comes back with an answer. Each of these two queries 
takes just a few milliseconds to run on its own.

Can someone tell me (a) am I just formatting the query wrong, (b) do subqueries 
like this just take forever, and/or (c) is there a better way to do this? (I 
don't really understand about JOIN queries, but from what I can tell they are 
only for mixing the results of two different tables so I think they might not 
apply here.)

Any advice would be most welcome.

Thanks
Ken

Reply via email to