Re: select where multiple joined records match

2005-02-21 Thread AM Thomas
Such bounty of comments!  Thank you, Michael Stassen.
I see how my logic was faulty, and that a more correct solution would  
indeed be faster.  Thanks for pointing that out and not making me feel  
like too much of an idiot :)   I'll try the revised solution.  I am  
clueful about "join" making more rows/record, but didn't realize that it  
would be *that* dramatic.

As for the NumericGrade field, I'm basically getting the text grade (e.g.  
8th) from someone's MS Access CSV export; I figured I'd leave it in place,  
and use the numeric grade for sorting.  I'd never run into speed issues  
before, so I was just trying to save myself coding time by sticking with  
what I had already.  I wouldn't need the special lookup tables for grade  
and subject; if I were to use numeric fields, I could just do a lookup in  
Perl.  Next time, when I try to do this better from the start, I probably  
will.  At the moment, though, I'm trying to avoid changing the Perl code  
as much as I can (it's much recycled from an earlier project).

Will look into indexing - that's probably covered in my old O'Reilly  
MySQL/mSQL book.

I am curious about how much faster numeric field comparisons would be to  
string field comparisons for the Grade field; Would it make enough of a  
difference to this problem for me to go mucking with this Perl code?  This  
is a CGI Web app, so it's not lightning fast anyway, but then there are a  
lot of comparisons going on.  Currently it looks like it will have about  
300 resource records and about six thousand resource_goal records; I  
should test this myself...

I do have a lot of multi-subject and multi-grade resources; my testing was  
just not very good, I think.

Oh, and I think I see the error of my ways with regard to my TINYTEXT  
fields.  Probably would do well to shrink those.  Easy to change, too.

Thanks!  You've given excellent explanations here.  I feel like I should  
buy your book now, if you have one!

I'm wrestling with CSS issues on IE 4.0 for the Mac at the moment, but  
will return to SQL issues soon, I hope.

More later probably,
AM

On Mon, 21 Feb 2005 14:30:59 -0500, Michael Stassen  
<[EMAIL PROTECTED]> wrote:

As I understand it, you are looking for a resource for 4th grade Social  
Studies which meets goals 1 through 4.  In terms of your tables, that  
corresponds to having 4 rows in resources_goals, *all* of which have  
grade=4th and Subject='Social_Studies'.  That is, we need to look in *4*  
copies of resources_goals (not 5).  For a given resource id, we want  
exactly one row from each copy, namely, the row with the correct  
resource id, correct subject, correct grade, and desired goal number.  I  
think this should do:

   SELECT r.id
   FROM resources as r
   JOIN resource_goals as g1
 ON  r.id = g1.ResourceID
 AND g1.Subject = 'Social_Studies'
 AND g1.Grade = '4th'
 AND g1.GoalNumber = 1
   JOIN resource_goals as g2
 ON  r.id = g2.ResourceID
 AND g2.Subject = 'Social_Studies'
 AND g2.Grade = '4th'
 AND g2.GoalNumber = 2
   JOIN resource_goals as g3
 ON  r.id = g3.ResourceID
 AND g3.Subject = 'Social_Studies'
 AND g3.Grade = '4th'
 AND g3.GoalNumber = 3
   JOIN resource_goals as g4
 ON  r.id = g4.ResourceID
 AND g4.Subject = 'Social_Studies'
 AND g4.Grade = '4th'
 AND g4.GoalNumber = 4;

--
Virtue of the Small / (919) 929-8687
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: select where multiple joined records match

2005-02-16 Thread AM Thomas
Hi there,
I'll answer your questions below, but I should tell you that it looks like  
even three or four "joins" seem to make this too slow.  Theoretically, the  
user could select up to 15 or so GoalNumber values, 12 for Grade, and  
about 20 possibilities for Subject - clearly it wouldn't be useful to  
specify that many items in a search, but that's the theoretical maximum; 4  
or 5 values isn't unreasonable, though.

Four "AND"ed goal numbers plus a subject and a grade slowed the search (on  
the shared commercial web host I'm using) into the 3 minute range, and  
that's with a regular join, not a left join.  This is the SELECT that took  
about 3 minutes (3 trials, simplifying slightly each time, simplest given  
here):

select r.id from resources as r
  join resource_goals as g0 on (r.id=g0.ResourceID)
  join resource_goals as g1 on (r.id=g1.ResourceID and g1.GoalNumber=1)
  join resource_goals as g2 on (r.id=g2.ResourceID and g2.GoalNumber=2)
  join resource_goals as g3 on (r.id=g3.ResourceID and g3.GoalNumber=3)
  join resource_goals as g4 on (r.id=g4.ResourceID and g4.GoalNumber=4)
where ((g0.Subject='Social_Studies')
  and (g0.Grade='4th'))
group by r.id;
The fastest time was 2 min 48 sec.
Last time (simplest query) was 3 min 2 sec.
I'm really running out of time on this project, so I just went ahead and  
made the user interface such that users can only select one subject,  
grade, and/or goal number at a time.  It's probably a sound decision from  
a usability perspective, so I'm not too sad.  If I decide to make this  
work in the future, I'd probably have to just do a SELECT for each "AND"ed  
field, get the list of resource id's for each SELECT, then find the  
intersection of the lists in Perl.

If I could speed this up with some kind of indexing, I'd love to know  
about it.

The GROUP BY phrase is because I wanted just one row per resource.  It  
seemed like I'd get a row for each condition/resource (didn't test it with  
the final ).  I'm actually doing SELECT * FROM... in my code, and not  
using the resource_goals information in my output (that's a separate view  
at present, generated by different Perl code).

It seems to work fine without the NOT NULL parts, you're right.  I was  
wondering about that, but was sleepy enough at the time that I didn't  
trust my thinking.

Sorry about not including my table defs :-(  .  I guess I was just hoping  
for a general approach, and didn't realize that anyone would be interested  
enough to read all that detail and provide and exact solution for me.  Of  
course, now I realize that it would have simplified our discussion.   
Anyway, late but not never, and for help to whoever finds this in the list  
archives someday, here are my table defs (you're right - the goal number  
is a TINYINT):

(Below is an abridged version of the resources table ; it also contains  
about 60 more TINYINT fields which are essentially used as booleans, some  
of which I hope to eliminate.  Yes, I could have used SET or something,  
but I didn't for various reasons.)

CREATE TABLE resources (
  id INT UNSIGNED PRIMARY KEY,
  Title TEXT,
  ResourceType_THJHArticle TINYINT,
  ResourceType_NIEArticle TINYINT,
  DataEntryName TINYTEXT,
  Date DATETIME,
  Notes TEXT,
  Made_Keywords TEXT);
CREATE TABLE  resource_goals (
  goal_id INT UNSIGNED PRIMARY KEY,
  ResourceID INT,
  Grade TINYTEXT,
  Subject TINYTEXT,
  GoalNumber TINYINT,
  NumericGrade TINYINT);
Thanks a bunch for your help; I'm finding this more interesting than I  
thought I would.


On Wed, 16 Feb 2005 11:08:20 -0500, Michael Stassen  
<[EMAIL PROTECTED]> wrote:

AM Thomas wrote:
Guarded exclamations of success!
 This seems to be working for me so far - thank you!  Here's an actual   
example of a working query on my database; the field names are a  
little  different (sorry, I was trying to make my earlier example more  
readable).   The main change, though, is that I did plain 'join'  
instead of 'left  join', which seems to make it much faster.  It was  
pretty slow at first.
Yes, LEFT JOIN does extra work, and it wasn't needed here.
mysql> select r.id, ga.Grade, ga.GoalNumber, gb.GoalNumber from  
resources  as r
   join resource_goals as ga on r.id=ga.ResourceID
   and ga.Grade='4th' and ga.GoalNumber='1'
   join resource_goals as gb on r.id=gb.ResourceID
   and gb.Grade='4th' and gb.GoalNumber='2'
   where ga.goal_id IS NOT NULL
   and gb.goal_id IS NOT NULL
   group by r.id;
There are some strange things here, I think.
* You've never shown us your table definitions, but I would have  
expected GoalNumber to be an integer, not a string.  If so, you  
shouldn't quote the numbers you compare it to.

* I think it unlikely 

Re: select where multiple joined records match

2005-02-14 Thread AM Thomas
Guarded exclamations of success!
This seems to be working for me so far - thank you!  Here's an actual  
example of a working query on my database; the field names are a little  
different (sorry, I was trying to make my earlier example more readable).   
The main change, though, is that I did plain 'join' instead of 'left  
join', which seems to make it much faster.  It was pretty slow at first.

mysql> select r.id, ga.Grade, ga.GoalNumber, gb.GoalNumber from resources  
as r
   join resource_goals as ga on r.id=ga.ResourceID
   and ga.Grade='4th' and ga.GoalNumber='1'
   join resource_goals as gb on r.id=gb.ResourceID
   and gb.Grade='4th' and gb.GoalNumber='2'
   where ga.goal_id IS NOT NULL
   and gb.goal_id IS NOT NULL
   group by r.id;

- AM

On Mon, 14 Feb 2005 11:07:48 -0800, Jeremy Cole <[EMAIL PROTECTED]>  
wrote:

Hi,
Thanks, but unless I'm missing something, that will return the same as   
SUBJECT="English" and (GRADE="1" or GRADE="2"), which is resource  
records  1,2,3, and 4 - too many matches.  Am I missing something?
How about this:
SELECT resources.id, resources.title
FROM resources
LEFT JOIN goals AS goal_a
   ON resources.id=goal_a.resource_id
   AND goal_a.subject="English" AND goal_a.grade=1
LEFT JOIN goals AS goal_b
   ON resources.id=goal_b.resource_id
   AND goal_b.subject="English" AND goal_b.grade=2
WHERE goal_a.id IS NOT NULL
   AND goal_b.id IS NOT NULL
Alternately:
SELECT resources.id, resources.title, COUNT(*) as nr
FROM resources
LEFT JOIN goals
   ON resources.id=goals.resource_id
   AND goals.subject="English"
WHERE goals.grade IN (1, 2)
GROUP BY resources.id
HAVING nr = 2
(The above is untested, since you didn't provide your example table in  
SQL, and I am unwilling to re-format it into CREATE TABLE/INSERT  
statemnts to test things.  The concepts are solid.)

Regards,
Jeremy
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


Re: select where multiple joined records match

2005-02-14 Thread AM Thomas
Thanks, but unless I'm missing something, that will return the same as  
SUBJECT="English" and (GRADE="1" or GRADE="2"), which is resource records  
1,2,3, and 4 - too many matches.  Am I missing something?

- AM

On Mon, 14 Feb 2005 12:30:44 -0600, Gordon <[EMAIL PROTECTED]> wrote:
Try this
Select *
from   resources, goals
where  resources.ID = goals.RESOURCE_ID
   and (SUBJECT="English"
and GRADE="1")
OR
   (SUBJECT="English"
and GRADE="2");
-Original Message-
From: AM Thomas [mailto:[EMAIL PROTECTED]
Sent: Sunday, February 13, 2005 7:23 AM
To: mysql@lists.mysql.com
Subject: select where multiple joined records match
I'm trying to figure out how to select all the records in one table
which have multiple specified records in a second table.  My MySQL is
version 4.0.23a, if that makes a difference.
Here's a simplified version of my problem.
I have two tables, resources and goals.
resources table:
ID  TITLE
1   civil war women
2   bunnies on the plain
3   North Carolina and WWII
4   geodesic domes
goals table:
ID RESOURCE_ID  GRADE  SUBJECT
1  11  English
2  11  Soc
3  12  English
4  21  English
5  23  Soc
6  32  English
7  41  English
Now, how do I select all the resources which have 1st and 2nd grade
English goals?  If I just do:
Select * from resources, goals where ((resources.ID =
goals.RESOURCE_ID) and (SUBJECT="English") and ((GRADE="1") and
(GRADE="2")));
I'll get no results, since no record of the joined set will have more
than one grade.  I can't just put 'or' between the Grade
conditions; that would give resources 1, 2, 3, and 4, when only 1
really should match.
My real problem is slightly more complex, as the 'goals' table also
contains an additional field which might be searched on.
I'm thinking it's time for me to go into the deep end of SQL (MySQL,
actually), and my old O'Reilly MySQL & mSQL book isn't doing the
trick.
Surely this has come up before - thanks for any guidance.
- AM Thomas

--
Virtue of the Small / (919) 929-8687
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


more complexity (was: select where multiple joined records match)

2005-02-13 Thread AM Thomas
Thank you kindly for your gracious help.
I am making much progress (the 'as r' and 'as g' syntax is helpful; I'd  
forgotten about it since I'm doing most of this through Perl; thanks).  I  
played with the COUNT solution for a while and was initially encouraged,  
nay, ecstatic.

However, I was getting weird results and realized that I had left out
a wrinkle in my example (and my thinking): the extra field in my
goals table means that the other values can, as a group, be repeated
for several rows.  This seems to keep this solution from working.  Mea
culpa; my example wasn't adequate.
There might be multiple records with a given subject and grade
combination, due to the additional field (called "GoalNumber") in the
goals table.  The combination of ResourceID, Subject, Grade, and
GoalNumber will be unique, but the combination of ResourceID,
Subject, and Grade need not be.
Here's a revision of my example.
resources table:
ID  TITLE
1   Got Your Nose
2   Goats and Waterfowl: A Promising Alliance
3   North Carolina and WWIIb
4   Geodesic Domes - Ivy Revolution
goals table:
(I know all the numbers make it hard to read, sorry; I'll try to
improve readability by putting a blank line before a new RESOURCE_ID).
ID RESOURCE_ID  GRADE  GOALNUMBER  SUBJECT
 1  11  1   English
 2  11  2   English
 3  11  3   English
 4  11  1   Soc
 5  12  5   English
 6  12  6   English
 7  12  1   English
 8  12  2   English
 9  21  1   English
10  23  1   Soc
11  32  1   English
12  32  7   English
13  32  9   English
14  41  1   English
Now, if I understand how this is working:
SELECT r.TITLE
   FROM resources r JOIN goals g ON (r.ID=g.RESOURCE_ID)
   WHERE g.SUBJECT = 'English'
 AND (g.GRADE = 1 OR g.GRADE = 2)
   GROUP BY r.ID
   HAVING COUNT(*) = 2;
will give an incorrect result, because the number of rows returned for
each matching ID will be unpredictable.  It could be 7 rows for ID =
1 (which is a correct match), or 3 rows for ID = 3 (which shouldn't
match since it only has grade 2).
I wish the EXISTS solution offered by Mr. Brawley would work, but
thanks to Mr. Stassen for the clue about MySQL version.  I tried it
anyway at the command line but, of course, it didn't work.
I also found a reference to an INTERSECTION keyword and experimented
with it briefly; I couldn't find a document that listed when certain
features came into MySQL, so I don't know if INTERSECTION is
completely out of the picture, though UNION seems to have arrived
after 4.0.  Is there such a document?  Or do I have to look at the
change log for each version?
( Running "mysql -V" actually gives me: Ver 12.22 Distrib 4.0.23a )
- AM Thomas
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]


select where multiple joined records match

2005-02-13 Thread AM Thomas
I'm trying to figure out how to select all the records in one table
which have multiple specified records in a second table.  My MySQL is  
version 4.0.23a, if that makes a difference.

Here's a simplified version of my problem.
I have two tables, resources and goals.
resources table:
ID  TITLE
1   civil war women
2   bunnies on the plain
3   North Carolina and WWII
4   geodesic domes
goals table:
ID RESOURCE_ID  GRADE  SUBJECT
1  11  English
2  11  Soc
3  12  English
4  21  English
5  23  Soc
6  32  English
7  41  English
Now, how do I select all the resources which have 1st and 2nd grade
English goals?  If I just do:
   Select * from resources, goals where ((resources.ID =
   goals.RESOURCE_ID) and (SUBJECT="English") and ((GRADE="1") and
   (GRADE="2")));
I'll get no results, since no record of the joined set will have more
than one grade.  I can't just put 'or' between the Grade
conditions; that would give resources 1, 2, 3, and 4, when only 1
really should match.
My real problem is slightly more complex, as the 'goals' table also
contains an additional field which might be searched on.
I'm thinking it's time for me to go into the deep end of SQL (MySQL,
actually), and my old O'Reilly MySQL & mSQL book isn't doing the
trick.
Surely this has come up before - thanks for any guidance.
- AM Thomas
--
Virtue of the Small / (919) 929-8687
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]