Re: select where multiple joined records match
Such bounty of comments! Thank you, Michael Stassen. I see how my logic was faulty, and that a more correct solution would indeed be faster. Thanks for pointing that out and not making me feel like too much of an idiot :) I'll try the revised solution. I am clueful about "join" making more rows/record, but didn't realize that it would be *that* dramatic. As for the NumericGrade field, I'm basically getting the text grade (e.g. 8th) from someone's MS Access CSV export; I figured I'd leave it in place, and use the numeric grade for sorting. I'd never run into speed issues before, so I was just trying to save myself coding time by sticking with what I had already. I wouldn't need the special lookup tables for grade and subject; if I were to use numeric fields, I could just do a lookup in Perl. Next time, when I try to do this better from the start, I probably will. At the moment, though, I'm trying to avoid changing the Perl code as much as I can (it's much recycled from an earlier project). Will look into indexing - that's probably covered in my old O'Reilly MySQL/mSQL book. I am curious about how much faster numeric field comparisons would be to string field comparisons for the Grade field; Would it make enough of a difference to this problem for me to go mucking with this Perl code? This is a CGI Web app, so it's not lightning fast anyway, but then there are a lot of comparisons going on. Currently it looks like it will have about 300 resource records and about six thousand resource_goal records; I should test this myself... I do have a lot of multi-subject and multi-grade resources; my testing was just not very good, I think. Oh, and I think I see the error of my ways with regard to my TINYTEXT fields. Probably would do well to shrink those. Easy to change, too. Thanks! You've given excellent explanations here. I feel like I should buy your book now, if you have one! I'm wrestling with CSS issues on IE 4.0 for the Mac at the moment, but will return to SQL issues soon, I hope. More later probably, AM On Mon, 21 Feb 2005 14:30:59 -0500, Michael Stassen <[EMAIL PROTECTED]> wrote: As I understand it, you are looking for a resource for 4th grade Social Studies which meets goals 1 through 4. In terms of your tables, that corresponds to having 4 rows in resources_goals, *all* of which have grade=4th and Subject='Social_Studies'. That is, we need to look in *4* copies of resources_goals (not 5). For a given resource id, we want exactly one row from each copy, namely, the row with the correct resource id, correct subject, correct grade, and desired goal number. I think this should do: SELECT r.id FROM resources as r JOIN resource_goals as g1 ON r.id = g1.ResourceID AND g1.Subject = 'Social_Studies' AND g1.Grade = '4th' AND g1.GoalNumber = 1 JOIN resource_goals as g2 ON r.id = g2.ResourceID AND g2.Subject = 'Social_Studies' AND g2.Grade = '4th' AND g2.GoalNumber = 2 JOIN resource_goals as g3 ON r.id = g3.ResourceID AND g3.Subject = 'Social_Studies' AND g3.Grade = '4th' AND g3.GoalNumber = 3 JOIN resource_goals as g4 ON r.id = g4.ResourceID AND g4.Subject = 'Social_Studies' AND g4.Grade = '4th' AND g4.GoalNumber = 4; -- Virtue of the Small / (919) 929-8687 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: select where multiple joined records match
Hi there, I'll answer your questions below, but I should tell you that it looks like even three or four "joins" seem to make this too slow. Theoretically, the user could select up to 15 or so GoalNumber values, 12 for Grade, and about 20 possibilities for Subject - clearly it wouldn't be useful to specify that many items in a search, but that's the theoretical maximum; 4 or 5 values isn't unreasonable, though. Four "AND"ed goal numbers plus a subject and a grade slowed the search (on the shared commercial web host I'm using) into the 3 minute range, and that's with a regular join, not a left join. This is the SELECT that took about 3 minutes (3 trials, simplifying slightly each time, simplest given here): select r.id from resources as r join resource_goals as g0 on (r.id=g0.ResourceID) join resource_goals as g1 on (r.id=g1.ResourceID and g1.GoalNumber=1) join resource_goals as g2 on (r.id=g2.ResourceID and g2.GoalNumber=2) join resource_goals as g3 on (r.id=g3.ResourceID and g3.GoalNumber=3) join resource_goals as g4 on (r.id=g4.ResourceID and g4.GoalNumber=4) where ((g0.Subject='Social_Studies') and (g0.Grade='4th')) group by r.id; The fastest time was 2 min 48 sec. Last time (simplest query) was 3 min 2 sec. I'm really running out of time on this project, so I just went ahead and made the user interface such that users can only select one subject, grade, and/or goal number at a time. It's probably a sound decision from a usability perspective, so I'm not too sad. If I decide to make this work in the future, I'd probably have to just do a SELECT for each "AND"ed field, get the list of resource id's for each SELECT, then find the intersection of the lists in Perl. If I could speed this up with some kind of indexing, I'd love to know about it. The GROUP BY phrase is because I wanted just one row per resource. It seemed like I'd get a row for each condition/resource (didn't test it with the final ). I'm actually doing SELECT * FROM... in my code, and not using the resource_goals information in my output (that's a separate view at present, generated by different Perl code). It seems to work fine without the NOT NULL parts, you're right. I was wondering about that, but was sleepy enough at the time that I didn't trust my thinking. Sorry about not including my table defs :-( . I guess I was just hoping for a general approach, and didn't realize that anyone would be interested enough to read all that detail and provide and exact solution for me. Of course, now I realize that it would have simplified our discussion. Anyway, late but not never, and for help to whoever finds this in the list archives someday, here are my table defs (you're right - the goal number is a TINYINT): (Below is an abridged version of the resources table ; it also contains about 60 more TINYINT fields which are essentially used as booleans, some of which I hope to eliminate. Yes, I could have used SET or something, but I didn't for various reasons.) CREATE TABLE resources ( id INT UNSIGNED PRIMARY KEY, Title TEXT, ResourceType_THJHArticle TINYINT, ResourceType_NIEArticle TINYINT, DataEntryName TINYTEXT, Date DATETIME, Notes TEXT, Made_Keywords TEXT); CREATE TABLE resource_goals ( goal_id INT UNSIGNED PRIMARY KEY, ResourceID INT, Grade TINYTEXT, Subject TINYTEXT, GoalNumber TINYINT, NumericGrade TINYINT); Thanks a bunch for your help; I'm finding this more interesting than I thought I would. On Wed, 16 Feb 2005 11:08:20 -0500, Michael Stassen <[EMAIL PROTECTED]> wrote: AM Thomas wrote: Guarded exclamations of success! This seems to be working for me so far - thank you! Here's an actual example of a working query on my database; the field names are a little different (sorry, I was trying to make my earlier example more readable). The main change, though, is that I did plain 'join' instead of 'left join', which seems to make it much faster. It was pretty slow at first. Yes, LEFT JOIN does extra work, and it wasn't needed here. mysql> select r.id, ga.Grade, ga.GoalNumber, gb.GoalNumber from resources as r join resource_goals as ga on r.id=ga.ResourceID and ga.Grade='4th' and ga.GoalNumber='1' join resource_goals as gb on r.id=gb.ResourceID and gb.Grade='4th' and gb.GoalNumber='2' where ga.goal_id IS NOT NULL and gb.goal_id IS NOT NULL group by r.id; There are some strange things here, I think. * You've never shown us your table definitions, but I would have expected GoalNumber to be an integer, not a string. If so, you shouldn't quote the numbers you compare it to. * I think it unlikely
Re: select where multiple joined records match
Guarded exclamations of success! This seems to be working for me so far - thank you! Here's an actual example of a working query on my database; the field names are a little different (sorry, I was trying to make my earlier example more readable). The main change, though, is that I did plain 'join' instead of 'left join', which seems to make it much faster. It was pretty slow at first. mysql> select r.id, ga.Grade, ga.GoalNumber, gb.GoalNumber from resources as r join resource_goals as ga on r.id=ga.ResourceID and ga.Grade='4th' and ga.GoalNumber='1' join resource_goals as gb on r.id=gb.ResourceID and gb.Grade='4th' and gb.GoalNumber='2' where ga.goal_id IS NOT NULL and gb.goal_id IS NOT NULL group by r.id; - AM On Mon, 14 Feb 2005 11:07:48 -0800, Jeremy Cole <[EMAIL PROTECTED]> wrote: Hi, Thanks, but unless I'm missing something, that will return the same as SUBJECT="English" and (GRADE="1" or GRADE="2"), which is resource records 1,2,3, and 4 - too many matches. Am I missing something? How about this: SELECT resources.id, resources.title FROM resources LEFT JOIN goals AS goal_a ON resources.id=goal_a.resource_id AND goal_a.subject="English" AND goal_a.grade=1 LEFT JOIN goals AS goal_b ON resources.id=goal_b.resource_id AND goal_b.subject="English" AND goal_b.grade=2 WHERE goal_a.id IS NOT NULL AND goal_b.id IS NOT NULL Alternately: SELECT resources.id, resources.title, COUNT(*) as nr FROM resources LEFT JOIN goals ON resources.id=goals.resource_id AND goals.subject="English" WHERE goals.grade IN (1, 2) GROUP BY resources.id HAVING nr = 2 (The above is untested, since you didn't provide your example table in SQL, and I am unwilling to re-format it into CREATE TABLE/INSERT statemnts to test things. The concepts are solid.) Regards, Jeremy -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: select where multiple joined records match
Thanks, but unless I'm missing something, that will return the same as SUBJECT="English" and (GRADE="1" or GRADE="2"), which is resource records 1,2,3, and 4 - too many matches. Am I missing something? - AM On Mon, 14 Feb 2005 12:30:44 -0600, Gordon <[EMAIL PROTECTED]> wrote: Try this Select * from resources, goals where resources.ID = goals.RESOURCE_ID and (SUBJECT="English" and GRADE="1") OR (SUBJECT="English" and GRADE="2"); -Original Message- From: AM Thomas [mailto:[EMAIL PROTECTED] Sent: Sunday, February 13, 2005 7:23 AM To: mysql@lists.mysql.com Subject: select where multiple joined records match I'm trying to figure out how to select all the records in one table which have multiple specified records in a second table. My MySQL is version 4.0.23a, if that makes a difference. Here's a simplified version of my problem. I have two tables, resources and goals. resources table: ID TITLE 1 civil war women 2 bunnies on the plain 3 North Carolina and WWII 4 geodesic domes goals table: ID RESOURCE_ID GRADE SUBJECT 1 11 English 2 11 Soc 3 12 English 4 21 English 5 23 Soc 6 32 English 7 41 English Now, how do I select all the resources which have 1st and 2nd grade English goals? If I just do: Select * from resources, goals where ((resources.ID = goals.RESOURCE_ID) and (SUBJECT="English") and ((GRADE="1") and (GRADE="2"))); I'll get no results, since no record of the joined set will have more than one grade. I can't just put 'or' between the Grade conditions; that would give resources 1, 2, 3, and 4, when only 1 really should match. My real problem is slightly more complex, as the 'goals' table also contains an additional field which might be searched on. I'm thinking it's time for me to go into the deep end of SQL (MySQL, actually), and my old O'Reilly MySQL & mSQL book isn't doing the trick. Surely this has come up before - thanks for any guidance. - AM Thomas -- Virtue of the Small / (919) 929-8687 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
more complexity (was: select where multiple joined records match)
Thank you kindly for your gracious help. I am making much progress (the 'as r' and 'as g' syntax is helpful; I'd forgotten about it since I'm doing most of this through Perl; thanks). I played with the COUNT solution for a while and was initially encouraged, nay, ecstatic. However, I was getting weird results and realized that I had left out a wrinkle in my example (and my thinking): the extra field in my goals table means that the other values can, as a group, be repeated for several rows. This seems to keep this solution from working. Mea culpa; my example wasn't adequate. There might be multiple records with a given subject and grade combination, due to the additional field (called "GoalNumber") in the goals table. The combination of ResourceID, Subject, Grade, and GoalNumber will be unique, but the combination of ResourceID, Subject, and Grade need not be. Here's a revision of my example. resources table: ID TITLE 1 Got Your Nose 2 Goats and Waterfowl: A Promising Alliance 3 North Carolina and WWIIb 4 Geodesic Domes - Ivy Revolution goals table: (I know all the numbers make it hard to read, sorry; I'll try to improve readability by putting a blank line before a new RESOURCE_ID). ID RESOURCE_ID GRADE GOALNUMBER SUBJECT 1 11 1 English 2 11 2 English 3 11 3 English 4 11 1 Soc 5 12 5 English 6 12 6 English 7 12 1 English 8 12 2 English 9 21 1 English 10 23 1 Soc 11 32 1 English 12 32 7 English 13 32 9 English 14 41 1 English Now, if I understand how this is working: SELECT r.TITLE FROM resources r JOIN goals g ON (r.ID=g.RESOURCE_ID) WHERE g.SUBJECT = 'English' AND (g.GRADE = 1 OR g.GRADE = 2) GROUP BY r.ID HAVING COUNT(*) = 2; will give an incorrect result, because the number of rows returned for each matching ID will be unpredictable. It could be 7 rows for ID = 1 (which is a correct match), or 3 rows for ID = 3 (which shouldn't match since it only has grade 2). I wish the EXISTS solution offered by Mr. Brawley would work, but thanks to Mr. Stassen for the clue about MySQL version. I tried it anyway at the command line but, of course, it didn't work. I also found a reference to an INTERSECTION keyword and experimented with it briefly; I couldn't find a document that listed when certain features came into MySQL, so I don't know if INTERSECTION is completely out of the picture, though UNION seems to have arrived after 4.0. Is there such a document? Or do I have to look at the change log for each version? ( Running "mysql -V" actually gives me: Ver 12.22 Distrib 4.0.23a ) - AM Thomas -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
select where multiple joined records match
I'm trying to figure out how to select all the records in one table which have multiple specified records in a second table. My MySQL is version 4.0.23a, if that makes a difference. Here's a simplified version of my problem. I have two tables, resources and goals. resources table: ID TITLE 1 civil war women 2 bunnies on the plain 3 North Carolina and WWII 4 geodesic domes goals table: ID RESOURCE_ID GRADE SUBJECT 1 11 English 2 11 Soc 3 12 English 4 21 English 5 23 Soc 6 32 English 7 41 English Now, how do I select all the resources which have 1st and 2nd grade English goals? If I just do: Select * from resources, goals where ((resources.ID = goals.RESOURCE_ID) and (SUBJECT="English") and ((GRADE="1") and (GRADE="2"))); I'll get no results, since no record of the joined set will have more than one grade. I can't just put 'or' between the Grade conditions; that would give resources 1, 2, 3, and 4, when only 1 really should match. My real problem is slightly more complex, as the 'goals' table also contains an additional field which might be searched on. I'm thinking it's time for me to go into the deep end of SQL (MySQL, actually), and my old O'Reilly MySQL & mSQL book isn't doing the trick. Surely this has come up before - thanks for any guidance. - AM Thomas -- Virtue of the Small / (919) 929-8687 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]