Since writing this message, I have discovered another possible way to do this, using two successive SQL statements. It would look something like this:

1) CREATE TEMPORARY TABLE temp SELECT DISTINCT xy FROM table2 WHERE <all of table2 conditions>; (going by data below, maximum of 20k rows because it is DISTINCT)

2) SELECT AVG(column1),<all other group calculations> FROM table1 LEFT JOIN temp ON temp.xy=table1.xy WHERE <all of table1 conditions> AND temp.xy IS NOT NULL; (maximum of 20k rows, fewer after both WHERE clauses, though)
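Written out in full, the two steps might look like the sketch below. The condition and column names here are placeholders, since the real ones are elided above. One small observation: because the final query filters on temp.xy IS NOT NULL, the LEFT JOIN behaves exactly like an inner join, so an INNER JOIN states the intent directly:

```sql
-- Step 1: one row per distinct xy value from table2 that passes
-- table2's WHERE conditions (placeholder condition shown).
CREATE TEMPORARY TABLE temp
    SELECT DISTINCT xy
    FROM table2
    WHERE table2_condition = 1;   -- stand-in for the real table2 conditions

-- Step 2: aggregate over the table1 rows whose xy survived step 1.
-- The IS NOT NULL test in the original makes the LEFT JOIN act as an
-- inner join, so INNER JOIN is equivalent and a bit clearer:
SELECT AVG(column1)               -- plus the other group calculations
FROM table1
INNER JOIN temp ON temp.xy = table1.xy
WHERE table1_condition = 1;       -- stand-in for the real table1 conditions
```

Since temp holds at most one row per xy and table1.xy is unique, the join cannot duplicate table1 rows, which is the property the original LEFT JOIN attempt was missing.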

Does this sound like the best way to accomplish this? Any suggestions?
Thx,
Seth

On Sunday, March 2, 2003, at 03:41 PM, Seth Price wrote:

Hi list, I am sure there is an efficient way to do this in SQL; I just can't figure out what it is.

I am dealing with two tables. One, which I'll call table1, has about 90 columns and 20k rows; each number in column "xy" is unique. The other, table2, has about 90 columns and about 200k rows, with around 10 duplicate entries in "xy" for each value of "xy". So if I SELECT a given number in table1.xy, one row is returned; if I SELECT the same number in table2.xy, about 10 rows are returned. There is always at least one entry in table2.xy for each entry in table1.xy.

I am looking for the best way to join them so that each row in the result is a row of table1 lined up with one (and only one) row from table2. A SELECT of everything would return 20k rows in this case. The WHERE clause will place limitations on values in other columns of both table1 and table2. The best I can think of is a LEFT JOIN, but it returns a table where every row in table1 has been duplicated about 10 times (and I am hoping for it to return every row of table1 joined with only one row of table2 where table1.xy=table2.xy).

I am aware that I could simply GROUP BY xy, but I am also trying to use this with a few AVG(<some other column>) clauses to get the average of all rows returned; GROUP BY xy only returns the average within each group. I am also guessing that GROUP BY in this instance would be rather inefficient, because it would first have to build a large table (>200k rows) and then SELECT from it.
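To make the GROUP BY objection concrete, here is a toy sketch (hypothetical values; val stands in for "some other column"). Grouping gives one average per xy, not the single overall average the query is after:

```sql
-- Suppose table2 holds these (xy, val) pairs:
--   (1, 10), (1, 20), (2, 30)
SELECT xy, AVG(val)
FROM table2
GROUP BY xy;
-- Yields two rows, one average per group:
--   xy=1 -> 15.0, xy=2 -> 30.0
-- whereas the goal is a single figure computed over one
-- representative row per xy, not per-group averages.
```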

I am using static MyISAM tables and everything in the table is either a float or an int.

Is the best way for me to do this just to go ahead and do the GROUP BY, return a table of 20k rows, and then calculate the AVG (and COUNT, STD, MIN, and MAX) in my script? I was hoping to do this in a less time-consuming fashion.
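One way to keep the whole computation in SQL, assuming a server without subquery support (MySQL before 4.1), is a second temporary table: materialize the 20k collapsed rows, then aggregate them in a single pass. Column names here are placeholders, not from the original post:

```sql
-- Step 1: collapse table2 to one row per xy
-- (c is a hypothetical column being averaged per group).
CREATE TEMPORARY TABLE grouped
    SELECT xy, AVG(c) AS c
    FROM table2
    GROUP BY xy;

-- Step 2: overall statistics over the ~20k collapsed rows.
-- STD() is MySQL's name for standard deviation.
SELECT AVG(c), COUNT(*), STD(c), MIN(c), MAX(c)
FROM grouped;
```

This avoids pulling 20k rows back into the script just to average them, at the cost of materializing the grouped result once on the server.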

I can calculate all of this in a timely manner when I am only using values from table1.
~Seth



---------------------------------------------------------------------
Before posting, please check:
  http://www.mysql.com/manual.php   (the manual)
  http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php



