[EMAIL PROTECTED] wrote:
Devananda,

I have to support Peter on this one. What he submitted to you is a perfectly appropriate solution. It seems as though you rejected his assistance before even trying to see if it would work.
On the contrary, I have tried his suggestions and they do not work (see the email I just sent to the list). I have also tried, but obviously not done very well, to explain why they will not work in this particular case. The queries Peter has suggested actually take many minutes to return when run on the real data, whereas my initial subquery takes only a few seconds. I had initially hoped to find a way, using a JOIN, that would take less than a second.


d1 mysql> explain SELECT DISTINCT pt.offer_id FROM paytable AS pt INNER JOIN offers AS o USING (offer_id) LEFT JOIN publisher_advertiser_blocks AS pab1 ON pt.login_id=pab1.login_id LEFT JOIN publisher_advertiser_blocks AS pab2 ON o.advertiser_id=pab2.advertiser_id WHERE pt.login_id=1 AND pab1.login_id IS NULL OR pab2.advertiser_id IS NULL;
+----+-------------+-------+--------+---------------+---------+---------+-----------------------+---------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------+---------+------------------------------------+
| 1 | SIMPLE | pt | index | PRIMARY | PRIMARY | 5 | NULL | 1814973 | Using index; Using temporary | | 1 | SIMPLE | o | eq_ref | PRIMARY | PRIMARY | 4 | affiliate.pt.offer_id | 1 | Using where; Distinct | | 1 | SIMPLE | pab1 | ref | PRIMARY | PRIMARY | 4 | affiliate.pt.login_id | 7 | Using index; Distinct | | 1 | SIMPLE | pab2 | index | NULL | PRIMARY | 8 | NULL | 62 | Using where; Using index; Distinct |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------+---------+------------------------------------+



d1 mysql> explain SELECT offer_id FROM paytable WHERE login_id=1 AND offer_id NOT IN ( SELECT offer_id FROM offers LEFT JOIN publisher_advertiser_blocks AS pab USING (advertiser_id) WHERE pab.login_id=1 );
+----+--------------------+----------+--------+-----------------------+---------+---------+--------------------------------------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------+--------+-----------------------+---------+---------+--------------------------------------+---------+--------------------------+
| 1 | PRIMARY | paytable | index | NULL | PRIMARY | 5 | NULL | 1814973 | Using where; Using index | | 2 | DEPENDENT SUBQUERY | offers | eq_ref | PRIMARY,advertiser_id | PRIMARY | 4 | func | 1 | Using where | | 2 | DEPENDENT SUBQUERY | pab | eq_ref | PRIMARY | PRIMARY | 8 | const,affiliate.offers.advertiser_id | 1 | Using where; Using index |
+----+--------------------+----------+--------+-----------------------+---------+---------+--------------------------------------+---------+--------------------------+
3 rows in set (0.03 sec)

There have been and continue to be SQL-driven databases around that have not had and do not have subquery support. They function perfectly well without them. By your response, it appears to me you don't know how a LEFT JOIN is supposed to operate. That is one of the most basic tools of any relational DBA's arsenal.
Perhaps I didn't phrase my response well, regarding 1to1 relationship, but I do understand how to use a LEFT JOIN. I freely admit I don't have as much experience with all the types of joins, or in other DBMS, as many of the folks on this list. Also, I _know_ that subqueries and IN lists are not nearly as efficient as JOINs. That is exactly _why_ I asked in the first place!!

:)

Regards,
Devananda


I implore everyone not comfortable with them to learn their "JOIN"s (INNER JOIN, LEFT JOIN, RIGHT JOIN, CROSS JOIN, etc.) before you learn how to subquery (and I do NOT mean that comma-separated crap that poses as a CROSS JOIN). So many of the performance problems posted to this list are subquery-related when a simple, straightforward JOIN or two will solve the same problem usually with _much_ better response time. This post is a perfect example.

Explicit JOIN statements have the advantage of possibly using indexes at every layer of data aggregation. Subqueries lose that ability at the level of each subquery. Some subqueries, depending on how they are written, are executed once PER ROW of your parent query data. If it takes .01 seconds to run a subquery by itself and you have 10000 rows of parent data, that would be 100 seconds of nothing but subquery execution time before you even get to the GROUP BY or ORDER BY processing part of the query. Rewriting the same query to use an explicit JOIN may turn the .01 seconds of the "naked" subquery into .02-.04 seconds because of the indexing advantage and because it computes a set of matches between the table only once.

Subqueries have their uses and are appropriate for many situations. However, IMHO, they should be your second-line of attack because of their inherent weaknesses. I believe this to be true for the majority of the RDBM systems in the world today.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine


--
Devananda vdv


http://devananda-vdv.blogspot.com/
http://mycat.sourceforge.net/


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to