Shawn,

I've just found out that most of my emails during this discussion were NOT posted to the list because I was sending mail in HTML format and Yahoo was not delivering the bounce notices to me. That explains why you believed I was not listening to Peter's input - only _his_ messages were appearing on the list!

I'm adding my last two emails to the end of this one so that it doesn't appear as though I have ignored everyone, and in the hope that the conclusions I reached (with help from both of you) will be of use to someone else.

My sincere apologies for the confusion that I inadvertently caused.

Best regards,
Devananda

----------------

Actually, what was missing all along was "pab.login_id=1" in the ON clause, not the WHERE clause!!! I guess I've not used that sort of condition in a JOIN before... If I rewrite your query like this, which is much simpler, it does in fact work very, very well!

SELECT pt.offer_id
FROM paytable AS pt
INNER JOIN offers AS o USING (offer_id)
LEFT JOIN publisher_advertiser_blocks AS pab
ON o.advertiser_id=pab.advertiser_id AND pab.login_id=1
WHERE pt.login_id=1 AND pab.advertiser_id IS NULL;

That is just the answer I was hoping for from the start! Thanks :)

On a side note, I looked at the MySQL reference manual for JOINs ( http://dev.mysql.com/doc/refman/5.0/en/join.html ) again, and while the syntax of putting a row-limiting condition on the right table in a LEFT JOIN isn't discussed in the actual documentation (that's where I do most of my reading...) it IS mentioned in a post at the bottom of the page! Damn, I wish I had seen that last week ...

Thanks for the help Peter



------------

[EMAIL PROTECTED] wrote:
You are correct, that the situation you describe was not clearly presented in your previous reply. I humbly apologize for any offense.

Apology humbly accepted. I only took offense because I have been paying close attention to all the suggestions that Peter and you have been offering, trying to learn from them (as I do with most of your posts to the mailing list).

Using the EXPLAIN you posted in your latest reply, you can translate your subquery into the following JOINed query

SELECT p.offer_id
FROM paytable p
LEFT JOIN offers o
        ON o.advertiser_id = p.advertiser_id
WHERE pt.login_id=1
        AND o.offer_id is null;

Which is not what I think you were actually trying to write. The terms selecting values from publisher_advertiser_blocks disappeared because they are on the *right* side of a LEFT JOIN and played no part in actually limiting your final results. Here is the needs statement from your first post:
>>>>The goal of these queries is to select all offer_id's from `paytable`
for a known login_id where that login_id is not "blocked" from that
offer_id.<<<<

 I would write it this way

CREATE TEMPORARY TABLE tmpBlocks SELECT DISTINCT
        o.offer_id
FROM offers
INNER JOIN publisher_advertiser_blocks pab
        ON pab.advertiser_id = o.advertiser_id
        AND  pab.login_id = 1;

ALTER TABLE  tmpBlocks ADD KEY(offer_id);

SELECT p.offer_id
FROM paytable p
LEFT JOIN tmpBlocks tb
        ON tb.offer_id = p.offer_id
WHERE tb.offer_id IS NULL;

DROP TEMPORARY TABLE tmpBlocks;
With a slight change, what you suggested does work quite well. The final SELECT needs "AND p.login_id=1" in the WHERE clause, or else it will return data for all login_id that have records in paytable.

One trick to working efficiently with "larger" datasets (millions of rows per table) is to minimize the number of rows being joined at one time. By breaking this query into two statements we keep our JOIN combinations to a minimum so that at each successive stage we are working with smaller sets of data than if we had written it as a single statement.

Indeed! I often use temporary tables, since some of the datasets I work with are in the tens or hundreds of millions of rows.

The term "pab.login_id=1" is in the ON clause because your index on publisher_advertiser_block is defined in such a way that makes it better to be in the ON clause than in the WHERE clause (also information from your original post). There was a posting not long ago that said that the statistics of a temporary table's indexes were not updated if they exist before you fill the table with data.
That's very good to know! I will keep it in mind, and probably could get more performance out of queries I've written in the past by doing this.

Again, I apologize for any offense I may have caused,
No harm done, we all live and learn :)

Best regards, and thanks again,
Devananda

--
http://devananda-vdv.blogspot.com/
http://mycat.sourceforge.net/


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to