Re: select does too much work to find rows where primary key does not match

Joerg Bruehe Wed, 16 Apr 2008 04:19:58 -0700

Hi Patrick, all !


Patrick J. McEvoy wrote:

I have two MyISAM tables; each uses 'phone' as a primary key. Findingrows where the primary keys match is efficient:
mysql> explain select bar.phone from foo,bar where foo.phone=bar.phone;
[[...]}

Ok, let us take some simple example. Say tables "foo" and "bar" bothhave three rows each, with "phone" values 1, 2, and 3.

Then your matching lines will be:

  foo.phone   bar.phone
      1           1
      2           2
      3           3

Column foo.phone is shown for explanation only, your select would notreturn it.Each individual value in bar.phone will be returned as often as there isan identical value in foo.phone.


I trust that is close to what you expect.

My sample data here do not show what will happen if values in foo.phoneare not distinct - figure yourself.

Finding rows in one table that do not match a row in the other table iswildly inefficient:
mysql> explain select bar.phone from foo,bar where foo.phone!=bar.phone;
[[...]]


Your resulting data from this select would be:

  foo.phone   bar.phone
      1           2
      1           3
      2           1
      2           3
      3           1
      3           2

Again, foo.phone is shown for explanation only.


(This is the same for 'NOT', '!=', or '<>'.)


Correct.

The amount of work should be identical in both cases: grab a row, lookup by primary key in the other table, proceed.


No, it isn't:

A "select ... from foo, bar where CONDITION" effectively creates thecartesian product of both tables and then removes all lines(combinations) which do not meet the condition.

Of course, the system uses better strategies if possible, evaluatingindexes etc, but the resulting data will be the same.

Assuming tables "foo" and "bar" each have a column "num" with the values1 to 100, a condition "... where foo.num = bar.num" will lead to aresult with 100 rows.


But "... where foo.num != bar.num" will lead to a table of 9,900 rows:

For each of the 100 values in "foo.num", there will be 99 entries in"bar.num" that satisfy the inequality condition.

My real goal is to delete rows in the smaller table if there is no matchin the larger table:
    delete from bar using foo,bar where not bar.phone=foo.phone;


See above - wrong approach.

What you need is a subquery or an outer join, as proposed in the otherreplies.


[[...]]


HTH,
Jörg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: select does too much work to find rows where primary key does not match

Reply via email to