Asif Lodhi wrote:
Hi,

I have a query:
<reformatted>

INSERT INTO tmp2 (x)
SELECT  ((t3.m * 1000000) + b.id) AS x2
FROM tmp3 t3
LEFT JOIN (SELECT (MAX(x) - ((MAX(x) div 1000000) * 1000000)) + 1 AS pid
           FROM tmp2
           WHERE (x div 1000000) = 2147
           HAVING (MAX(x) - ((MAX(x) div 1000000) * 1000000)) < 483648) b
 ON t3.m=b.pid

First, the outer SELECT:

  SELECT  ((t3.m * 1000000) + b.id) AS x2

Table b has no column `id`, so I expect that should be

  SELECT  ((t3.m * 1000000) + b.pid) AS x2

Yes?

Now, the inner SELECT:

  SELECT (MAX(x) - ((MAX(x) div 1000000) * 1000000)) + 1 AS pid
  FROM tmp2
  WHERE (x div 1000000) = 2147
  HAVING (((MAX(x)) - ((MAX(x) div 1000000) * 1000000)) < 483648)

The SELECT clause:
As every row examined has x div 1000000 = 2147 (the WHERE condition), MAX(x) div
1000000 must be 2147, so (MAX(x) div 1000000) * 1000000) must be 2147000000.
Therefore,

   (MAX(x) - ((MAX(x) div 1000000) * 1000000)) + 1
  = MAX(x) - 2147000000 + 1
  = MAX(x) - 2146999999

Thus, we can simplify your select to

  SELECT MAX(x) - 2146999999 AS pid

Next, the WHERE clause:
Because the condition "(x div 1000000) = 2147" compares a function of x to a
constant, no index on column x can be used to find matching rows.  Always
compare the unmodified column to a constant or a range, if possible, so that an
index may be used.  In this case, "(x div 1000000) = 2147" is equivalent to

  x BETWEEN 2147000000 AND 2147999999

Written this way, an index on column x could be used to find matching rows.

Finally, the HAVING clause:
Most of the work here has already been done in the SELECT clause, so there's no
need to recompute.  That is,

  MAX(x) - ((MAX(x) div 1000000) * 1000000) < 483648

is equivalent to

  pid - 1 < 483648

which, of course, is the same as

  pid < 483649

Putting it all together yields

  SELECT MAX(x) - 2146999999 AS pid
  FROM tmp2
  WHERE x BETWEEN 2147000000 AND 2147999999
  HAVING pid < 483649

The table t3 has a single INT field "m" with a single record and
value: 2147.  The table tmp2 has a single INT field "x" with no
records.

Really?  You seem to be doing a lot of work to accomplish the equivalent of

  INSERT INTO tmp2 (x) VALUES (2147000000);

I suppose the intent must be to get a single query that works when tmp2 is empty
and later when tmp2 has rows (as a result of previous INSERTs).  How about this?

  INSERT INTO tmp2 (x)
  SELECT  t3.m * 1000000 + IFNULL(b.pid, 0) AS x2
  FROM tmp3 t3
  LEFT JOIN ( SELECT MAX(x) - 2146999999 AS pid
              FROM tmp2
              WHERE x BETWEEN 2147000000 AND 2147999999
              HAVING pid < 483649)
   ON t3.m = b.pid

The above code returns NULL in the 2nd column of the SELECT

What?  There is no second column of either select.

that I can't add to or multiply with another number to get the final
value to insert into the table tmp2.  I am using INNODB tables on
5.0.22 running on a WINDOWS-XP-SP2.  Around 75 Clients connect from
VB6/Windows.  I am STARTing TRANSACTIONs and COMMITing them from VB6
client-code.  Since I am also using SQL STRICT mode with more stricter
parameters in the MY.INI.

I don't want to use functions as that will impair the query speed.

Do you know of any way that I could use to get a numeric ZERO instead
of a null in the 2ND column of the SELECT?
<snip>

Again, there is no second column, but you can use IFNULL().

Asif Lodhi wrote in a second message:
Hi,

I am replying to my own post to tell you that now I am using
                     CASE WHEN {condition} THEN .... END
construct to check for NULL and returning numeric ZERO.  On the face of
it, CASE doesn't seem to be function - it's an operator - isn't?

I'd call it a function, but what's the difference?  Why do you believe operators
are faster than functions?  In any case,  MAX() is a function.  Perhaps you
won't mind adding an IFNULL() now that we've removed a MAX().

However, I would now like to ask you whether I can use some kind of an
Oracle-like  USE_INDEX optimization hint in MySQL to get it to use a
specific index in the SELECT?  I used explain on it and it tells me
that it's using index on all except on the the query I am using as the
2nd table.  Is there any way I can speed it up?

Something doesn't make sense here.  If tmp2 is empty, what difference does an
index make?  How would an index be used to speed up a query on an empty table?
And how could it be slow in the first place?

In any case, as I explained above, use of an index on column x in the inner
query was impossible because the WHERE condition compared a _function of the
column_ (a value calculated from the value of x, whether by a function or use of
an operator) to some value.  By rewriting the query to compare the actual value
of the column, use of an index becomes possible.

Any suggestions?

I get the impression you are implementing some sort of serial primary key where
the millions part means something.  Have you considered a multi-part primary 
key?

In any case, if you describe what you are trying to accomplish, someone is
likely to reply with a better way, if there is one.

Michael


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to