Ah, yes, the MIN should be very helpful.  Can I expect that ordering the 
storage by (S, I) or having an (S, I) index will make that MIN take O(1) 
time, for both MyISAM and InnoDB?

I do not follow why you suggested a join to get the associated S, that can 
be done in the original query (and I did NOT say a given integer I is 
associated with only one string S):

SELECT a.s, a.i, MIN(b.i) AS j 
FROM t AS a 
JOIN t AS b ON b.i > a.i AND a.s = b.s
GROUP BY  a.i

Thanks,
Mike Spreitzer




Peter Brawley <peter.braw...@earthlink.net> 
06/20/09 12:39 PM
Please respond to
peter.braw...@earthlink.net


To
Mike Spreitzer/Watson/i...@ibmus
cc
mysql@lists.mysql.com
Subject
Re: how to efficiently query for the next in MySQL Community Edition 
5.1.34?






Mike,
>Yes, for each (S, I) pair the goal is to efficiently find the next 
largest 
>integer associated with S in T.  For the highest integer I associated 
with 
>S in T, there is no next larger.
Here's a more efficient query for the next i values with matching s 
values:

SELECT a.i, MIN(b.i) AS j 
FROM t AS a 
JOIN t AS b ON b.i > a.i AND a.s = b.s
GROUP BY  a.i

To fetch the matching s values, join the above to the original table:

SELECT n.i, t.s, n.j
FROM (
  SELECT a.i, MIN(b.i) AS j 
  FROM t AS a 
  JOIN t AS b ON b.i > a.i AND a.s = b.s
  GROUP BY  a.i
) AS n JOIN t USING (i);

PB

-----

Mike Spreitzer wrote: 
Yes, for each (S, I) pair the goal is to efficiently find the next largest 

integer associated with S in T.  For the highest integer I associated with 

S in T, there is no next larger.

Thanks,
Mike Spreitzer




Peter Brawley <peter.braw...@earthlink.net> 
06/20/09 08:56 AM
Please respond to
peter.braw...@earthlink.net


To
Mike Spreitzer/Watson/i...@ibmus
cc
mysql@lists.mysql.com
Subject
Re: how to efficiently query for the next in MySQL Community Edition 
5.1.34?






Mike

 
J holding the next integer that T has for S 
 

You mean for each i, the next value of i with that s?

 
(U having no row for the last integer of each string).
 

I do not understand that at all.

PB


Mike Spreitzer wrote: 
Suppose I have a table T with two column, S holding strings (say, 
VARCHAR(200)) and I holding integers.  No row appears twice.  A given 
string appears many times, on average about 100 times.  Suppose I have 
millions of rows.  I want to make a table U holding those same columns 
plus one more, J holding the next integer that T has for S (U having no 
row for the last integer of each string).  I could index T on (S,I) and 
write this query as

select t1.*, t2.I as J from T as t1, T as t2
where t1.S=t2.S and t1.I < t2.I
and not exists (select * from T as t12 where t12.S=t1.S and t1.I < t12.I 
and t12.I < t2.I)

but the query planner says this is quite expensive to run: it will 
enumerate all of T as t1, do a nested enumeration of all t2's entries for 
S=t1.S, and inside that do a further nested enumeration of t12's entries 
for S=t1.S --- costing about 10,000 times the size of T.  There has to be 
a better way!

Thanks,
Mike Spreitzer

 



No virus found in this incoming message.
Checked by AVG - www.avg.com 
Version: 8.5.364 / Virus Database: 270.12.80/2187 - Release Date: 06/19/09 

06:53:00

 

 



No virus found in this incoming message.
Checked by AVG - www.avg.com 
Version: 8.5.364 / Virus Database: 270.12.81/2189 - Release Date: 06/20/09 
06:15:00

 

Reply via email to