subject:"Stop query on first match"

Re: Stop query on first match

2004-06-12 Thread Michael Stassen

Emmanuel van der Meulen wrote:
On Monday, June 07, 2004 19:09, Michael Stassen wrote;
This doesn't quite make sense.  You seem to say that several rows
will match but then you say only one will.  It must be one or the 
other.  Perhaps I've misunderstood you.
Firstly, Michael thank you for your assistance; I'll elaborate.
Michael I'll attempt first without supplying the table definitions and
actual sql and etc..
Table 'a' contains a property which does not match any property on table 'b'
directly, but matches within a range.
Example;
table 'a' property 'num'
num
---
1000
2000
3000
table 'b' properties 'fromNum' and 'toNum'
   fromNum toNum
row 1.   1  1500
row 2.1501  2000
row 3.2001  4000
select... where 'num' between 'fromNum' and 'toNum'.
As seen in this example, using between only one row actually matches,
however the query engine would at first see more than one matching row i.e.,
What do you mean by "at first"?
with 'fromNum' (between is same as num >= fromNum), value 2000 from table a
would match row 1 & 2 on table b; and with 'toNum' (between is same as num
<= toNum), value 2000 from table a would match row 2 & 3 on table b; only
once the range is taken together the result matches one row, viz., row 2 on
table b.
This analysis is not quite right.  You seem to be saying that rows which 
match half of the condition will be retrieved, then filtered according to 
the other half of the condition, but that's not how it works.  If it did, 
BETWEEN would always be slow, but BETWEEN can be very fast, if the range is 
a constant and the column is indexed.  Assuming indexes on num, fromNum, and 
toNum, you should be able to verify that

  SELECT * FROM table_a WHERE num BETWEEN 1501 AND 2000;
  SELECT * FROM table_b WHERE fromNum BETWEEN 1 AND 3000;
  SELECT * FROM table_b WHERE toNum BETWEEN 1500 AND 4000;
are all very quick.  In your case, "num BETWEEN fromNum AND toNum" is the 
same as "num >= fromNum AND num <= toNum".  There is only one matching row. 
 The problem is finding it.

To find which rows in table b match, we have to look at each row and compare 
the values of num, fromNum, and toNum.  Indexes won't help much here, 
because each column's index is unlikely to narrow the list of possible 
matches much.  If the optimizer could put 2 indexes together, it might see 
that fromNum matches from row 23 up, while toNum matches from rows 23 down, 
so row 23 is it, but the optimizer doesn't combine indexes -- it chooses the 
best (most restrictive) one.  Even then, it only uses the index if it 
restricts us to no more than 30% of the rows.  Furthermore, the value of num 
changes with each row of table a.  The result is an unavoidable full table 
scan on table b.

In other words, the problem is not, strictly speaking, the BETWEEN.
Now please remember table b has 1.4 million rows as in this example, with
fromNum and toNum running consecutively, so the query takes 4 seconds to
find a row in table b.  With say 200 rows in table a, that means the query
runs for a long time.
That's 4 seconds to scan all of table b, rather than 4 seconds to find one 
row.  A subtle but important distinction.

Is that right?  You have 1.4 million ranges?  Wow!  Up till now, I've been 
assuming you had a lot of data in table a to be assigned one of a relatively 
few ranges from table b.  Apparently, I've been picturing it backwards.

I'm not sure there's a way to improve this specific query, but there may be 
another way to accomplish the same thing which works better.  It's hard to 
say what that might be without a better picture of what you're doing than I 
currently have.  Perhaps if you described your data someone could make a 
suggestion.  I'm still assuming table b must be static, right?  Otherwise 
you couldn't guarantee uniqueness of ranges, I think.  On the other hand, 
different rows in table a could be in the same range (match the same row of 
table b), right?

What I did was to use limit 1, and ran query with 1 row, this took .01
second.
I assume by "ran query with 1 row" you mean one row from table a.  Now num 
is effectively a constant, and LIMIT 1 short circuits the full table scan as 
soon as one match is found.

You also seem to imply that with BETWEEN you get a full
table scan even though there is only one match for each row.  That sounds
like an indexing problem, but it is hard to say without more information.
Indexed individually on;
fromNum
toNum
Also experimented by adding combining index on;
fromNum/toNum
Either way no difference, query runs 4 seconds.
It would help us help you if you at least posted the query and the results
of EXPLAIN.  It would probably also help if you told us more about the
tables, perhaps with SHOW CREATE TABLE.
If above does not help you, I'll bring the whole lot to the post.
Well, the more info, the better.  At least describe the data a little more. 
 Depending on exactly what you want, there may be a way to change the 
process to make this work at an acceptable speed.

RE: Stop query on first match

2004-06-10 Thread Emmanuel van der Meulen

On Monday, June 07, 2004 20:23, Justin Swanhart wrote

>> It can only ever return one row from table b, for
>> each row on table a, due
>> to the contents that is stored in table b.  The
>> table contains in excess of
>> a million records.  What happens as a result of the

>I assume that you have multiple rows in table b that
>match the primary key on table a, but only one of
>those rows in b will be satisfied by your between
>condition of the query.

Hello Justin, fisrtly, thank you for your assistance.

Then, as I posted previously, keys do not match, however, a property on
table a matches only in respect of being between two properties on table b.

>First off, make sure that you have an index on table b
>that matches the primary key in table a.  Make sure
>the data types and lengths are the same for both keys.

Please see previous post.

> Secondly, make sure that your key cache is large
> enough  so that the keys to the tables are buffered.

I'm not sure it is this.

> If these assumptions are not correct, then you will
> need to post the query, the explain of the query, and
> the "show create table" for each table in the query so
> that we have a better idea what you are dealing with.

Please see previous post.

>> I've looked in several books and searched Google but
>> cannot get a way of
>> doing this.  It seems Oracle has a 'FIRST' in their
>> select which they use
>> for such a use case.  But I do not see anything for
>> MySql anywhere.

> Oracle's FIRST_ROWS simply tells the optimizer to
> prefer index scans over a FTS where lots of rows may
> be returned by the query.  It won't modify the
> behavior of a query in the way you want it to.

Thank you for info.

> The only way I can think to implement the behavior
> that you are looking for is to:
> 1) get all the records from "a" that you need
> 2) execute a second select for each row in "a" on "b"
> with your between condition and a LIMIT clause so that
> only one row is returned.

Until 4.1 when subselects become avaialable, I suspect this is a workaround.

> Still, if multiple rows
> exist in "b" that match the PK on "a", those rows will
> probably be scanned in filesort order, which means
> that multiple rows will be looked at unless your
> between matches the first row inserted, except if you
> have an appropriate index.

Please see previous post.

Justin, again, thank you for your asistance.

Kind regrds
Emmanuel


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Stop query on first match

2004-06-10 Thread Emmanuel van der Meulen

Hello all,

Thank you to everyone who assisted me.

I'm wondering about my theory around the between as posted previously.

Can anyone provide further insights regarding that theory?

Kind regards
Emmanuel

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

RE: Stop query on first match

2004-06-09 Thread Emmanuel van der Meulen

On Monday, June 07, 2004 19:09, Michael Stassen wrote;

> This doesn't quite make sense.  You seem to say that several rows
> will match
> but then you say only one will.  It must be one or the other.
> Perhaps I've
> misunderstood you.

Firstly, Micheal thank you for your assistance; I'll elaborate.

Michael I'll attempt first without supplying the table definitions and
actual sql and etc..

Table 'a' contains a property which does not match any property on table 'b'
directly, but matches within a range.

Example;

table 'a' property 'num'

num
---
1000
2000
3000

table 'b' properties 'fromNum' and 'toNum'
   fromNum toNum
row 1.   1  1500
row 2.1501  2000
row 3.2001  4000

select... where 'num' between 'fromNum' and 'toNum'.

As seen in this example, using between only one row actually matches,
however the query engine would at first see more than one matching row i.e.,
with 'fromNum' (between is same as num >= fromNum), value 2000 from table a
would match row 1 & 2 on table b; and with 'toNum' (between is same as num
<= toNum), value 2000 from table a would match row 2 & 3 on table b; only
once the range is taken together the result matches one row, viz., row 2 on
table b.

Now please remember table b has 1.4 million rows as in this example, with
fromNum and toNum running consecutively, so the query takes 4 seconds to
find a row in table b.  With say 200 rows in table a, that means the query
runs for a long time.

What I did was to use limit 1, and ran query with 1 row, this took .01
second.

> You also seem to imply that with BETWEEN you
> get a full
> table scan even though there is only one match for each row.  That sounds
> like an indexing problem, but it is hard to say without more information.

Indexed individually on;
fromNum
toNum

Also experimented by adding combining index on;
fromNum/toNum

Either way no difference, query runs 4 seconds.

> It would help us help you if you at least posted the query and
> the results
> of EXPLAIN.  It would probably also help if you told us more about the
> tables, perhaps with SHOW CREATE TABLE.

If above does not help you, I'll bring the whole lot to the post.

Michael, again thank you for you assistance.

Kind regards
Emmanuel


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

RE: Stop query on first match

2004-06-07 Thread Emmanuel van der Meulen

On Monday, June 07, 2004 07:54, Jeff Smelser wrote;

> On Monday 07 June 2004 12:49 am, Emmanuel van der Meulen wrote:
> > I've looked in several books and searched Google but cannot get a way of
> > doing this.  It seems Oracle has a 'FIRST' in their select
> which they use
> > for such a use case.  But I do not see anything for MySql anywhere.
>
> If your using a later mysql, you could use sub query with limit.
>
> Other then that, splitting the query is the only think off hand.

Thank you for you note and valuable feedback.

Unfortunately I'm planning to go live before MySql 4.1 is production release
so I need to find a work around.

Do I understand your suggestion about splitting the query; I would first
select all required rows from table a, then take each retrieved row from
table and (in a loop in my java program) select from table b, using limit?

Kind regards
Emmanuel

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Stop query on first match

2004-06-07 Thread Michael Stassen

This doesn't quite make sense.  You seem to say that several rows will match 
but then you say only one will.  It must be one or the other.  Perhaps I've 
misunderstood you.  You also seem to imply that with BETWEEN you get a full 
table scan even though there is only one match for each row.  That sounds 
like an indexing problem, but it is hard to say without more information. 
It would help us help you if you at least posted the query and the results 
of EXPLAIN.  It would probably also help if you told us more about the 
tables, perhaps with SHOW CREATE TABLE.

Michael
Emmanuel van der Meulen wrote:
Hello all,
I see a similar question was asked before, but it was not answered.  I hope
someone can assist me.
My query uses two tables. The query selects one row on table b for each row
on table a, but uses between in the select.
It can only ever return one row from table b, for each row on table a, due
to the contents that is stored in table b.  The table contains in excess of
a million records.  What happens as a result of the between is that for the
query, several rows seem to be candidates on table b, but once the query
evaluates and sifts through the candidate rows on table b, only one row will
ever match.  So if I could inform MySql to stop the query for the particular
row, once one row on table b matches the row on table a, the query would
return hundreds of times faster.  As an experiment I took one example and
used limit and the query reduced from 4 secs to .01 sec.  However, when
doing the 'live' query, I cannot use limit because, I do not want overall
only 1 row returned, I want one row returned for each of the rows from table
a which has 1 match each on table b.
I've looked in several books and searched Google but cannot get a way of
doing this.  It seems Oracle has a 'FIRST' in their select which they use
for such a use case.  But I do not see anything for MySql anywhere.
Could someone please assist me.
Kind regards
Emmanuel


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Stop query on first match

2004-06-06 Thread Jeff Smelser

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Monday 07 June 2004 12:49 am, Emmanuel van der Meulen wrote:
> I've looked in several books and searched Google but cannot get a way of
> doing this.  It seems Oracle has a 'FIRST' in their select which they use
> for such a use case.  But I do not see anything for MySql anywhere.

If your using a later mysql, you could use sub query with limit.

Other then that, splitting the query is the only think off hand.

- -- 
 I am under the influence of sugar, caffeine, and lack of sleep, and
   should not be held responsible for my behavior.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFAxAMEld4MRA3gEwYRAgc5AJ0behkfDwijlpSANX1oXsRfNtwmKgCeOF4b
j9366DtZYNuo2j0aTvQsudY=
=ecrd
-END PGP SIGNATURE-

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Stop query on first match

2004-06-06 Thread Emmanuel van der Meulen

Hello all,

I see a similar question was asked before, but it was not answered.  I hope
someone can assist me.

My query uses two tables. The query selects one row on table b for each row
on table a, but uses between in the select.

It can only ever return one row from table b, for each row on table a, due
to the contents that is stored in table b.  The table contains in excess of
a million records.  What happens as a result of the between is that for the
query, several rows seem to be candidates on table b, but once the query
evaluates and sifts through the candidate rows on table b, only one row will
ever match.  So if I could inform MySql to stop the query for the particular
row, once one row on table b matches the row on table a, the query would
return hundreds of times faster.  As an experiment I took one example and
used limit and the query reduced from 4 secs to .01 sec.  However, when
doing the 'live' query, I cannot use limit because, I do not want overall
only 1 row returned, I want one row returned for each of the rows from table
a which has 1 match each on table b.

I've looked in several books and searched Google but cannot get a way of
doing this.  It seems Oracle has a 'FIRST' in their select which they use
for such a use case.  But I do not see anything for MySql anywhere.

Could someone please assist me.

Kind regards
Emmanuel


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]

Re: Stop query on first match

RE: Stop query on first match

Re: Stop query on first match

RE: Stop query on first match

RE: Stop query on first match

Re: Stop query on first match

Re: Stop query on first match

Stop query on first match

8 matches

Site Navigation

Mail list logo

Footer information