Re: Help with an SQL query

Rhino Sat, 05 Nov 2005 09:50:33 -0800

I'm glad the explanation helped. I figured that the solution alone wouldn't
be very useful if it wasn't explained since it is not obvious to most people
how correlated queries work :-)

I really wasn't trying to center you out with my "rant" about MySQL version.
It's been a long-running frustration for me; I've tried to urge people to
give this information in their notes to the list but an awful lot of people
never think to mention their version. Then, an already lengthy reply gets
even longer as you try to explain: "If you have Version 3.x, the solution is
this.... If you have Version 4.x the solution is this..... If you have
Version 5.x, the solution is this....." You get the picture.

Okay, let me explain the temp tables approach now.

Most subqueries are not correlated so the subquery gets done once and then
its result gets plugged into the outer query. For example, given a table
called Employee in the Sample database which has one row for each employee
and each row contains an employee number, last name, and salary, this query
will return the employee number and last name of every employee who makes
more than the average salary for all employees in the table:

---
select empno, lastname
from Sample.Employee
where salary >
(select avg(salary)
from Sample.Employee);
---

If subqueries are supported, as in Version 4.1.x or above of MySQL, you just
run that exact query and you will get a list of all of the people who make
more than the average salary.

However, if you are using an older version of MySQL which doesn't support
subqueries, you need to use a temporary table approach. Basically, you
create a temporary table that contains the same results that the subquery
would have given you, then join it to the outer query. [Many, but not all,
subqueries can be rewritten as joins.] For example:
---
#Create the temporary table that will contain the average salary for all
employees.
create table if not exists temp_average_salary (average_salary
decimal(9,2));

#Populate the temporary table with the average salary for all employees.
insert into temp_average_salary(select avg(salary) from Sample.Employee);

#Do a join of the employee table and the temporary table to obtain only
employees who make
#more than the average salary.
select empno, lastname, salary
from Sample.Employee e inner join temp_average_salary t
where e.salary > t.average_salary;
---

This would give the same result as the original solution that used
subqueries although there is obviously more work to do in accomplishing the
desired result.

Initially, I was almost certain that this approach could not be made to work
for your problem due to its correlated nature but a couple of experiments
persuaded me that it was possible after all. It's actually quite similar to
the example I gave above, except that this time the temporary table is not
just a single value on a single row but a table containing one row for each
VBS_id, showing the max (latest) date for any row containing that VBS_id.
Here's the definition of the table and the correct method of populating the
table:

---
create table if not exists temp_table
(VBS_id int not null,
max_date date not null);

insert into temp_table (select VBS_id, max(date) from VBS_table group by
VBS_id);
---

If you display the contents of the temporary table, you get:

---
VBS_id    max_date
111111    2005-10-08
111112    2005-10-08

---
Now, you simply join the VBS_table and the temp_table, as follows:
---
select v.VBS_id, v.date, v.weight
from VBS_table v inner join temp_table t
on v.VBS_id = t.VBS_id
and v.date = t.max_date;
---

The result is exactly what you wanted:
---
VBS_id    date                weight
111111    2005-10-08    11.52
111112    2005-10-08    10.50
---

You could apply this solution to versions of MySQL that don't have subquery
support; just remember that you need to re-create the temp_table immediately
before you do the join each time; otherwise, you are going to determine
whatever weights were in effect whenever you first populated the temp_table,
not the current weights.

In your case though, you can make life better for yourself. Since you have
view support, you can simply create a view that is the equivalent to the
temp_table and then join the view to the VBS_table to get the data you want.
Since the view is always up-to-date, you eliminate the need to have the
temp_table at all. Something like this:

---
create view VBS_max_dates
(VBS_id, max_date)
as select VBS_id, max(date)
from VBS_table
group by VBS_id;

select v.VBS_id, v.date, v.weight
from VBS_table v inner join VBS_max_dates z
on v.VBS_id = z.VBS_id
and v.date = z.max_date;
---

Note that the view is only created once, not each time you are about to do
the join!!

One other solution jumps to mind that should be just as good. Instead of
creating temp tables or views, why not just put the original correlated
subquery that I gave you into a stored procedure? Then, all you'd need to do
is call that stored procedure every time you need that result.

Rhino

----- Original Message ----- 
From: "Gobi" <[EMAIL PROTECTED]>
To: <mysql@lists.mysql.com>
Sent: Saturday, November 05, 2005 8:05 AM
Subject: Re: Help with an SQL query

> Rhino wrote:
>
> >I can't test this in MySQL- I'm using an older version of MySQL that
doesn't
> >support subqueries - but it works in DB2 and it should do the trick for
> >getting the current weight of each VBS_id value:
> >
> >select VBS_id, date, weight
> >from VBS_table x
> >where date =
> >(select max(date) from VBS_table
> >where VBS_id = x.VBS_id);
> >
> >I'm assuming you are using a more version of MySQL which DOES support
> >subqueries! If not, you may be able to get the same result with temp
tables
> >but I haven't tried that. Make sure to say something if you don't know
how
> >to use temp tables to simulate subqueries. I'm dubious that this query
can
> >be simulated with temp tables though....
> >
> >(Rant: I REALLY wish people (all people, not just you) posting questions
to
> >this mailing list would get in the habit of specifying which version of
> >MySQL they are using!! The answers to questions FREQUENTLY depend on the
> >MySQL version so it would really help reduce the size of answers if
people
> >volunteered this information in the original question.)
> >
> >Also, I'm assuming that that MySQL supports correlated subqueries; I'm
> >really not sure so I'll let you try the actual query and see if it works
for
> >you.
> >
> >Here's an explanation of how this query works, in case you've never seen
one
> >like this before.
> >
> >This is called a correlated subquery; the key thing that makes it obvious
> >that this is a correlated subquery (in case you have to recognize one) is
> >that a correlation name, in this case 'x', appears in the FROM clause of
the
> >outer query and again in the WHERE clause of the subquery. The subquery
> >appears in brackets in case you are not familiar with subqueries.
> >
> >A correlated subquery works backwards to a normal subquery. In a normal
> >subquery, the subquery is executed first and is only executed once: its
> >result is plugged into the outer query which then executes just once as
> >well. In a correlated subquery, both the outer query and the subquery are
> >executed _repeatedly_, possibly MANY times each.
> >
> >The outer query is executed in order to obtain one row, THEN the subquery
is
> >executed to see if the row found by the outer query can be kept. In this
> >case, let's say that the outer query returned the first row of the table,
> >which has a VBS_id of 111111 and a date of '10/3/2005': the subquery
> >determines the maximum (most recent) date for any row that has same
VBS_id
> >as the one just found by the outer query; if the maximum date differs
from
> >the date found by the outer query, the outer query row is discarded and
does
> >not appear in the final result set. In this case, the maximum date for
> >VBS_ID is 10/8/2005 which is not equal to the value found by the outer
> >query, so that row is discarded.
> >
> >Having dealt with the first row of the outer query, the outer query
executes
> >again and gets another row. Again, the subquery is executed to see if the
> >date is the same as maximum date for that VBS_id and again, the outer row
is
> >only kept if its date matches the maximum date found by the subquery. And
so
> >it goes, one row at a time, until the outer query has read every row of
the
> >table; a single row of the outer query is obtained, then the subquery
> >determines if that row contains the maximum date for the VBS_id that was
> >just found by the outer query. The final result will contain only rows
that
> >have the maximum dates for each VBS_id and will show the VBS_id, the
maximum
> >date, and the weight at that date. That is the result you requested.
> >
> >Bear in mind that a correlated query can perform much more poorly than a
> >regular subquery, although the optimizer sometimes has tricks that make
the
> >performance quite tolerable.
> >
> >There may be a solution that doesn't involve a correlated subquery but
it's
> >not coming to me. However, others who are better at SQL may think of
> >something that does the job.
> >
> >Rhino
> >
>
> Wow, excellent explanation.  By the way, my apologies... I am using
> MySQL 5.0.15.  I would be interested in knowing how to use temp tables
> to simulate subqueries though.  I found out that the sql statement,
> although it returns the correct result, does not work in a view because
> MySQL Views does not allow subqueries.
>
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
>
>
> -- 
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.1.362 / Virus Database: 267.12.8/161 - Release Date: 03/11/2005
>
>

-- 
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.362 / Virus Database: 267.12.8/161 - Release Date: 03/11/2005

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: Help with an SQL query

Reply via email to