Re: Help with an SQL query

Gobi Sun, 06 Nov 2005 01:03:08 -0800

Rhino wrote:

I'm glad the explanation helped. I figured that the solution alone wouldn't
be very useful if it wasn't explained since it is not obvious to most people
how correlated queries work :-)


I really wasn't trying to single you out with my "rant" about the MySQL version.
It's been a long-running frustration for me; I've tried to urge people to
give this information in their notes to the list but an awful lot of people
never think to mention their version. Then, an already lengthy reply gets
even longer as you try to explain: "If you have Version 3.x, the solution is
this.... If you have Version 4.x the solution is this..... If you have
Version 5.x, the solution is this....." You get the picture.

Okay, let me explain the temp tables approach now.

Most subqueries are not correlated so the subquery gets done once and then
its result gets plugged into the outer query. For example, given a table
called Employee in the Sample database which has one row for each employee
and each row contains an employee number, last name, and salary, this query
will return the employee number and last name of every employee who makes
more than the average salary for all employees in the table:

---
select empno, lastname
from Sample.Employee
where salary >
(select avg(salary)
from Sample.Employee);
---

If subqueries are supported, as in Version 4.1.x or above of MySQL, you just
run that exact query and you will get a list of all of the people who make
more than the average salary.

However, if you are using an older version of MySQL which doesn't support
subqueries, you need to use a temporary table approach. Basically, you
create a temporary table that contains the same results that the subquery
would have given you, then join it to the outer query. [Many, but not all,
subqueries can be rewritten as joins.] For example:
---
#Create the temporary table that will contain the average salary for all
#employees.
create table if not exists temp_average_salary (average_salary
decimal(9,2));

#Populate the temporary table with the average salary for all employees.
insert into temp_average_salary (average_salary)
select avg(salary) from Sample.Employee;

#Do a join of the employee table and the temporary table to obtain only
#employees who make more than the average salary.
select e.empno, e.lastname
from Sample.Employee e inner join temp_average_salary t
on e.salary > t.average_salary;
---

This would give the same result as the original solution that used
subqueries although there is obviously more work to do in accomplishing the
desired result.

Initially, I was almost certain that this approach could not be made to work
for your problem due to its correlated nature but a couple of experiments
persuaded me that it was possible after all. It's actually quite similar to
the example I gave above, except that this time the temporary table is not
just a single value on a single row but a table containing one row for each
VBS_id, showing the max (latest) date for any row containing that VBS_id.
Here's the definition of the table and the correct method of populating the
table:

---
create table if not exists temp_table
(VBS_id int not null,
max_date date not null);

insert into temp_table (VBS_id, max_date)
select VBS_id, max(date) from VBS_table group by VBS_id;
---

If you display the contents of the temporary table, you get:

---
VBS_id    max_date
111111    2005-10-08
111112    2005-10-08
---

Now, you simply join the VBS_table and the temp_table, as follows:

---
select v.VBS_id, v.date, v.weight
from VBS_table v inner join temp_table t
on v.VBS_id = t.VBS_id
and v.date = t.max_date;
---

The result is exactly what you wanted:
---
VBS_id    date          weight
111111    2005-10-08    11.52
111112    2005-10-08    10.50
---

You could apply this solution to versions of MySQL that don't have subquery
support. Just remember that you need to re-create the temp_table (or empty
and repopulate it) immediately before you do the join each time; otherwise,
you will get whatever weights were the latest when you last populated the
temp_table, not the current weights.
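
A minimal sketch of that refresh-then-join sequence, assuming the same
VBS_table and temp_table definitions used above. Dropping and rebuilding the
table is only one way to do the refresh; emptying it with a delete and
re-running the insert should work just as well.

```sql
#Sketch only: rebuild the helper table so it reflects the current data.
drop table if exists temp_table;

create table temp_table
(VBS_id int not null,
max_date date not null);

#One row per VBS_id, holding the latest date recorded for that VBS_id.
insert into temp_table (VBS_id, max_date)
select VBS_id, max(date)
from VBS_table
group by VBS_id;

#Run the join immediately, while temp_table is still fresh.
select v.VBS_id, v.date, v.weight
from VBS_table v inner join temp_table t
on v.VBS_id = t.VBS_id
and v.date = t.max_date;
```

Running all four statements together as one script keeps the temp_table from
going stale between the refresh and the join.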

In your case though, you can make life better for yourself. Since you have
view support, you can simply create a view that is the equivalent of the
temp_table and then join the view to the VBS_table to get the data you want.
Since the view is always up-to-date, you eliminate the need to have the
temp_table at all. Something like this:

---
create view VBS_max_dates
(VBS_id, max_date)
as select VBS_id, max(date)
from VBS_table
group by VBS_id;

select v.VBS_id, v.date, v.weight
from VBS_table v inner join VBS_max_dates z
on v.VBS_id = z.VBS_id
and v.date = z.max_date;
---

Note that the view is only created once, not each time you are about to do
the join!!

One other solution jumps to mind that should be just as good. Instead of
creating temp tables or views, why not just put the original correlated
subquery that I gave you into a stored procedure? Then, all you'd need to do
is call that stored procedure every time you need that result.
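
A rough sketch of what that could look like, assuming MySQL 5.0 or later
(where stored procedures are available) and the mysql command-line client.
The procedure name latest_weights is made up for illustration, and the query
inside it is a reconstruction of a correlated subquery for this problem, so
treat it as an assumption rather than a copy of the earlier query.

```sql
#Sketch only. The delimiter lines are mysql client commands; they let the
#semicolon inside the procedure body pass through to the server.
delimiter //

#latest_weights is a hypothetical name.
create procedure latest_weights()
begin
  #For each row, keep it only if its date is the latest for its VBS_id.
  select v.VBS_id, v.date, v.weight
  from VBS_table v
  where v.date =
    (select max(v2.date)
     from VBS_table v2
     where v2.VBS_id = v.VBS_id);
end//

delimiter ;

#Run this each time the current weights are needed.
call latest_weights();
```

Like the view, the procedure is created once; only the call is repeated, and
each call reads the current contents of VBS_table.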

Rhino

Rhino, I really appreciate your detailed explanation. It clarifies a lot of
things for me. I was thinking of going with the view method even before you
mentioned it. However, your stored procedure method sounds even better. The
only problem is I haven't used stored procedures before, so I will need to
read up on them and find out how to call one from a JSP page.


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
