Jonathan Mangin wrote:
I have a simple query, (part of a union).
calendar is a table of only dates, `date` as PK.
sales has a multi-col index on (date, uid).
(select calendar.date as date,
time_format(time(sales.date),'%H:%i') as time,
from calendar
left join sales
on date(sales.date) = calendar.date
and sales.uid = 'george'
where calendar.date between '2005-08-01' and '2005-08-14')
First, I don't quite understand joins(!) on constants (sales.uid =
'george').
Is this proper, or should uid be included in where clause?
This query makes perfect sense. For each day in the first two weeks of August
(matching rows in calendar), you want a report of sales to/by george. The
following simpler query does most of that job:
SELECT calendar.date AS date,
time_format(time(sales.date),'%H:%i') as time,
FROM calendar
JOIN sales ON date(sales.date) = calendar.date
WHERE calendar.date BETWEEN '2005-08-01' AND '2005-08-14'
AND sales.uid = 'george'
but it doesn't report on days where george has no sales. You fix that by
changing to a LEFT JOIN, which gives one row of output for each calendar day
that has no match in sales with NULLs in the sales columns. The key here is
that you must move the sales.uid = 'george' to the ON clause, because it is part
of the requirement for determing matching rows in sales. If you leave it in the
WHERE clause, it will undo the effect of the LEFT JOIN, because the WHERE clause
conditions on the table on the right, sales, happen after matching rows are
found or NULL rows are made. You see? If george has no sales on a particular
calendar day, you get a column with NULLs on the sales side if both conditions
are in the ON clause, but if the 'george' requirement is in the WHERE clause,
days with no 'george' sales will be filtered out. Hence the query you've given
above.
Second, explain shows no possible keys for sales.
id select_type table type possible_keys key key_len
ref rows Extra
1 PRIMARY calendar range PRIMARY PRIMARY 3 NULL
11 Using where; Using index
1 PRIMARY sales ALL NULL NULL NULL NULL 23
Some versions of my query showed possible_keys, but I can't seem to
find what I did and I'm frustrated. What am I doing wrong here?
The problem is this "date(sales.date) = calendar.date". You've run sales.date
through a functon, so the index on sales.date cannot be used. To use the index,
you must rewrite the query so the actual column value is used. An index on
sales.uid could have been used, but you don't have one (sales.uid is part of an
index, but that part can't be used if the leftmost ppart of the index,
sales.date, is not being used).
You haven't told us what kind of column sales.date is, so it's hard to say what
to do. Ideally, if you are going to be using it to join to calendar, it should
be a DATE. In that case, you can simply do
SELECT calendar.date AS date,
time_format(time(sales.date),'%H:%i') as time,
FROM calendar
LEFT JOIN sales ON sales.date = calendar.date
AND sales.uid = 'george'
WHERE calendar.date BETWEEN '2005-08-01' AND '2005-08-14'
I'm guessing that sales.date is not a DATE, though. If it's a DATETIME or a
TIMESTAMP, one possibility would be to replace it with two columns, a DATE and a
TIME. That would help this query, but may not work well with the rest of your
design. You'll have to decide.
If it doesn't make sense to change sales.date, there are two things you can do
that should help.
1) Add an index on sales.uid. I think the best way to do that would be to
replace your multi-column index on (date, uid) with one on (uid, date). That
would help this query, and would help any query which selects all dates for a
given uid. I expect you also want reports for all sales in a given date range
for all uids. In that case, you would also want a single-column index on date.
So, I'm suggesting something like
ALTER TABLE sales
DROP INDEX name_of_the_index_on_date_and_uid,
ADD INDEX (uid, date),
ADD INDEX (date);
2) Add a condition to your query to help mysql use the index on sales.date. The
trick is to add it to the ON clause, because the WHERE clause is too late to
pre-select rows on the right. So
SELECT calendar.date AS date,
time_format(time(sales.date),'%H:%i') as time,
FROM calendar
LEFT JOIN sales ON sales.date = calendar.date
AND sales.uid = 'george'
AND sales.date BETWEEN '2005-08-01' AND '2005-08-14 23:59:59'
WHERE calendar.date BETWEEN '2005-08-01' and '2005-08-14'
This would work best with a multi-column index on (uid, date). Because the
sales.date restriction is a range, it would only use your current multi-column
index on (date, uid) as if it were a single-column index on date.
Thanks.
Michael
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]