Hello Paul,

Paul Halliday wrote:
I have 2 tables:

1) Event Data
2) Mappings

The query should return something like this:

Hits      IP                      Country Code
20        213.136.52.29     SE

I am trying this:

SELECT COUNT(event.src_ip) AS count, INET_NTOA(event.src_ip),
mappings.cc FROM event, mappings WHERE event.timestamp BETWEEN
'2009-12-06 20:00' and '2009-12-07 20:00:00' AND event.src_ip BETWEEN
'mappings.start_ip' AND 'mappings.end_ip' GROUP BY event.src_ip ORDER
BY count DESC LIMIT 20;

Am I supposed to do a join somewhere? Do joins even apply in a where
clause? or am I totally off the mark.

Singularly, the queries look like this:

SELECT cc FROM mappings WHERE INET_ATON('src_ip') BETWEEN start_ip AND end_ip;

SELECT COUNT(src_ip) AS count, INET_NTOA(src_ip) FROM event WHERE
timestamp BETWEEN '2009-12-06 20:00' and '2009-12-07 20:00:00' GROUP
BY src_ip ORDER BY count DESC LIMIT 20;


Technically speaking, you are already doing a JOIN in your query. The comma operator in the FROM clause combined with the WHERE conditions make your query logically equivalent to the following rewrite:

SELECT COUNT(event.src_ip) AS count
, INET_NTOA(event.src_ip)
, mappings.cc
FROM event
INNER JOIN mappings
  ON event.src_ip BETWEEN 'mappings.start_ip' AND 'mappings.end_ip'
WHERE event.timestamp BETWEEN '2009-12-06 20:00' and '2009-12-07 20:00:00'
GROUP BY event.src_ip
ORDER BY count DESC
LIMIT 20;

However, as we document in the manual, we have demoted the execution precedence of the comma operator to form what I like to call an "implicit join" to be evaluated AFTER any explicit JOIN clauses.
http://dev.mysql.com/doc/refman/5.0/en/join.html

What this means is that you may get better performance out of an explicit join than you do an implicit join.

Also, you may want to consider rewriting your matching condition so that it can use an index or a combination of indexes on your `start_ip` and `end_ip` columns (notice my use of backticks ` ` not single quotes ' ' to identify column names) by rewriting your condition as an AND comparison instead of a BETWEEN comparison

ON event.src_ip >= mappings.start_ip
  AND event.src_ip <= mappings.end_ip

The way it is written now: "<col> BETWEEN <colA> and <colB>" is not one of the cases that we have an optimized and generalized execution plan to handle well. The separate conditions, like I wrote in my example, is an optimized situation and has a much better likelihood of using an index during its evaluation.

I would think that an index on both columns would work better than two single-column indexes.

ALTER TABLE event ADD KEY(src_ip, end_ip);

Warmest regards,
--
Shawn Green, MySQL Senior Support Engineer
Sun Microsystems, Inc.
Office: Blountville, TN



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to