Re: Optimizing SQL statement

Joerg Bruehe Thu, 20 Apr 2006 07:14:45 -0700

Hi all!

(Sorry for the late reply.)


Puiu Hrenciuc wrote (re-ordered):

"Barry" <[EMAIL PROTECTED]> wrote in messagenews:[EMAIL PROTECTED]
Puiu Hrenciuc wrote:
Hi,
I have a table ( MyISAM, MySQL 5.0.20, FreeBSD ) containing networktraffic data :
record_time datetime  - time when recording was added
ip char(15) - ip that generated the traffic
type tinyint(3) - traffic type ( 1 - local, 2 - internet )
inbound int(10) - in bytes
outbound int(10) - out bytes

Records are inserted each 5 minutes through a cron script.
Currently there are 3,330,367 rows.
Primary index is defined on ( ip, type, record_time ), columns in thatorder.
Also there is an index defined only on record_time

Now for an example to get traffic for this month, I use :

SELECT `ip`,`type`,SUM(`inbound`) AS `in`, SUM(`outbound`) as `out`
FROM `accounting`
WHERE `record_time` BETWEEN 20060401000000 AND 20060413160000
GROUP BY `ip`,`type`

this query takes aprox 7 seconds

Using EXPLAIN gives :
select_type table type possible_keys keykey_len ref rows ExtraSIMPLE accounting range record_time record_time 8NULL 362410 Using where; Using temporary; Using filesort
If I remove the SUM functions I am getting also "Using index in group-by"
and the query takes only 0.25 sec

Is there anyway to optimize this query to get faster responses ?

For the original query, the index on "record_time" can be used to selectthe records to be considered, and the base table must be accessed to sumthe "inbound" and "outbound" values.

Without the summing, all information needed for the answer is in theprimary key, so an "index-only" strategy can be used (does not need thebase table, profits from key cache, ...).Also, by the "group by" clause each distinct combination of (ip, type)is needed only once, so the amount of data that need to be handled ismuch smaller.


The summing must make a big difference, this cannot be avoided.

You _might_ try an index (record_time, ip, type), because here theleading (= most significant) part can be used for your "wherecondition", and the next components support the "group by", so there isa slight chance to avoid the sorting.

Disclaimer: This is pure speculation from my part!

Set an index on ip and type and "probably" also on record_time

>
> Hmmm, I have omited that :
>
> I also have an index on (ip,type) in that order

The index on (ip,type) is a prefix of the primary key (ip, type,record_time), and in general any prefix of an existing key can be used.

In other words: I doubt whether there are any statements that make useof this additional index but would not make the same use of the primary key.But as an index needs to be maintained on each insert/update/delete, IMOall this additional index does is slow down these statements (and takeup disk space).

You might reduce workload by dropping that index, but it has no effecton the two queries (except reduced file size, amount of data, and allthese general consequences).



Regards,
Joerg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com
Office:  (+49 30) 417 01 487     VoIP: [EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: Optimizing SQL statement

Reply via email to