Hi all!

(Sorry for the late reply.)

Puiu Hrenciuc wrote (re-ordered):
"Barry" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
Puiu Hrenciuc wrote:
Hi,

I have a table ( MyISAM, MySQL 5.0.20, FreeBSD ) containing network traffic data :

record_time datetime  - time when recording was added
ip char(15) - ip that generated the traffic
type tinyint(3) - traffic type ( 1 - local, 2 - internet )
inbound int(10) - in bytes
outbound int(10) - out bytes

Records are inserted each 5 minutes through a cron script.
Currently there are 3,330,367 rows.

Primary index is defined on ( ip, type, record_time ), columns in that order.
Also there is an index defined only on record_time

Now for an example to get traffic for this month, I use :

SELECT `ip`,`type`,SUM(`inbound`) AS `in`, SUM(`outbound`) as `out`
FROM `accounting`
WHERE `record_time` BETWEEN 20060401000000 AND 20060413160000
GROUP BY `ip`,`type`

this query takes aprox 7 seconds

Using EXPLAIN gives :

select_type table type possible_keys key key_len ref rows Extra SIMPLE accounting range record_time record_time 8 NULL 362410 Using where; Using temporary; Using filesort

If I remove the SUM functions I am getting also "Using index in group-by"
and the query takes only 0.25 sec

Is there anyway to optimize this query to get faster responses ?


For the original query, the index on "record_time" can be used to select the records to be considered, and the base table must be accessed to sum the "inbound" and "outbound" values.

Without the summing, all information needed for the answer is in the primary key, so an "index-only" strategy can be used (does not need the base table, profits from key cache, ...). Also, by the "group by" clause each distinct combination of (ip, type) is needed only once, so the amount of data that need to be handled is much smaller.

The summing must make a big difference, this cannot be avoided.

You _might_ try an index (record_time, ip, type), because here the leading (= most significant) part can be used for your "where condition", and the next components support the "group by", so there is a slight chance to avoid the sorting.
Disclaimer: This is pure speculation from my part!



Set an index on ip and type and "probably" also on record_time
>
> Hmmm, I have omited that :
>
> I also have an index on (ip,type) in that order


The index on (ip,type) is a prefix of the primary key (ip, type, record_time), and in general any prefix of an existing key can be used.

In other words: I doubt whether there are any statements that make use of this additional index but would not make the same use of the primary key. But as an index needs to be maintained on each insert/update/delete, IMO all this additional index does is slow down these statements (and take up disk space).

You might reduce workload by dropping that index, but it has no effect on the two queries (except reduced file size, amount of data, and all these general consequences).


Regards,
Joerg

--
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com
Office:  (+49 30) 417 01 487     VoIP: [EMAIL PROTECTED]

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to