I have a situation where a query using indexing takes about 2x longer
than non-indexed to complete. I'd greatly appreciate any suggestions as
to how to improve the results or, if not possible, an explanation for
the results (ex. My data represents a worst-case scenario for MySQL
indexing algorithm.)
Table Details----------
-Schema:
timestamp CHAR(14) not null,
f2 CHAR(2) not null,
category CHAR(12) not null,
f4 CHAR(10) not null,
f5 CHAR(13) not null,
f6 CHAR(2) not null,
f7 CHAR(5)
-timestamp is of the form yyyymmddhhmmss
-Table comprised of 3.5 million records of production data.
-timestamp has 1 of 86,400 unique values
-category has 1 of ~9,000 unique values
-Table is loaded using INFILE, then the indexing timestamp and category
fields are indexed separately.
-The raw data has been pre-sorted on the timestamp field in ascending
order.
Test Query-----------
select timestamp,f4 from TABLE_NAME where
timestamp between '19981117000000' and '19981117010000'
and category = 'ABC'
Returns 12,685 records in 58 seconds non-indexed and 114 seconds indexed
averaged over ten iterations of the test query.
Operating Environment------
Perl 5.004_04
-MySQL Host:
Running tests on: SunOS 5.6 sun4u
Server version: MySQL 3.21.33b
SPARCserver 1000
Main Memory 128 MB
Virtual Memory is 461 MB
-MySQL Variables:
| join_buffer | 131072
| key_buffer | 1048568
| max_allowed_packet |
65536
| max_connections |
90
| max_join_size |
4294967295
| max_sort_length |
1024
| net_buffer_length | 8192
| record_buffer | 131072
| skip_locking | 0
| sort_buffer |
2097144
| table_cache |
64
| thread_stack |
65536
| tmp_table_size | 1048576
-All testing was run after hours on a load free client and mysql server
machines. Network traffic was negligible.
-When in production, the table will be loaded only once per day during
off peak hours (no other inserts or updates after bulk load). I can
expect ~50 queries / day resembling with each query resembling the test
query.
Actions Taken / Results------
-Verified that indexing was correct and actively being used by using
command line mysql and EXPLAIN. Also verified results by querying
against a smaller (100,000) record table which expectedly resulted in
the indexed query returning faster than the non-indexed query.
mysql> explain select timestamp,f4 from PILC where timestamp between
'19981117000000' and '19981117010000' and category = 'FRAME_CHK';
| table | type | possible_keys | key | key_len | ref | rows |
Extra |
| PILC | ref | timendx,catendx | abcndx | 12 | FRAME_CHK | 11 |
range used on key timendx |
-From command line mysql:
indexed: 12658 rows in set (1 min 19.66 sec)
non-indexed: 12658 rows in set (2 min 5.91 sec)
-Modified MySQL parameters as indicated in Chapter 10 of documentation:
Run #1: record_buffer 131072
Query returns in 54.6 secs (averaged over 10 trials)
Run #2: record_buffer 1044480
Query returns in 70.8 secs (averaged over 10 trials)
Run#3: -O key_buffer=16M -O table_cache=128 -O sort_buffer=4M -O
record_buffer=1M
Query returns in 72.7 secs (averaged over 10 trials)
-Ran test-ATIS benchmark script (note: these scripts do not use
indexing)
No test measurement varied by more than 5%.
Run 1:
ATIS table test
Creating tables
Time for create_table (28): 3 secs ( 0.00 usr 0.01 sys = 0.01 cpu)
Inserting data
Time to insert (9768): 32 secs ( 2.03 usr 2.16 sys = 4.19 cpu)
Retrieving data
Time for select_simple_join (500): 18 secs ( 2.34 usr 1.89 sys = 4.23
cpu)
Time for select_join (100): 21 secs ( 1.85 usr 1.58 sys = 3.43 cpu)
Time for select_distinct (800): 173 secs ( 7.96 usr 5.98 sys = 13.94
cpu)
Time for select_group (2400): 134 secs ( 5.26 usr 2.75 sys = 8.01 cpu)
Removing tables
Time to drop_table (28): 3 secs ( 0.00 usr 0.01 sys = 0.01 cpu)
Total time: 384 secs (19.45 usr 14.38 sys = 33.83 cpu)
Run 2:
ATIS table test
Creating tables
Time for create_table (28): 3 secs ( 0.01 usr 0.02 sys = 0.03 cpu)
Inserting data
Time to insert (9768): 31 secs ( 1.88 usr 2.59 sys = 4.47 cpu)
Retrieving data
Time for select_simple_join (500): 17 secs ( 2.47 usr 1.87 sys = 4.34
cpu)
Time for select_join (100): 22 secs ( 2.05 usr 1.44 sys = 3.49 cpu)
Time for select_distinct (800): 172 secs ( 8.47 usr 6.55 sys = 15.02
cpu)
Time for select_group (2400): 133 secs ( 5.40 usr 2.85 sys = 8.25 cpu)
Removing tables
Time to drop_table (28): 2 secs ( 0.01 usr 0.00 sys = 0.01 cpu)
Total time: 380 secs (20.31 usr 15.33 sys = 35.64 cpu)
Run 3:
ATIS table test
Creating tables
Time for create_table (28): 3 secs ( 0.00 usr 0.04 sys = 0.04 cpu)
Inserting data
Time to insert (9768): 30 secs ( 2.19 usr 2.65 sys = 4.84 cpu)
Retrieving data
Time for select_simple_join (500): 17 secs ( 2.30 usr 1.99 sys = 4.29
cpu)
Time for select_join (100): 23 secs ( 1.93 usr 1.71 sys = 3.64 cpu)
Time for select_distinct (800): 178 secs ( 8.04 usr 7.05 sys = 15.09
cpu)
Time for select_group (2400): 135 secs ( 4.75 usr 3.09 sys = 7.84 cpu)
Removing tables
Time to drop_table (28): 3 secs ( 0.00 usr 0.00 sys = 0.00 cpu)
Total time: 389 secs (19.24 usr 16.53 sys = 35.77 cpu)
-Changed timestamp from CHAR to DATETIME (thought: more efficient type
= better performance WRONG! though it did reduce table space usage by
11%)
Query returned in 332 secs. non-indexed vs. 162 secs. indexed.
Notes--------
-I tried the same test using a very popular commercial database (VPCDB)
with minimal tuning (I'm not a DBA just yet) using the same client,
host, perl, DBI, environment, etc. and received the same results (ie.
index query slower than non-indexed query for large table), though MySQL
was faster :-) Those stats:
MySQL: Non-indexed 58 vs. Indexed 114 secs
VPCDB: Non-Indexed 118 vs. Indexed 167 secs
-My application requires speed (it will be web-based) but not
transaction processing (though robust backup/recovery (hot?) is
important).
-Not sure if executable was complied statically / mit-threads / etc.
since compile log file missing.
Thanks for your advice/suggestions, Frank V.
Big thanks to TCX and Monty in particular for MySQL!
--
Frank Vetesi Email: [EMAIL PROTECTED]
Lockheed Martin
Advanced Technology Labs Voice: 609-338-3923
1 Federal Street - A&E 3W Fax: 609-338-4122
Camden, NJ 08102 MARCALL: 8*794-3923
-----------------------------------------------------------
Send a mail to [EMAIL PROTECTED] with
unsubscribe mysql [EMAIL PROTECTED]
in the body of the message to unsubscribe from this list.