I have to deal with pretty big data, such as a table (call it BIG_TABLE) with
16,000,000 rows.
The table has just one primary key index (a simple INT).
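To give an idea of the structure, it's roughly like this (only the INT primary
key is real; the other column is a placeholder I'm using for the examples
below):

    -- Rough sketch of BIG_TABLE; "some_value" is just an example column.
    CREATE TABLE BIG_TABLE (
        ID INT NOT NULL,
        some_value INT,
        PRIMARY KEY (ID)
    );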

I thought that splitting the data among multiple tables would boost
performance, so I put my data into 16 different tables (everything is on one
disk). The first table holds IDs from 1 to 1,500,000, the second the range
1,500,001-3,000,000, and so on.
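The split was done with statements roughly like this one (table name and
column list are just for illustration):

    -- One of the 16 chunk tables; real names may differ.
    CREATE TABLE BIG_TABLE_01 (PRIMARY KEY (ID))
        SELECT * FROM BIG_TABLE WHERE ID BETWEEN 1 AND 1500000;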
When I ran performance tests, I found that I got no performance gain at all.

So I tried (just for testing purposes) to create a table of 100,000 rows (call
it SMALL_TABLE) with a slice of the data from BIG_TABLE
(IDs from 1,000,000 to 1,100,000).
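SMALL_TABLE was populated with something like this (a sketch, assuming the
same schema as BIG_TABLE):

    -- Copy the 1,000,000-1,100,000 slice into its own table.
    CREATE TABLE SMALL_TABLE (PRIMARY KEY (ID))
        SELECT * FROM BIG_TABLE WHERE ID BETWEEN 1000000 AND 1100000;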
The tests looked like this (pseudocode):

for 100,000 times
        read a column from BIG_TABLE where ID = <random number between 1,000,000 and 1,100,000>

for 100,000 times
        read a column from SMALL_TABLE where ID = <random number between 1,000,000 and 1,100,000>
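Each iteration boils down to a single point lookup like these (the column name
is a placeholder and the ID is whatever random value was picked):

    SELECT some_value FROM BIG_TABLE   WHERE ID = 1037421;
    SELECT some_value FROM SMALL_TABLE WHERE ID = 1037421;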


The results are the same!
There is no noticeable difference between a table with 16M rows and a table
with 100K rows!
Is that because my IDs are plain INTs from 1 to n (without gaps)?

I've also tried doing 100,000 inserts into both tables: BIG_TABLE takes 45
seconds while SMALL_TABLE takes 43 seconds (not much of a difference for a
table 160 times smaller!).
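By inserts I mean plain single-row statements, something like this (a sketch;
the actual values don't matter for the timing):

    INSERT INTO BIG_TABLE   (ID, some_value) VALUES (16000001, 42);
    INSERT INTO SMALL_TABLE (ID, some_value) VALUES (1100001, 42);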

The only big performance gain I got was from a SUM over a big range, because
in that case MySQL does a full scan on SMALL_TABLE, which is faster than index
access on BIG_TABLE.
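For example, a range aggregate like this (placeholder column name again) is
much faster on the small table:

    SELECT SUM(some_value) FROM SMALL_TABLE WHERE ID BETWEEN 1000000 AND 1100000;
    SELECT SUM(some_value) FROM BIG_TABLE   WHERE ID BETWEEN 1000000 AND 1100000;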

Am I doing something wrong? Isn't splitting supposed to improve performance?
Or is that true only if you can split the data across multiple disks?


Thank you

Leonardo


