On 2004.06.03, Dan Nelson <[EMAIL PROTECTED]> wrote:
> Consider multicolumn indexes; if you always "SELECT field1 from table
> where field2=blah", creating an index on (field2,field1) will let mysql
> bypass table lookups completely.
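If I'm reading that right, you mean something along these lines (the
table and column names here are just placeholders, not our real schema):

  -- hypothetical example; some_table, field1, field2 are placeholders
  CREATE INDEX ix_field2_field1 ON some_table (field2, field1);

  -- with that index in place, a query like this can be satisfied from
  -- the index alone, with no lookups into the table's data file:
  SELECT field1 FROM some_table WHERE field2 = 'blah';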
Our typical queries look something like:

  SELECT level1, level2, level3, SUM(count1), SUM(count2)
  FROM table
  WHERE level1 = 'value1'
    AND level2 = 'value2'
    AND level3 IS NULL
  GROUP BY level1, level2, level3

In the real data, we have many more than 3 levels.  Perhaps I should
have been including the count columns in the indexes as well -- hmm.

> > Also, joins on this table are miserable since the BLOB columns make
> > MySQL use tmp disk for sorting instead of keeping everything in
> > memory.
>
> Unless you're selecting those blob fields, I don't think mysql will
> keep them during the join operation.  I could be wrong though.

The problem is that our longest-running queries have this shape:

  INSERT INTO destination_table
  SELECT level1, level2, ... levelN, SUM(count1), SUM(count2) ...
  FROM source_merge_table
  -- no WHERE clause
  GROUP BY level1 ... etc.

source_merge_table is a MERGE table that unions 12 or more underlying
tables.  (Once we go to 4.1, we can eliminate the MERGE tables and use
a derived table in the FROM clause with UNION ALL -- a rough sketch of
that rewrite is at the bottom of this mail -- but for now, in 4.0, we
have to use MERGE tables.)

Yes, the BLOB columns are included in the SELECT clause.  This forces
the query to go straight to a tmpfile on disk.

> > I've got read_buffer_size and read_rnd_buffer_size cranked up to 64M,
> > which I'm hoping will make these full table scans do better -- I
> > could be wrong, though.  Storage is not local disk but on
> > HBA-attached SAN.
>
> What's your I/O throughput (run "iostat -xcnz 2") during one of these
> full table scans?  If you can get 100MB/sec, a scan of a 365MB file
> should take under 5 seconds.

I'll have to look at iostat next time the query runs.  Thanks for the
reminder, duh.

> > As I said, obvious steps to take are to rework the schema and
> > introduce composite/conjoint tables where possible and to move those
> > BLOB columns out of the main fact table ... but even then, there
> > should be "optimal" settings for a DB that generally does full table
> > scans on 2M rows ... a lot of the data can be kept in that 10 GB of
> > memory, if I could only force MySQL to use it: those BLOB columns are
> > probably killing me.
>
> With 10GB of RAM, Solaris should be caching your entire table in
> memory.  You will still have the overhead of mysql reading the data
> from the OS cache but you should still get good performance.  Make
> sure you're not mounting your filesystem with the forcedirectio
> option, which will disable the OS cache.

What seems to happen is that the data gets read from the MyISAM table,
gets written back out to disk in the tmpdir, and then, when it's all
done, gets placed in the .MYD file -- it's that back-and-forth to disk
that I presume is eating most of the time.  Not sure what can be done
about keeping it from going straight to tmpdir when a BLOB column is in
the SELECT clause, though.  Probably nothing, in 4.0.

-- Dossy

-- 
Dossy Shiobara                       mail: [EMAIL PROTECTED]
Panoptic Computer Network            web:  http://www.panoptic.com/
  "He realized the fastest way to change is to laugh at your own folly
   -- then you can let go and quickly move on." (p. 70)
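P.S.  Here's roughly what the 4.1 rewrite mentioned above would look
like.  Table and column names are simplified stand-ins for our real
schema, and the per-period table names are made up for illustration:

  INSERT INTO destination_table
  SELECT level1, level2, SUM(count1), SUM(count2)
  FROM (
      SELECT level1, level2, count1, count2 FROM source_2004_05
      UNION ALL
      SELECT level1, level2, count1, count2 FROM source_2004_06
      -- ... one SELECT per underlying table, replacing the MERGE table
  ) AS combined
  GROUP BY level1, level2;

Same GROUP BY/SUM rollup as today, just with the underlying tables
unioned inline in the FROM clause instead of through a MERGE table.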