Hi Folks -- I've run into a couple of performance issues, and I'm looking for some optimization tips :)
I'm currently using MySQL 4.1.5-gamma, built from the BitKeeper tree a month or so ago. I have a table that is roughly 2500 columns by 91 million rows (I get 4 of these a month from the data we're receiving). The tables are MyISAM, and look roughly like:

    create table raddata_2004_10 (
        granID  integer not null,
        scanID  tinyint unsigned not null,
        fpID    tinyint not null,
        c1      float not null,
        c2      float not null,
        c3      float not null,
        [BIG SNIP]
        c2499   float not null,
        c2500   float not null,
        primary key (granID, scanID, fpID)
    )

The first issue I'm having is indexing performance. Data arrives in granules, each of which adds just over 12,000 rows to this table. With the keys disabled, an insert takes about 50 seconds; with keys enabled, it takes 120-125 seconds. I can escape some of the pain by not "promoting" a table for the users (adding it to the larger merge tables that the users issue queries against) until the entire month's data has been added, and only then enabling the keys. On a table with a full month's data, however, ALTER TABLE raddata_2004_10 ENABLE KEYS takes about 22 hours to complete. Fortunately, it's a write-once setup for the most part, and most of the database engine's work is selecting/subsetting the data once it's in. So, is there a faster way to insert/index the data? Would a different table or index type improve performance?

The second issue is query performance. It seems that regardless of which fields are selected, MySQL reads the entire row. Since a monthly table averages 840GB, this takes a while, even on a well-organized query like:

    select c1 from raddata_2004_10 where granID between 147568 and 150000

(granID tracks the granules from the start of the mission, and is a calculated value based on the date/time of the data in the file, incrementing over time. The ~2500 granules in the example above represent about 1/3 of a month -- 10 days or so.)
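For the insert path, here is a sketch of the bulk-load sequence described above, with the session buffers that govern MyISAM bulk inserts and index rebuilds raised. The sizes are illustrative assumptions, not tested values -- tune them to the RAM on your box:

```sql
-- Buffers used by MyISAM for bulk inserts and for rebuilding
-- indexes by sorting (sizes are illustrative assumptions):
SET SESSION bulk_insert_buffer_size = 256*1024*1024;
SET SESSION myisam_sort_buffer_size = 512*1024*1024;

ALTER TABLE raddata_2004_10 DISABLE KEYS;

-- ... load the month's granules here ...

ALTER TABLE raddata_2004_10 ENABLE KEYS;
```

One thing worth checking during the 22-hour rebuild: if SHOW PROCESSLIST shows the ENABLE KEYS thread in the "Repair with keycache" state rather than "Repair by sorting", the sort buffer or myisam_max_sort_file_size is too small for an index this size, and the rebuild has fallen back to the much slower key-cache path.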
Even when the query uses the index (which it does for the above query, according to EXPLAIN), the current behavior makes it read 29,548,800 complete rows -- 275.4 GB of data -- even though the desired return is about 1/2500th of that (112.7 MB). All of these tables use fixed-format rows. Is there a configuration or code tweak that would let MySQL read just the requested data pieces, or even a smaller subset of each row? With a fixed row format, each field's offset should be a relatively simple calculation within a row.

Any/all suggestions, comments, even flames are welcome :)

Thanks in advance!
ken

===========================================================================
  "Diplomacy is the weapon of the Civilized Warrior"
                                                      - Hun, A.T.

  Ken Gieselman                                       [EMAIL PROTECTED]
  System Administrator              http://www.endlessknot.com/~kgieselm
  Endlessknot Communications                  http://www.endlessknot.com
===========================================================================

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
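P.S. For anyone checking the scan-volume figures above, they follow directly from the fixed row size -- a back-of-envelope check assuming 4-byte FLOATs plus the three key columns (MyISAM also adds a small per-row status overhead, ignored here), with GB/MB taken as binary units:

```sql
SELECT 2500*4 + 4 + 1 + 1                            AS row_bytes,   -- 10006
       29548800 * (2500*4 + 4 + 1 + 1) / POW(1024,3) AS gb_scanned,  -- ~275.4
       29548800 * 4 / POW(1024,2)                    AS mb_wanted;   -- ~112.7
```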