"Martijn Tonies" <[EMAIL PROTECTED]> wrote on 01/19/2005 03:33:32 AM:
> Hello Jeremy,
>
> > > I have a curious issue here, maybe someone can help.
> > >
> > > I have a single process that inserts data into tables that contain
> > > purely logging information. This table is then searched by our Care
> > > department to troubleshoot issues. I am looking for the best way to
> > > store this data, and the structure on the backend.
> > >
> > > There are 50 million inserts into table LOG a day. The primary index
> > > on the table is seconds from 1971. I only need to keep 60 days' worth
> > > of data, and the table is only used for read purposes. These are my
> > > design criteria, but my problem is how to delete old data without
> > > crashing the log writer that is attached to the table.
> >
> > OK, how about this:
> >
> > Use MyISAM and MERGE tables. Keep one table per day. E.g.:
> >
> > log_2005_01_15
> > log_2005_01_16
> > log_2005_01_17
> > log_2005_01_18
> >
> > etc.
> >
> > Use MERGE tables to give you the 60 day (and perhaps e.g. 14 day, 30
> > day, 7 day, etc.) read views that you need, like so:
> >
> > CREATE TABLE log_view_7day (
> >   ...
> > ) TYPE=MERGE UNION=(
> >   log_2005_01_12,
> >
> > You can then do all of your inserts from the log writer into the "today"
> > table, and do your reads against the various MERGEs.
> >
> > Every day at exactly midnight, you would use ALTER TABLE (which is
> > atomic) to redefine the UNION of the MERGE definition of the various
> > tables.
>
> Modifying metadata because you need a different view of your data.
>
> Am I the only one to whom this sounds ugly?
>
> With regards,
>
> Martijn Tonies
> Database Workbench - developer tool for InterBase, Firebird, MySQL & MS SQL
> Server
> Upscene Productions
> http://www.upscene.com

It sounds just fine to me. IMHO this is an excellent use of MySQL features
to solve a technical issue. The original poster wants to maintain only 60
days' worth of data, and he also needs really fast inserts.
MyISAM can perform concurrent, non-blocking inserts so long as there are no
deletion gaps in the table to which it is inserting. One solution is to
create what is in effect a "view" that spans 60 days' worth of data, with
each day's data in its own table. However, since views won't be officially
available until later, a MERGE table is an effective substitute.

This also gives him many granularity options when performing queries. He
can query a daily table directly if he knows that the query does not span
days. He can use multiple MERGE tables to aggregate various ranges of daily
data. He can UNION several single-day queries together to cover a span or
list of dates. All of these have various advantages, the biggest of which
is that when it comes time to move the out-of-date data off-line, he won't
have to lock the current day's data to do it.

One alternative would be a VIEW built against a single, larger table, with
a query specifying a date range of 60 days. Something like:

CREATE VIEW umptyfratz AS
SELECT * FROM datatable
WHERE logdate > (NOW() - INTERVAL 60 DAY);

However, this introduces several performance hits:

1) The WHERE clause has to be recomputed every time the VIEW is called so
that you return only the data within the specified date range. This may not
scale well to larger tables.

2) If we delete rows from datatable whose logdate values are older than 60
days, we create deletion gaps in the data. This prevents MyISAM from simply
appending data to the end of the table (non-blocking inserts) and forces a
table lock for each insert so that the deletion gaps can be filled in
first. This limits concurrency severely.

However, there is a dark side to the MERGE table solution, too. As is
explained in the manual
(http://dev.mysql.com/doc/mysql/en/MERGE_storage_engine.html), the MERGE
table AND each table participating in the merge will consume an operating
system file handle each time the MERGE table is opened.
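To make the daily scheme concrete, the midnight swap Jeremy describes and a
cross-day UNION query might look something like the sketch below. The table
and column names are made up for illustration, and TYPE= is the syntax of
the era:

```sql
-- Sketch only: each daily table is a plain MyISAM table, and all of them
-- must share an identical definition for the MERGE to work.

-- At midnight, redefine the 7-day MERGE: drop the oldest day, add the
-- new one. Redefining the UNION list only touches metadata, so it is
-- fast and atomic.
ALTER TABLE log_view_7day UNION=(
  log_2005_01_13, log_2005_01_14, log_2005_01_15, log_2005_01_16,
  log_2005_01_17, log_2005_01_18, log_2005_01_19
);

-- Covering an ad hoc span by UNIONing single-day queries directly,
-- without any MERGE table at all:
SELECT logdate, message FROM log_2005_01_17 WHERE message LIKE '%timeout%'
UNION ALL
SELECT logdate, message FROM log_2005_01_18 WHERE message LIKE '%timeout%';
```

The purge of out-of-date data then becomes a DROP TABLE of the oldest daily
table once it has left every MERGE definition, which never touches the
table the log writer is inserting into.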
For a MERGE table covering 60 daily tables, that means that the OS has to
issue 61 new file handles PER USER ACCESSING THE TABLE. Depending on the
limits of your OS, you may not be able to open very many files.

I would recommend using weekly tables instead, so that a MERGE table
covering 60 days' worth of data would only need 10 file handles from the OS
(9 weeks' worth of data + 1 for the MERGE itself).

Of course, before putting such a solution into production, it is imperative
that any DBA test, test, and retest to make sure they are getting optimal
performance and resource usage.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine
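P.S. A weekly variant might be declared along these lines. This is only a
sketch: it assumes the log rows carry a seconds-since-the-epoch key as the
original poster describes, and the table and column names are invented for
illustration:

```sql
-- Nine weekly MyISAM tables cover a rolling 60-day window; each must
-- have a definition identical to the MERGE table's.
CREATE TABLE log_view_60day (
  logdate INT UNSIGNED NOT NULL,   -- seconds since the epoch
  message VARCHAR(255) NOT NULL,
  KEY (logdate)
) TYPE=MERGE UNION=(
  log_2005_w01, log_2005_w02, log_2005_w03,
  log_2005_w04, log_2005_w05, log_2005_w06,
  log_2005_w07, log_2005_w08, log_2005_w09
);
```

That is only 10 file handles per user opening the MERGE, and the weekly
rollover is still a single atomic ALTER TABLE redefining the UNION list.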