Ah-ha, excuse my earlier response; I was under the impression you were trying to track schema changes, etc.
On Fri, May 31, 2013 at 7:54 PM, Rick James <rja...@yahoo-inc.com> wrote:

> UUID PRIMARY KEY (or even secondary index) --
> Once the table gets big enough (bigger than the RAM cache), each row INSERTed
> (or SELECTed) will be a disk hit. (Rule of Thumb: only 100 hits/sec.)
> This is because _random_ keys (like UUIDs) make caching useless. Actually,
> the slowdown will be gradual. For example, once the table is 5 times as
> big as the cache, 80% (1 - 1/5) of the INSERTs/SELECTs will hit disk.
> Bottom line -- avoid UUIDs in huge tables, if at all possible.
> (Exception: the bits in type-1 UUIDs can be rearranged to be roughly
> chronological.)
>
> BIGINT -- you cannot possibly hit its max with any existing hardware.
>
> MyISAM -- the PRIMARY KEY is just another secondary index. Secondary indexes
> are separate BTrees.
>
> InnoDB -- the PRIMARY KEY and the data coexist in the same BTree. Secondary
> indexes are separate BTrees.
>
> So, assuming this "audit" table will be huge (too big to be cached), you
> need to carefully consider every index, both for writing and for reading.
>
> You mentioned that you might audit 50 tables? An index that starts with
> table_name would be inserting/selecting in 50 spots. If the second part of
> the index is something "chronological", such as an AUTO_INCREMENT or
> TIMESTAMP, then there would be 50 "hot spots" in the index. This is quite
> efficient. INDEX(table_name, UUID) would be bad because of the randomness.
>
> InnoDB may be the preferred engine, even though the footprint is bigger.
> This is because careful design of the PK could lead to INSERTs into hot
> spot(s), plus SELECTs being able to take advantage of locality of
> reference. With PRIMARY KEY(table_name, ...) and SELECT ... WHERE
> table_name='...', InnoDB will find all the rows together (fewer disk hits);
> MyISAM will find the data scattered (more disk hits, hence slower).
>
> Another aspect... would your SELECTs say "WHERE ... AND timestamp BETWEEN ..."?
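Rick's 80% figure follows from a simple ratio: with uniformly random keys, the chance that a needed block is already cached is roughly cache_size/table_size. A minimal sketch of that arithmetic (the function name and the uniform-access assumption are mine, not from the thread):

```python
def expected_miss_rate(table_size, cache_size):
    """Approximate fraction of random-key INSERTs/SELECTs that miss the
    cache, assuming keys (e.g. UUIDs) touch the table uniformly at random
    and the cache holds a cache_size-worth slice of the table."""
    if table_size <= cache_size:
        return 0.0  # everything fits in RAM: no misses
    return 1.0 - cache_size / table_size

# Rick's example: a table 5x the cache size -> 1 - 1/5 = 80% disk hits.
print(expected_miss_rate(5, 1))  # 0.8
```

The same formula explains why the slowdown is gradual rather than a cliff: the miss rate climbs smoothly as the table outgrows the cache.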
> And would you _usually_ query _recent_ times? If so, there could be a boost
> from doing both of these:
> ** PARTITION BY RANGE(TO_DAYS(timestamp))
> ** Move timestamp to the _end_ of any indexes it appears in.
>
> I would be happy to discuss these principles further. To be able to discuss
> more specifically, please provide:
> ** your tentative SHOW CREATE TABLE,
> ** how big you plan for the table to become (#rows or GB),
> ** how much RAM you have.
>
> > -----Original Message-----
> > From: Carsten Pedersen [mailto:cars...@bitbybit.dk]
> > Sent: Friday, May 31, 2013 4:05 AM
> > Cc: [MySQL]
> > Subject: Re: Audit Table storage for Primary Key(s)
> >
> > Based on the little information available, I would make a lookup field
> > consisting of the table name and primary keys.
> >
> > (Although I still believe that storing this information in the database
> > in the first place is probably the wrong approach -- but to each his own.)
> >
> > / Carsten
> >
> > On 31-05-2013 12:58, Neil Tompkins wrote:
> > > The kind of lookups will be trying to diagnose when, and by whom, an
> > > update was applied. So the primary key of the audit row is important.
> > > My question is: for performance, should the primary key be stored as an
> > > indexed field like I mentioned before, or should I have an actual
> > > individual field per primary key?
> > >
> > > On Fri, May 31, 2013 at 11:03 AM, Carsten Pedersen
> > > <cars...@bitbybit.dk> wrote:
> > > > Again: unless you can give some idea as to the kind of lookups you
> > > > will be performing (which fields? temporal values? etc.), it is
> > > > impossible to give advice on the table structure. I wouldn't blame
> > > > anyone for not being able to do so; saving data for debugging will
> > > > always be a moving target, and almost by definition you don't know
> > > > today what you'll be looking for tomorrow.
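For concreteness, Rick's suggestions quoted above (InnoDB, a PK leading with the audited table's name followed by something chronological, and day-based range partitioning) could be combined roughly as follows. This is only a sketch under assumptions of mine: the thread never shows the real schema, so every column and partition name here is hypothetical. Note that MySQL's partitioning rules require the TO_DAYS() argument to be a DATE/DATETIME column (not TIMESTAMP), every unique key must include the partitioning column, and the AUTO_INCREMENT column needs an index it leads.

```sql
-- Hypothetical audit table; all names are illustrative.
CREATE TABLE audit_log (
    audited_table VARCHAR(64)     NOT NULL,
    changed_at    DATETIME        NOT NULL,
    seq           BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    pk_values     VARCHAR(255)    NOT NULL,  -- e.g. '2|2013-05-29'
    changed_by    VARCHAR(64)     NOT NULL,
    change_detail TEXT,
    PRIMARY KEY (audited_table, changed_at, seq),  -- one insert hot spot per audited table
    KEY (seq)                                      -- required for the AUTO_INCREMENT column
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(changed_at)) (
    PARTITION p2013_05 VALUES LESS THAN (TO_DAYS('2013-06-01')),
    PARTITION p2013_06 VALUES LESS THAN (TO_DAYS('2013-07-01')),
    PARTITION pmax     VALUES LESS THAN MAXVALUE
);
```

With this layout, a query such as `SELECT ... WHERE audited_table = 'orders' AND changed_at BETWEEN ... AND ...` can prune to the matching partitions and then read one contiguous range of the clustered index, which is the locality-of-reference benefit Rick describes.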
> > > > That's why I think that using CSV tables, the contents of which can
> > > > subsequently be analyzed using any of a number of text-file
> > > > processing tools, may indeed be your best initial option.
> > > >
> > > > On UUIDs vs. INTs: (1) Please do yourself a favor and read up on how
> > > > UUIDs are generated. If it's the same server that generates all the
> > > > UUIDs, you won't get a lot of uniqueness for the amount of space
> > > > you'll be using for your data and index. (2) Please do the math on
> > > > just how many inserts you can do per second over the next 1,000
> > > > years if you use a longint auto-increment field for your PK.
> > > >
> > > > / Carsten
> > > >
> > > > On 31-05-2013 11:14, Neil Tompkins wrote:
> > > > > Thanks for your response. We expect to use the audit log when
> > > > > looking into exceptions and/or any need to debug table updates. I
> > > > > don't think a CSV table would be sufficient, as we want to use an
> > > > > interface to query this data at least on a daily basis, if not
> > > > > weekly.
> > > > >
> > > > > I use a UUID because we currently have 54 tables, of which probably
> > > > > 30 will be audited. So an INT PK wouldn't work because of the
> > > > > number of updates we are applying.
> > > > >
> > > > > On Fri, May 31, 2013 at 9:58 AM, Carsten Pedersen
> > > > > <cars...@bitbybit.dk> wrote:
> > > > > > On 30-05-2013 09:27, Neil Tompkins wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > I've created an audit table which tracks any changed fields for
> > > > > > > multiple tables. In my audit table I'm using a UUID for the
> > > > > > > primary key. However, I need to have a reference back to the
> > > > > > > primary key(s) of the table audited.
> > > > > > > At the moment I have a VARCHAR field which stores primary keys
> > > > > > > like:
> > > > > > >
> > > > > > > 1
> > > > > > > 1|2013-05-29
> > > > > > > 2|2013-05-29
> > > > > > > 2
> > > > > > > 3
> > > > > > > 1|2|2
> > > > > > > etc.
> > > > > > >
> > > > > > > Is this the best approach, or should I have an individual field
> > > > > > > in the audit table for each primary key? At the moment I think
> > > > > > > the max number of primary keys on any given table is 3.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Neil
> > > > > >
> > > > > > First you need to ask yourself how you expect to use the table in
> > > > > > the future. Will you be looking up the data on a regular basis?
> > > > > > Or will lookups only be something you do in exceptional
> > > > > > situations?
> > > > > >
> > > > > > What is the intended goal of having a UUID for the primary key
> > > > > > rather than, say, an integer -- or having no PK at all?
> > > > > >
> > > > > > My immediate thought when reading this was "why even store that
> > > > > > data in a table?" -- if it's a simple log, use a log file.
> > > > > > Especially if you don't know how you intend to search for data
> > > > > > later on. There are many tools that are far superior to SQL when
> > > > > > it comes to searching for text strings. You could even consider
> > > > > > having a CSV table, which will give you an SQL interface to said
> > > > > > text file.
> > > > > >
> > > > > > / Carsten
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql
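On the original question in this thread (one concatenated VARCHAR like '1|2013-05-29' versus one field per key part): since no audited table is said to have more than three primary-key columns, a separate-columns layout is possible. The sketch below is mine, not from the thread, and all names in it are hypothetical:

```sql
-- Hypothetical reference structure: one column per possible PK part
-- (at most 3, per the thread), instead of a single 'a|b|c' string.
CREATE TABLE audit_row_ref (
    audit_id      BINARY(16)  NOT NULL,   -- the audit row's UUID
    audited_table VARCHAR(64) NOT NULL,
    pk_part1      VARCHAR(64) NOT NULL,
    pk_part2      VARCHAR(64) NULL,       -- NULL when the source PK has fewer columns
    pk_part3      VARCHAR(64) NULL,
    PRIMARY KEY (audit_id),
    KEY (audited_table, pk_part1, pk_part2, pk_part3)
) ENGINE=InnoDB;

-- A lookup can then use the composite index instead of string parsing:
-- SELECT audit_id FROM audit_row_ref
--  WHERE audited_table = 'orders'
--    AND pk_part1 = '2' AND pk_part2 = '2013-05-29';
```

Note that, per Rick's warning earlier in the thread, a random BINARY(16) primary key still scatters inserts across the clustered index; this sketch only addresses how the key parts are stored, while the secondary key gives table-then-key locality for reads.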