Re: How to avoid redundancy between PK and indices ?
I understand what you're saying. The problem is that if mysql attempted to do a query like you suggest: Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And f=1; It can only use one index for the query, and hopefully, the optimizer will pick one of the six indexes with the fewest keys to scan. But even if it could virtualize the PK that way, it could still cause a scan of millions of records while limiting the table scan to one of the six non-unique keys. In other words, it would/could take alot of time to see if a record is unique upon inserting new records - not something you'd be happy with performance wise, I'm sure. Therefore, a true, concatenated key that enforces uniqueness and can operate immediately upon inserts is really necessary, regardless of what other indexes are on the columns. The type of query you're suggesting can be done with bitmapped indexes (Oracle has them), where the indexes values are stored as bitmaps, and you can combine them so Oracle uses multiple indexes in one query to quickly pair down the records to scan. Bitmapped indexes work very well with the cardinality of keys is less than 10,000 (number of unique key values). In a nutshell, think of a field for sex/gender and a table of 1 million records. A bitmapped index of that field would only be 125,000 bytes long (1 million bits) (one bit=one record), and to find all the M records, just map the on bits to the record number in the datatable. For fields with larger possible values (say, state of residence - 50 values), each location would be represented by 6 bits. Pretty simple concept, but great performance gains can be had over regular btree indexes. I think this is what you're getting at. When I asked the MySQL AB folks at the first conference in Orlando a couple of years ago about adding bitmapped index support in MySQL, they didn't really know what I was talking about. The developer I spoke to thought I was suggesting creating indexes on bitmapped images. No, not exactly. I hope they know what it is now, though, and have (or already have) considered adding support for it in MySQL. -Hank On 10/5/05, C.R. Vegelin [EMAIL PROTECTED] wrote: Hi Hank, You are quite right. I need separate non-unique indices on a, b, c, d, e and f to avoid table scans. And when each combi must be unique I need a Primary Key (a,b,c,d,e,f). And only Key a (a) seems to be redundant with the primary key ... Suppose there would be a PK (a,b,c,d,e,f) defined, without a separate PK index. And let's assume some rows like: columns:a b c d e f row1 has: 1 1 1 1 1 1 row2 has: 1 1 1 1 1 2 row3 has: 1 1 1 1 1 3 etc. Then checking on unique PK could be done by MySQL internally with: Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And f=1; to avoid a duplicate primary key for row1, by using / joining the separate index tables. With this Select query, MySQL could / should make use of the 6 existing separate indices. Uniqueness can be fully guaranteed with these 6 non-unique indices in this case. In other words, a separate PK index is fully redundant in this case, right ? In addition, it would save space without the longer concatenate key of a+b+c+d+e+f. Thanks, Cor - Original Message - From: Hank [EMAIL PROTECTED] To: C.R. Vegelin [EMAIL PROTECTED] Cc: mysql@lists.mysql.com Sent: Wednesday, October 05, 2005 5:57 PM Subject: Re: How to avoid redundancy between PK and indices ? It depends.. if this is your create table statement: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f) ); Then only one unique index is being created on the concatenate key of a+b+c+d+e+f. Queries on any fields other than A will cause a full table scan. On the other hand, if your create table is: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY a (a), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); This will create the primary key, plus six additional indexes, each of which is queryable. But in this case, the KEY a (a) non-unique index is redundent with the primary key, so to do what you want - a unique index on a+b+c+d+e+f PLUS the ability to independtly search the b c d e and f fields, here is the create table you'll need to use: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); -- -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL
Re: How to avoid redundancy between PK and indices ?
It depends.. if this is your create table statement: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f) ); Then only one unique index is being created on the concatenate key of a+b+c+d+e+f. Queries on any fields other than A will cause a full table scan. On the other hand, if your create table is: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY a (a), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); This will create the primary key, plus six additional indexes, each of which is queryable. But in this case, the KEY a (a) non-unique index is redundent with the primary key, so to do what you want - a unique index on a+b+c+d+e+f PLUS the ability to independtly search the b c d e and f fields, here is the create table you'll need to use: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); -- -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: How to avoid redundancy between PK and indices ?
Hi Hank, You are quite right. I need separate non-unique indices on a, b, c, d, e and f to avoid table scans. And when each combi must be unique I need a Primary Key (a,b,c,d,e,f). And only Key a (a) seems to be redundant with the primary key ... Suppose there would be a PK (a,b,c,d,e,f) defined, without a separate PK index. And let's assume some rows like: columns:a b c d e f row1 has: 1 1 1 1 1 1 row2 has: 1 1 1 1 1 2 row3 has: 1 1 1 1 1 3 etc. Then checking on unique PK could be done by MySQL internally with: Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And f=1; to avoid a duplicate primary key for row1, by using / joining the separate index tables. With this Select query, MySQL could / should make use of the 6 existing separate indices. Uniqueness can be fully guaranteed with these 6 non-unique indices in this case. In other words, a separate PK index is fully redundant in this case, right ? In addition, it would save space without the longer concatenate key of a+b+c+d+e+f. Thanks, Cor - Original Message - From: Hank [EMAIL PROTECTED] To: C.R. Vegelin [EMAIL PROTECTED] Cc: mysql@lists.mysql.com Sent: Wednesday, October 05, 2005 5:57 PM Subject: Re: How to avoid redundancy between PK and indices ? It depends.. if this is your create table statement: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f) ); Then only one unique index is being created on the concatenate key of a+b+c+d+e+f. Queries on any fields other than A will cause a full table scan. On the other hand, if your create table is: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY a (a), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); This will create the primary key, plus six additional indexes, each of which is queryable. But in this case, the KEY a (a) non-unique index is redundent with the primary key, so to do what you want - a unique index on a+b+c+d+e+f PLUS the ability to independtly search the b c d e and f fields, here is the create table you'll need to use: CREATE TABLE foo ( a smallint NOT NULL, b smallint NOT NULL, c smallint NOT NULL, d smallint NOT NULL, e smallint NOT NULL, f smallint NOT NULL, PRIMARY KEY (a,b,c,d,e,f), KEY b (b), KEY c (c), KEY d (d), KEY e (e), KEY f (f) ); -- -Hank -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
How to avoid redundancy between PK and indices ?
Hi List, Is anyone familiar with optimizing indices, including primary key ? I do have a large myisam table with 6 non-unique key fields, lets say named A, B, C, D, E and F. Each of these columns may have NOT NULL values from 0 to 999, and are defined as SmallInt. Requirement: each row must have a unique combination of these 6 key fields (all Btree) ! To force uniqueness on this table, I can define a PRIMARY KEY (A, B, C, D, E, F); But I suppose that MySQL makes a separate (physical) index for the primary key, besides the 6 member indices. And apart from uniqueness, this primary key does not have any added value for programming purposes. Right ? My question: does MySQL alllow some kind of virtual primary key, where uniqueness is enforced by MySQL by checking its member indices ? At this moment my table has more than 13 million rows (about 1100 MB Data_Length). And the Index_Length is about 500 MB, for the 6 indices and the primary key, consisting of these 6 indices. In this case a virtual primary key could save maybe 200 MB in stead of a real primary key index and could speed up the updating processes. I like to hear from you. Thanks, Cor
Re: How to avoid redundancy between PK and indices ?
C.R. Vegelin [EMAIL PROTECTED] wrote on 04/10/2005 12:52:01: Hi List, Is anyone familiar with optimizing indices, including primary key ? I do have a large myisam table with 6 non-unique key fields, lets say named A, B, C, D, E and F. Each of these columns may have NOT NULL values from 0 to 999, and are defined as SmallInt. Requirement: each row must have a unique combination of these 6 key fields (all Btree) ! To force uniqueness on this table, I can define a PRIMARY KEY (A, B, C, D, E, F); But I suppose that MySQL makes a separate (physical) index for the primary key, besides the 6 member indices. I do not think this is true. If you specify an index, be it primary or not, there is only one index. As you describe it, none of the separate columns is a candidate for a primary key, visible or otherwise, because none of them is of itself unique. And apart from uniqueness, this primary key does not have any added value for programming purposes. Right ? I believe that if you have InnoDB tables, searching by the primary key is likely to be significantly faster than searching by secondary keys. My question: does MySQL allow some kind of virtual primary key, where uniqueness is enforced by MySQL by checking its member indices ? In MyISAM tables, the Primary key, or UNIQUE keys have no other function than this. In structure, a primary key is no different to any other key. At this moment my table has more than 13 million rows (about 1100 MB Data_Length). And the Index_Length is about 500 MB, for the 6 indices and the primary key, consisting of these 6 indices. In this case a virtual primary key could save maybe 200 MB in stead of a real primary key index and could speed up the updating processes. I like to hear from you. I think you are wrong in your presumption that there are individual indexes. Consider a telephone directory: this may be regarded as indexed on FamilyName, GivenName. There is only one index even though it is over two fields (the order in the telephone directory). You would only need another index if you wanted to search over GivenName,FamilyName. This would then require an extra index, which would have to be put in the back. Alec Cawley -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: How to avoid redundancy between PK and indices ?
Hi Alec, Thanks for your comment. Well, we disagree on a few points. Suppose I have a table with columns CountryID, CompanyID, SectorID and ProductID. And let's say that all these columns are NOT NULL, but indexed as non-unique. I need to select on specific countries, specific products etc. So I need 4 separate indices, where CountryId may occur more than once in the CountryId index, CompanyID may occur more than once in the CompanyID index etc. But if these 4 columns together are defined as Primary Key, then each combi of CountryID, CompanyID, SectorID and ProductID is unique. In my point of view this can only be realized with a separate PK index, leading to redundancy in the indices. My theory is backed by what I read in the manual ... if I read it right ... If you use ALTER TABLE on a MyISAM table, all non-unique indexes are created in a separate batch. ALTER TABLE ... DISABLE KEYS tells MySQL to stop updating non-unique indexes for a MyISAM table. ALTER TABLE ... ENABLE KEYS then should be used to re-create missing indexes. These 2 features can only be realized if MySQL uses separate indices for non-uniques and for PK's. Don't you think ? Regards, Cor From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject: Re: How to avoid redundancy between PK and indices ? C.R. Vegelin [EMAIL PROTECTED] wrote on 04/10/2005 12:52:01: Hi List, Is anyone familiar with optimizing indices, including primary key ? I do have a large myisam table with 6 non-unique key fields, lets say named A, B, C, D, E and F. Each of these columns may have NOT NULL values from 0 to 999, and are defined as SmallInt. Requirement: each row must have a unique combination of these 6 key fields (all Btree) ! To force uniqueness on this table, I can define a PRIMARY KEY (A, B, C, D, E, F); But I suppose that MySQL makes a separate (physical) index for the primary key, besides the 6 member indices. I do not think this is true. If you specify an index, be it primary or not, there is only one index. As you describe it, none of the separate columns is a candidate for a primary key, visible or otherwise, because none of them is of itself unique. And apart from uniqueness, this primary key does not have any added value for programming purposes. Right ? I believe that if you have InnoDB tables, searching by the primary key is likely to be significantly faster than searching by secondary keys. My question: does MySQL allow some kind of virtual primary key, where uniqueness is enforced by MySQL by checking its member indices ? In MyISAM tables, the Primary key, or UNIQUE keys have no other function than this. In structure, a primary key is no different to any other key. At this moment my table has more than 13 million rows (about 1100 MB Data_Length). And the Index_Length is about 500 MB, for the 6 indices and the primary key, consisting of these 6 indices. In this case a virtual primary key could save maybe 200 MB in stead of a real primary key index and could speed up the updating processes. I like to hear from you. I think you are wrong in your presumption that there are individual indexes. Consider a telephone directory: this may be regarded as indexed on FamilyName, GivenName. There is only one index even though it is over two fields (the order in the telephone directory). You would only need another index if you wanted to search over GivenName,FamilyName. This would then require an extra index, which would have to be put in the back. Alec Cawley -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
Re: How to avoid redundancy between PK and indices ?
I agree that if you want to do the searches you describe, you will need extra keys. But MySQL does *not* create these keys automatically - it creates the keys you ask for *and no more*. You have asked, correctly, for a primary key on ABCDEF. MySQL will create exactly that key and no others. This will allow you to search on A, AB, ABC etc. If you want a separate search on B, or on D, or on F, *you* must request individual indexes on these columns. MySQL does not implicitly create hidden indexes for you; there is exactly one index for each PRIMARY KEY/UNIQUE/INDEX. This, of course, produces redundancy; but only the redundancy that you request. Without these redundant indexes, a search on B will be forced to do a full table scan. And you *need* those indexes to do the searches you want. Without them, all searches become full table scans. And without a PRIMARY KEY index, every insert would have to include a full table scan. Alec C.R. Vegelin [EMAIL PROTECTED] 04/10/2005 15:10 To mysql@lists.mysql.com cc Subject Re: How to avoid redundancy between PK and indices ? Hi Alec, Thanks for your comment. Well, we disagree on a few points. Suppose I have a table with columns CountryID, CompanyID, SectorID and ProductID. And let's say that all these columns are NOT NULL, but indexed as non-unique. I need to select on specific countries, specific products etc. So I need 4 separate indices, where CountryId may occur more than once in the CountryId index, CompanyID may occur more than once in the CompanyID index etc. But if these 4 columns together are defined as Primary Key, then each combi of CountryID, CompanyID, SectorID and ProductID is unique. In my point of view this can only be realized with a separate PK index, leading to redundancy in the indices. My theory is backed by what I read in the manual ... if I read it right ... If you use ALTER TABLE on a MyISAM table, all non-unique indexes are created in a separate batch. ALTER TABLE ... DISABLE KEYS tells MySQL to stop updating non-unique indexes for a MyISAM table. ALTER TABLE ... ENABLE KEYS then should be used to re-create missing indexes. These 2 features can only be realized if MySQL uses separate indices for non-uniques and for PK's. Don't you think ? Regards, Cor From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject: Re: How to avoid redundancy between PK and indices ? C.R. Vegelin [EMAIL PROTECTED] wrote on 04/10/2005 12:52:01: Hi List, Is anyone familiar with optimizing indices, including primary key ? I do have a large myisam table with 6 non-unique key fields, lets say named A, B, C, D, E and F. Each of these columns may have NOT NULL values from 0 to 999, and are defined as SmallInt. Requirement: each row must have a unique combination of these 6 key fields (all Btree) ! To force uniqueness on this table, I can define a PRIMARY KEY (A, B, C, D, E, F); But I suppose that MySQL makes a separate (physical) index for the primary key, besides the 6 member indices. I do not think this is true. If you specify an index, be it primary or not, there is only one index. As you describe it, none of the separate columns is a candidate for a primary key, visible or otherwise, because none of them is of itself unique. And apart from uniqueness, this primary key does not have any added value for programming purposes. Right ? I believe that if you have InnoDB tables, searching by the primary key is likely to be significantly faster than searching by secondary keys. My question: does MySQL allow some kind of virtual primary key, where uniqueness is enforced by MySQL by checking its member indices ? In MyISAM tables, the Primary key, or UNIQUE keys have no other function than this. In structure, a primary key is no different to any other key. At this moment my table has more than 13 million rows (about 1100 MB Data_Length). And the Index_Length is about 500 MB, for the 6 indices and the primary key, consisting of these 6 indices. In this case a virtual primary key could save maybe 200 MB in stead of a real primary key index and could speed up the updating processes. I like to hear from you. I think you are wrong in your presumption that there are individual indexes. Consider a telephone directory: this may be regarded as indexed on FamilyName, GivenName. There is only one index even though it is over two fields (the order in the telephone directory). You would only need another index if you wanted to search over GivenName,FamilyName. This would then require an extra index, which would have to be put in the back. Alec Cawley -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]