Re: How to avoid redundancy between PK and indices ?

2005-10-06 Thread Hank
I understand what you're saying.

The problem is that if mysql attempted to do a query like you suggest:

Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And f=1;

It can only use one index for the query, and hopefully, the optimizer
will pick one of the six indexes with the fewest keys to scan.  But
even if it could virtualize the PK that way, it could still cause a
scan of millions of records while limiting the table scan to one of
the six non-unique keys.  In other words, it would/could take alot of
time to see if a record is unique upon inserting new records - not
something you'd be happy with performance wise, I'm sure.  Therefore,
a true, concatenated key that enforces uniqueness and can operate
immediately upon inserts is really necessary, regardless of what other
indexes are on the columns.

The type of query you're suggesting can be done with bitmapped indexes
(Oracle has them), where the indexes values are stored as bitmaps, and
you can combine them so Oracle uses multiple indexes in one query to
quickly pair down the records to scan.  Bitmapped indexes work very
well with the cardinality of keys is less than 10,000 (number of
unique key values).  In a nutshell, think of a field for sex/gender
and a table of 1 million records. A bitmapped index of that field
would only be 125,000 bytes long (1 million bits) (one bit=one
record), and to find all the M records, just map the on bits to
the record number in the datatable. For fields with larger possible
values (say, state of residence - 50 values), each location would be
represented by 6 bits. Pretty simple concept, but great performance
gains can be had over regular btree indexes.  I think this is what
you're getting at.

When I asked the MySQL AB folks at the first conference in Orlando a
couple of years ago about adding bitmapped index support in MySQL,
they didn't really know what I was talking about. The developer I
spoke to thought I was suggesting creating indexes on bitmapped
images. No, not exactly.   I hope they know what it is now, though,
and have (or already have) considered adding support for it in MySQL.

-Hank

On 10/5/05, C.R. Vegelin [EMAIL PROTECTED] wrote:
 Hi Hank,
 You are quite right.
 I need separate non-unique indices on a, b, c, d, e and f to avoid table
 scans.
 And when each combi must be unique I need a Primary Key (a,b,c,d,e,f).
 And only Key a (a) seems to be redundant with the primary key ...
 Suppose there would be a PK (a,b,c,d,e,f) defined, without a separate PK
 index.
 And let's assume some rows like:
  columns:a   b   c   d   e   f
  row1 has:  1  1   1   1   1   1
  row2 has:  1  1   1   1   1   2
  row3 has:  1  1   1   1   1   3
  etc.
 Then checking on unique PK could be done by MySQL internally with:
  Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And
  f=1;
 to avoid a duplicate primary key for row1, by using / joining the separate
 index tables.
 With this Select query, MySQL could / should make use of the 6 existing
 separate indices.
 Uniqueness can be fully guaranteed with these 6 non-unique indices in this
 case.
 In other words, a separate PK index is fully redundant in this case, right ?
 In addition, it would save space without the longer concatenate key of
 a+b+c+d+e+f.
 Thanks, Cor

 - Original Message -
 From: Hank [EMAIL PROTECTED]
 To: C.R. Vegelin [EMAIL PROTECTED]
 Cc: mysql@lists.mysql.com
 Sent: Wednesday, October 05, 2005 5:57 PM
 Subject: Re: How to avoid redundancy between PK and indices ?


 It depends.. if this is your create table statement:

 CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f)
 );

 Then only one unique index is being created on the concatenate key of
 a+b+c+d+e+f.  Queries on any fields other than A will cause a full
 table scan.

 On the other hand, if your create table is:

 CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f),
  KEY a (a),
  KEY b (b),
  KEY c (c),
  KEY d (d),
  KEY e (e),
  KEY f (f)
 );

 This will create the primary key, plus six additional indexes, each of
 which is queryable. But in this case, the KEY a (a) non-unique index
 is redundent with the primary key, so to do what you want - a unique
 index on a+b+c+d+e+f PLUS the ability to independtly search the  b c d
 e and f fields, here is the create table you'll need to use:

 CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f),
  KEY b (b),
  KEY c (c),
  KEY d (d),
  KEY e (e),
  KEY f (f)
 );


 --

 -Hank

 --
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL

Re: How to avoid redundancy between PK and indices ?

2005-10-05 Thread Hank
It depends.. if this is your create table statement:

CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f)
);

Then only one unique index is being created on the concatenate key of
a+b+c+d+e+f.  Queries on any fields other than A will cause a full
table scan.

On the other hand, if your create table is:

CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f),
  KEY a (a),
  KEY b (b),
  KEY c (c),
  KEY d (d),
  KEY e (e),
  KEY f (f)
);

This will create the primary key, plus six additional indexes, each of
which is queryable. But in this case, the KEY a (a) non-unique index
is redundent with the primary key, so to do what you want - a unique
index on a+b+c+d+e+f PLUS the ability to independtly search the  b c d
e and f fields, here is the create table you'll need to use:

CREATE TABLE foo (
  a smallint NOT NULL,
  b smallint NOT NULL,
  c smallint NOT NULL,
  d smallint NOT NULL,
  e smallint NOT NULL,
  f smallint NOT NULL,
  PRIMARY KEY  (a,b,c,d,e,f),
  KEY b (b),
  KEY c (c),
  KEY d (d),
  KEY e (e),
  KEY f (f)
);


--

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: How to avoid redundancy between PK and indices ?

2005-10-05 Thread C.R. Vegelin

Hi Hank,
You are quite right.
I need separate non-unique indices on a, b, c, d, e and f to avoid table 
scans.

And when each combi must be unique I need a Primary Key (a,b,c,d,e,f).
And only Key a (a) seems to be redundant with the primary key ...
Suppose there would be a PK (a,b,c,d,e,f) defined, without a separate PK 
index.

And let's assume some rows like:

columns:a   b   c   d   e   f
row1 has:  1  1   1   1   1   1
row2 has:  1  1   1   1   1   2
row3 has:  1  1   1   1   1   3
etc.

Then checking on unique PK could be done by MySQL internally with:
Select Count(*) From myTable Where a=1 And b=1 And c=1 And d=1 And e=1 And 
f=1;
to avoid a duplicate primary key for row1, by using / joining the separate 
index tables.
With this Select query, MySQL could / should make use of the 6 existing 
separate indices.
Uniqueness can be fully guaranteed with these 6 non-unique indices in this 
case.

In other words, a separate PK index is fully redundant in this case, right ?
In addition, it would save space without the longer concatenate key of 
a+b+c+d+e+f.

Thanks, Cor

- Original Message - 
From: Hank [EMAIL PROTECTED]

To: C.R. Vegelin [EMAIL PROTECTED]
Cc: mysql@lists.mysql.com
Sent: Wednesday, October 05, 2005 5:57 PM
Subject: Re: How to avoid redundancy between PK and indices ?


It depends.. if this is your create table statement:

CREATE TABLE foo (
 a smallint NOT NULL,
 b smallint NOT NULL,
 c smallint NOT NULL,
 d smallint NOT NULL,
 e smallint NOT NULL,
 f smallint NOT NULL,
 PRIMARY KEY  (a,b,c,d,e,f)
);

Then only one unique index is being created on the concatenate key of
a+b+c+d+e+f.  Queries on any fields other than A will cause a full
table scan.

On the other hand, if your create table is:

CREATE TABLE foo (
 a smallint NOT NULL,
 b smallint NOT NULL,
 c smallint NOT NULL,
 d smallint NOT NULL,
 e smallint NOT NULL,
 f smallint NOT NULL,
 PRIMARY KEY  (a,b,c,d,e,f),
 KEY a (a),
 KEY b (b),
 KEY c (c),
 KEY d (d),
 KEY e (e),
 KEY f (f)
);

This will create the primary key, plus six additional indexes, each of
which is queryable. But in this case, the KEY a (a) non-unique index
is redundent with the primary key, so to do what you want - a unique
index on a+b+c+d+e+f PLUS the ability to independtly search the  b c d
e and f fields, here is the create table you'll need to use:

CREATE TABLE foo (
 a smallint NOT NULL,
 b smallint NOT NULL,
 c smallint NOT NULL,
 d smallint NOT NULL,
 e smallint NOT NULL,
 f smallint NOT NULL,
 PRIMARY KEY  (a,b,c,d,e,f),
 KEY b (b),
 KEY c (c),
 KEY d (d),
 KEY e (e),
 KEY f (f)
);


--

-Hank

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



How to avoid redundancy between PK and indices ?

2005-10-04 Thread C.R. Vegelin
Hi List,

Is anyone familiar with optimizing indices, including primary key ?
I do have a large myisam table with 6 non-unique key fields, lets say named A, 
B, C, D, E and F.
Each of these columns may have NOT NULL values from 0 to 999, and are defined 
as SmallInt.
Requirement: each row must have a unique combination of these 6 key fields (all 
Btree) !

To force uniqueness on this table, I can define a PRIMARY KEY (A, B, C, D, E, 
F);
But I suppose that MySQL makes a separate (physical) index for the primary key, 
besides the 6 member indices.
And apart from uniqueness, this primary key does not have any added value for 
programming purposes. Right ?
My question: does MySQL alllow some kind of virtual primary key, where 
uniqueness is enforced by MySQL by checking its member indices ?

At this moment my table has more than 13 million rows (about 1100 MB 
Data_Length).
And the Index_Length is about 500 MB, for the 6 indices and the primary key, 
consisting of these 6 indices.
In this case a virtual primary key could save maybe 200 MB in stead of a real 
primary key index and could speed up the updating processes.

I like to hear from you.
Thanks, Cor

Re: How to avoid redundancy between PK and indices ?

2005-10-04 Thread Alec . Cawley
C.R. Vegelin [EMAIL PROTECTED] wrote on 04/10/2005 12:52:01:

 Hi List,
 
 Is anyone familiar with optimizing indices, including primary key ?
 I do have a large myisam table with 6 non-unique key fields, lets 
 say named A, B, C, D, E and F.
 Each of these columns may have NOT NULL values from 0 to 999, and 
 are defined as SmallInt.
 Requirement: each row must have a unique combination of these 6 key 
 fields (all Btree) !
 
 To force uniqueness on this table, I can define a PRIMARY KEY (A, B,
 C, D, E, F);
 But I suppose that MySQL makes a separate (physical) index for the 
 primary key, besides the 6 member indices.

I do not think this is true. If you specify an index, be it primary or 
not, there is only one index. As you describe it, none of the separate 
columns is a candidate for a primary key, visible or otherwise, because 
none of them is of itself unique.


 And apart from uniqueness, this primary key does not have any added 
 value for programming purposes. Right ?

I believe that if you have InnoDB tables, searching by the primary key is 
likely to be significantly faster than searching by secondary keys.

 My question: does MySQL allow some kind of virtual primary key, 
 where uniqueness is enforced by MySQL by checking its member indices ?

In MyISAM tables, the Primary key, or UNIQUE keys have no other function 
than this. In structure, a primary key is no different to any other key.

 
 At this moment my table has more than 13 million rows (about 1100 MB
 Data_Length).
 And the Index_Length is about 500 MB, for the 6 indices and the 
 primary key, consisting of these 6 indices.
 In this case a virtual primary key could save maybe 200 MB in stead 
 of a real primary key index and could speed up the updating processes.
 
 I like to hear from you.

I think you are wrong in your presumption that there are individual 
indexes. Consider a telephone directory: this may be regarded as indexed 
on FamilyName, GivenName. There is only one index even though it is over 
two fields (the order in the telephone directory). You would only need 
another index if you wanted to search over GivenName,FamilyName. This 
would then require an extra index, which would have to be put in the back.

Alec Cawley




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: How to avoid redundancy between PK and indices ?

2005-10-04 Thread C.R. Vegelin

Hi Alec,

Thanks for your comment. Well, we disagree on a few points.
Suppose I have a table with columns CountryID, CompanyID, SectorID and 
ProductID.
And let's say that all these columns are NOT NULL, but indexed as 
non-unique.

I need to select on specific countries, specific products etc.
So I need 4 separate indices, where CountryId may occur more than once in 
the CountryId index,

CompanyID may occur more than once in the CompanyID index etc.
But if these 4 columns together are defined as Primary Key, then each combi 
of  CountryID, CompanyID, SectorID and ProductID is unique.
In my point of view this can only be realized with a separate PK index, 
leading to redundancy in the indices.


My theory is backed by what I read in the manual ... if I read it right ...
If you use ALTER TABLE on a MyISAM table, all non-unique indexes are created 
in a separate batch.
ALTER TABLE ... DISABLE KEYS tells MySQL to stop updating non-unique indexes 
for a MyISAM table.
ALTER TABLE ... ENABLE KEYS then should be used to re-create missing 
indexes.
These 2 features can only be realized if MySQL uses separate indices for 
non-uniques and for PK's.

Don't you think ?
Regards, Cor


From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: How to avoid redundancy between PK and indices ?



C.R. Vegelin [EMAIL PROTECTED] wrote on 04/10/2005 12:52:01:


Hi List,

Is anyone familiar with optimizing indices, including primary key ?
I do have a large myisam table with 6 non-unique key fields, lets
say named A, B, C, D, E and F.
Each of these columns may have NOT NULL values from 0 to 999, and
are defined as SmallInt.
Requirement: each row must have a unique combination of these 6 key
fields (all Btree) !

To force uniqueness on this table, I can define a PRIMARY KEY (A, B,
C, D, E, F);
But I suppose that MySQL makes a separate (physical) index for the
primary key, besides the 6 member indices.


I do not think this is true. If you specify an index, be it primary or
not, there is only one index. As you describe it, none of the separate
columns is a candidate for a primary key, visible or otherwise, because
none of them is of itself unique.



And apart from uniqueness, this primary key does not have any added
value for programming purposes. Right ?


I believe that if you have InnoDB tables, searching by the primary key is
likely to be significantly faster than searching by secondary keys.


My question: does MySQL allow some kind of virtual primary key,
where uniqueness is enforced by MySQL by checking its member indices ?


In MyISAM tables, the Primary key, or UNIQUE keys have no other function
than this. In structure, a primary key is no different to any other key.



At this moment my table has more than 13 million rows (about 1100 MB
Data_Length).
And the Index_Length is about 500 MB, for the 6 indices and the
primary key, consisting of these 6 indices.
In this case a virtual primary key could save maybe 200 MB in stead
of a real primary key index and could speed up the updating processes.

I like to hear from you.


I think you are wrong in your presumption that there are individual
indexes. Consider a telephone directory: this may be regarded as indexed
on FamilyName, GivenName. There is only one index even though it is over
two fields (the order in the telephone directory). You would only need
another index if you wanted to search over GivenName,FamilyName. This
would then require an extra index, which would have to be put in the back.

   Alec Cawley




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: How to avoid redundancy between PK and indices ?

2005-10-04 Thread Alec . Cawley
I agree that if you want to do the searches you describe, you will need 
extra keys. But MySQL does *not* create these keys automatically - it 
creates the keys you ask for *and no more*. You have asked, correctly, for 
a primary key on ABCDEF. MySQL will create exactly that key and no others. 
This will allow you to search on A, AB, ABC etc. If you want a separate 
search on B, or on D, or on F, *you* must request individual indexes on 
these columns. MySQL does not implicitly create hidden indexes for you; 
there is exactly one index for each PRIMARY KEY/UNIQUE/INDEX. This, of 
course, produces redundancy; but only the redundancy that you request. 
Without these redundant indexes, a search on B will be forced to do a full 
table scan. And you *need* those indexes to do the searches you want. 
Without them, all searches become full table scans. And without a PRIMARY 
KEY index, every insert would have to include a full table scan.

Alec





C.R. Vegelin [EMAIL PROTECTED] 
04/10/2005 15:10

To
mysql@lists.mysql.com
cc

Subject
Re: How to avoid redundancy between PK and indices ?






Hi Alec,

Thanks for your comment. Well, we disagree on a few points.
Suppose I have a table with columns CountryID, CompanyID, SectorID and 
ProductID.
And let's say that all these columns are NOT NULL, but indexed as 
non-unique.
I need to select on specific countries, specific products etc.
So I need 4 separate indices, where CountryId may occur more than once in 
the CountryId index,
CompanyID may occur more than once in the CompanyID index etc.
But if these 4 columns together are defined as Primary Key, then each 
combi 
of  CountryID, CompanyID, SectorID and ProductID is unique.
In my point of view this can only be realized with a separate PK index, 
leading to redundancy in the indices.

My theory is backed by what I read in the manual ... if I read it right 
...
If you use ALTER TABLE on a MyISAM table, all non-unique indexes are 
created 
in a separate batch.
ALTER TABLE ... DISABLE KEYS tells MySQL to stop updating non-unique 
indexes 
for a MyISAM table.
ALTER TABLE ... ENABLE KEYS then should be used to re-create missing 
indexes.
These 2 features can only be realized if MySQL uses separate indices for 
non-uniques and for PK's.
Don't you think ?
Regards, Cor


From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: How to avoid redundancy between PK and indices ?


 C.R. Vegelin [EMAIL PROTECTED] wrote on 04/10/2005 12:52:01:

 Hi List,

 Is anyone familiar with optimizing indices, including primary key ?
 I do have a large myisam table with 6 non-unique key fields, lets
 say named A, B, C, D, E and F.
 Each of these columns may have NOT NULL values from 0 to 999, and
 are defined as SmallInt.
 Requirement: each row must have a unique combination of these 6 key
 fields (all Btree) !

 To force uniqueness on this table, I can define a PRIMARY KEY (A, B,
 C, D, E, F);
 But I suppose that MySQL makes a separate (physical) index for the
 primary key, besides the 6 member indices.

 I do not think this is true. If you specify an index, be it primary or
 not, there is only one index. As you describe it, none of the separate
 columns is a candidate for a primary key, visible or otherwise, because
 none of them is of itself unique.


 And apart from uniqueness, this primary key does not have any added
 value for programming purposes. Right ?

 I believe that if you have InnoDB tables, searching by the primary key 
is
 likely to be significantly faster than searching by secondary keys.

 My question: does MySQL allow some kind of virtual primary key,
 where uniqueness is enforced by MySQL by checking its member indices ?

 In MyISAM tables, the Primary key, or UNIQUE keys have no other function
 than this. In structure, a primary key is no different to any other key.


 At this moment my table has more than 13 million rows (about 1100 MB
 Data_Length).
 And the Index_Length is about 500 MB, for the 6 indices and the
 primary key, consisting of these 6 indices.
 In this case a virtual primary key could save maybe 200 MB in stead
 of a real primary key index and could speed up the updating processes.

 I like to hear from you.

 I think you are wrong in your presumption that there are individual
 indexes. Consider a telephone directory: this may be regarded as indexed
 on FamilyName, GivenName. There is only one index even though it is over
 two fields (the order in the telephone directory). You would only need
 another index if you wanted to search over GivenName,FamilyName. This
 would then require an extra index, which would have to be put in the 
back.

Alec Cawley



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]