Re: Unique Index on multiple columns that can contain NULL in MySQL

2005-05-05 Thread Martijn Tonies


> From: "Dennis Fogg"
> > I'm getting lots of duplicate rows even though I have a
> > unique index defined over multiple columns.
> > The issue is that multiple NULL values are allowed,
> > even when some values are not null.
> > This could be as specified by the SQL standard,
> > but it's certainly confusing for the developer.
>
> It is confusing. You have to 'grok' NULL "values" to understand the
problems
> that may arise.
> In Boolean logic you're used to two distinct values TRUE and FALSE. The
> introduction of NULL actually introduced a second outcome of comparison
> operators and functions. Since NULL represents "unknown", comparing NULL
to
> NULL will result in MAYBE.
> If you look at NULL as being a yet unknown variable it starts to make
sense:
> (x = 2) : maybe true, maybe false, depending on the value of 'x'.
> (x != 2) : maybe true, maybe false, depending on the value of 'x'.
> In MySQL the outcome MAYBE is represented by NULL, so (NULL = NULL) ->
NULL;
> and (NULL != 2) -> NULL; (NULL * 1) -> NULL; etcetera.
>
> Regarding indexes, there doesn't seem to be any logic involved, other than
> "what the standard says". UNIQUE indexes may have multiple NULL values
> (excepting BDB tables). A PRIMARY key is defined as a combination of
UNIQUE
> and NOT NULL.

Since when does the standard handle Indices?

It's about time that MySQL gets Unique Constraints - these are defined by
the SQL standard :-)

With regards,

Martijn Tonies
Database Workbench - developer tool for InterBase, Firebird, MySQL & MS SQL
Server
Upscene Productions
http://www.upscene.com


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Unique Index on multiple columns that can contain NULL in MySQL

2005-05-05 Thread Jigal van Hemert
From: "Dennis Fogg"
> I'm getting lots of duplicate rows even though I have a
> unique index defined over multiple columns.
> The issue is that multiple NULL values are allowed,
> even when some values are not null.
> This could be as specified by the SQL standard,
> but it's certainly confusing for the developer.

It is confusing. You have to 'grok' NULL "values" to understand the problems
that may arise.
In Boolean logic you're used to two distinct values TRUE and FALSE. The
introduction of NULL actually introduced a second outcome of comparison
operators and functions. Since NULL represents "unknown", comparing NULL to
NULL will result in MAYBE.
If you look at NULL as being a yet unknown variable it starts to make sense:
(x = 2) : maybe true, maybe false, depending on the value of 'x'.
(x != 2) : maybe true, maybe false, depending on the value of 'x'.
In MySQL the outcome MAYBE is represented by NULL, so (NULL = NULL) -> NULL;
and (NULL != 2) -> NULL; (NULL * 1) -> NULL; etcetera.

Regarding indexes, there doesn't seem to be any logic involved, other than
"what the standard says". UNIQUE indexes may have multiple NULL values
(excepting BDB tables). A PRIMARY key is defined as a combination of UNIQUE
and NOT NULL.

> Here's the test case:
>
> Goal: prevent duplicate rows on the (c1, c2) pair:
This depends on your definition of 'duplicate' in the light of the
NULL-logic I explained earlier.
If there is already a 1-1 pair in the db, inserting another 1-1 pair will
require the index to compare the entry to be inserted with the current
entries in the index. Comparing 1-1 to 1-1 will result in TRUE, so the new
entry is rejected.
If you try the same with a NULL-NULL pair in the db and you try to insert
another NULL-NULL pair, the comparison does not result in TRUE (it would be
MAYBE, thus NULL) and the entry is accepted.

> ++--+--+
> | pk | c1 | c2 |
> ++--+--+
> | 1 | 1 | NULL |
> | 2 | 1 | NULL |
> | 3 | 1 | |
> | 4 | NULL | NULL |
> | 5 | NULL | NULL |
> ++--+--+
> 5 rows in set (0.00 sec)
>
>
> Note: this works even with bdb engine in MySQL:
>
> mysql> alter table test_multi_column_null engine = bdb;
> Query OK, 5 rows affected (0.03 sec)
> Records: 5 Duplicates: 0 Warnings: 0

Does it? Try SHOW CREATE TABLE ; and check whether the engine is
really BDB??
If you do an ALTER TABLE  ENGINE=; the table
will be rebuilt anyway.

I tries your example and the engine type remained MyISAM...

> Conclusion: if you want to enforce uniqueness,
> don't use columns that allow NULL.

That entirely depends on your definition of 'uniqueness' as I explained
before. If you consider NULL to be equal to NULL (which it is not) then you
should only use NOT NULL columns.

Regards, Jigal.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Unique Index on multiple columns that can contain NULL in MySQL

2005-05-04 Thread Hank
I think you should review the very recent thread "why NOT NULL in
PRIMARY key??" which might shed some light on your particular issue.

In a nutshell, NULL!=NULL, so the database engine can not detect the
duplicate rows, as is expected.

-Hank

On 5/4/05, Dennis Fogg <[EMAIL PROTECTED]> wrote:
> I'm getting lots of duplicate rows even though I have a
> unique index defined over multiple columns.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Unique Index on multiple columns that can contain NULL in MySQL

2005-05-04 Thread Dennis Fogg
I'm getting lots of duplicate rows even though I have a
unique index defined over multiple columns.
The issue is that multiple NULL values are allowed,
even when some values are not null.
This could be as specified by the SQL standard,
but it's certainly confusing for the developer.
(By the way, one source of confusion is that
phpMyEdit was disallowing duplicate values
even though sql insert statements allowed them).

Here's the test case:

Goal: prevent duplicate rows on the (c1, c2) pair:


CREATE TABLE `test_multi_column_null` (
`pk` INT NOT NULL AUTO_INCREMENT,
`c1` VARCHAR( 30 ) ,
`c2` VARCHAR( 30 ) ,
PRIMARY KEY ( `pk` ) 
);
ALTER TABLE `test_multi_column_null` ADD UNIQUE `unique_index` ( `c1` , `c2` 
);


Unexpected works:

INSERT INTO `test_multi_column_null` ( `pk` , `c1` , `c2` ) 
VALUES (
'', '1', NULL 
), (
'', '1', NULL 
);


As expected, this causes a duplicate entry:

INSERT INTO `test_multi_column_null` ( `pk` , `c1` , `c2` ) 
VALUES (

'', '1', ''
), (
'', '1', ''
);


mysql> select * from test_multi_column_null;
++--+--+
| pk | c1 | c2 |
++--+--+
| 1 | 1 | NULL |
| 2 | 1 | NULL |
| 3 | 1 | |
++--+--+
3 rows in set (0.00 sec)



Now, to drive the point home, let's add some null rows:

INSERT INTO `test_multi_column_null` ( `pk` , `c1` , `c2` ) 
VALUES (
'', NULL , NULL 
), (
'', NULL , NULL 
);


mysql> select * from test_multi_column_null;
++--+--+
| pk | c1 | c2 |
++--+--+
| 1 | 1 | NULL |
| 2 | 1 | NULL |
| 3 | 1 | |
| 4 | NULL | NULL |
| 5 | NULL | NULL |
++--+--+
5 rows in set (0.00 sec)


Note: this works even with bdb engine in MySQL:

mysql> alter table test_multi_column_null engine = bdb;
Query OK, 5 rows affected (0.03 sec)
Records: 5 Duplicates: 0 Warnings: 0

mysql> select * from test_multi_column_null;
++--+--+
| pk | c1 | c2 |
++--+--+
| 1 | 1 | NULL |
| 2 | 1 | NULL |
| 3 | 1 | |
| 4 | NULL | NULL |
| 5 | NULL | NULL |
++--+--+
5 rows in set (0.00 sec)



Conclusion: if you want to enforce uniqueness,
don't use columns that allow NULL.