I'm getting lots of duplicate rows even though I have a
unique index defined over multiple columns.
The issue is that multiple NULL values are allowed,
even when some values are not null.
This could be as specified by the SQL standard,
but it's certainly confusing for the developer.
(By the way, one source of confusion is that
phpMyEdit was disallowing duplicate values
even though sql insert statements allowed them).

Here's the test case:

Goal: prevent duplicate rows on the (c1, c2) pair:


CREATE TABLE `test_multi_column_null` (
`pk` INT NOT NULL AUTO_INCREMENT,
`c1` VARCHAR( 30 ) ,
`c2` VARCHAR( 30 ) ,
PRIMARY KEY ( `pk` ) 
);
ALTER TABLE `test_multi_column_null` ADD UNIQUE `unique_index` ( `c1` , `c2` 
);


Unexpected works:

INSERT INTO `test_multi_column_null` ( `pk` , `c1` , `c2` ) 
VALUES (
'', '1', NULL 
), (
'', '1', NULL 
);


As expected, this causes a duplicate entry:

INSERT INTO `test_multi_column_null` ( `pk` , `c1` , `c2` ) 
VALUES (

'', '1', ''
), (
'', '1', ''
);


mysql> select * from test_multi_column_null;
+----+------+------+
| pk | c1 | c2 |
+----+------+------+
| 1 | 1 | NULL |
| 2 | 1 | NULL |
| 3 | 1 | |
+----+------+------+
3 rows in set (0.00 sec)



Now, to drive the point home, let's add some null rows:

INSERT INTO `test_multi_column_null` ( `pk` , `c1` , `c2` ) 
VALUES (
'', NULL , NULL 
), (
'', NULL , NULL 
);


mysql> select * from test_multi_column_null;
+----+------+------+
| pk | c1 | c2 |
+----+------+------+
| 1 | 1 | NULL |
| 2 | 1 | NULL |
| 3 | 1 | |
| 4 | NULL | NULL |
| 5 | NULL | NULL |
+----+------+------+
5 rows in set (0.00 sec)


Note: this works even with bdb engine in MySQL:

mysql> alter table test_multi_column_null engine = bdb;
Query OK, 5 rows affected (0.03 sec)
Records: 5 Duplicates: 0 Warnings: 0

mysql> select * from test_multi_column_null;
+----+------+------+
| pk | c1 | c2 |
+----+------+------+
| 1 | 1 | NULL |
| 2 | 1 | NULL |
| 3 | 1 | |
| 4 | NULL | NULL |
| 5 | NULL | NULL |
+----+------+------+
5 rows in set (0.00 sec)



Conclusion: if you want to enforce uniqueness,
don't use columns that allow NULL.

Reply via email to