Re: duplicate rows in spite of multi-column unique constraint

Chris Hornung Tue, 24 Mar 2015 07:13:14 -0700

Thanks for the suggestions regarding non-printing characters, definitelymakes sense as a likely culprit!


However, the data really does seem to be identical in this case:

mysql> select id, customer_id, concat('-', group_id, '-') fromapp_customergroupmembership where customer_id ='ajEiQA';

+-------------+-------------+----------------------------+
| id          | customer_id | concat('-', group_id, '-') |
+-------------+-------------+----------------------------+
| 20279608258 | ajEiQA      | -ddH6Ev-                   |
| 20279608269 | ajEiQA      | -ddH6Ev-                   |
+-------------+-------------+----------------------------+
2 rows in set (0.00 sec)

I also ran the data through hexdump as a secondary check, also looksidentical:

mysql --defaults-extra-file=~/.customers_mysql.cnf app -s -e "selectid, customer_id, group_id from app_customergroupmembership wherecustomer_id ='ajEiQA';" | hexdump -c

0000000   2   0   2   7   9   6   0   8   2   5   8  \t   a   j   E   i
0000010   Q   A  \t   d   d   H   6   E   v  \n   2   0   2   7   9   6
0000020   0   8   2   6   9  \t   a   j   E   i   Q   A  \t   d   d   H
0000030   6   E   v  \n


Any other suggestions given this info?


Thanks,
--
Chris Hornung

Johan De Meersman <mailto:vegiv...@tuxera.be>
March 24, 2015 at 6:08 AM
Please do

   select id, customer_id, concat('-', group_id, '-')
     from app_customergroupmembership
     where customer_id ='ajEiQA';

I suspect one of those group IDs has a trailing space or similar 'invible' 
character that makes it not identical.


----- Original Message -----

From: "Chris Hornung"<chris.horn...@klaviyo.com>
To: "MySql"<mysql@lists.mysql.com>
Sent: Monday, 23 March, 2015 18:20:36
Subject: duplicate rows in spite of multi-column unique constraint

Hello,

I'm come across a situation where a table in our production DB has a
relatively small number of duplicative rows that seemingly defy the
unique constraint present on that table.

We're running MySQL 5.6.19a via Amazon RDS. The table in question is
~250M rows.

`show create table` gives:

app_customergroupmembership | CREATE TABLE `app_customergroupmembership` (
   `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
   `customer_id` varchar(6) COLLATE utf8_bin NOT NULL,
   `group_id` varchar(6) COLLATE utf8_bin NOT NULL,
   `created` datetime NOT NULL,
   PRIMARY KEY (`id`),
   UNIQUE KEY `app_customergroupmembership_customer_id_31afe160_uniq`
(`customer_id`,`group_id`),
   KEY `app_customergroupmembership_group_id_18aedd38e3f8a4a0`
(`group_id`,`created`)
) ENGINE=InnoDB AUTO_INCREMENT=21951158253 DEFAULT CHARSET=utf8
COLLATE=utf8_bin


Despite that, records with duplicate customer_id/group_id do exist:

mysql>  select * from app_customergroupmembership where customer_id =
'ajEiQA';
+-------------+-------------+----------+---------------------+
| id          | customer_id | group_id | created             |
+-------------+-------------+----------+---------------------+
| 20279608258 | ajEiQA      | ddH6Ev   | 2015-02-17 00:14:54 |
| 20279608269 | ajEiQA      | ddH6Ev   | 2015-02-17 00:14:54 |
+-------------+-------------+----------+---------------------+

Interestingly, these dupe records can't seem to be queried when using
both columns from the unique constraint in the WHERE clause:

mysql>  select * from app_customergroupmembership where customer_id =
'ajEiQA' and group_id = 'ddH6Ev';
+-------------+-------------+----------+---------------------+
| id          | customer_id | group_id | created             |
+-------------+-------------+----------+---------------------+
| 20279608258 | ajEiQA      | ddH6Ev   | 2015-02-17 00:14:54 |
+-------------+-------------+----------+---------------------+


Any thoughts on how this situation came to pass, and how to prevent it
from happening?

Thanks,
--
Chris Hornung

Re: duplicate rows in spite of multi-column unique constraint

Reply via email to