Inexplicable duplicate rows with unique constraint

Richard van der Hoff Thu, 16 Jan 2020 08:50:51 -0800

I'm trying to track down the cause of some duplicate rows in a table which I would expect to be impossible due to a unique constraint. I'm hoping that somebody here will be able to suggest something I might have missed.

The problem relates to a bug filed against our application (https://github.com/matrix-org/synapse/issues/6696). At first I put this down to random data corruption on a single user's postgres instance, but I've now seen three separate reports in as many days and am wondering if there is more to it.


We have a table whose schema is as follows:

synapse=# \d current_state_events
Table "public.current_state_events"
   Column   | Type | Modifiers
------------+------+-----------
 event_id   | text | not null
 room_id    | text | not null
 type       | text | not null
 state_key  | text | not null
 membership | text |
Indexes:
    "current_state_events_event_id_key" UNIQUE CONSTRAINT, btree (event_id)
    "current_state_events_room_id_type_state_key_key" UNIQUE CONSTRAINT, btree (room_id, type, state_key)
    "current_state_events_member_index" btree (state_key) WHERE type = 'm.room.member'::text

Despite the presence of the current_state_events_room_id_type_state_key_key constraint, several users have reported seeing errors which suggest that their tables have duplicate rows for the same (room_id, type, state_key) triplet and indeed querying confirms that to be the case:

synapse=> select count(*), room_id, type, state_key from current_state_events group by 2,3,4 order by count(*) DESC LIMIT 2;

 count |              room_id              |     type      |   state_key
-------+-----------------------------------+---------------+-------------------------------------

(2 rows)

Further investigation suggests that these are genuinely separate rows rather than duplicate entries in an index.
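One way to make that distinction, sketched here rather than quoted from the original investigation, is to disable index access for the session and look at the system columns of the offending rows. The literal values in the WHERE clause are placeholders and should be replaced with a triplet that the grouping query reports as duplicated.

```sql
-- Force a sequential scan so the result reflects the heap, not the index.
SET enable_indexscan = off;
SET enable_indexonlyscan = off;
SET enable_bitmapscan = off;

-- Placeholder values: substitute a duplicated (room_id, type, state_key).
SELECT ctid, xmin, xmax, event_id, room_id, type, state_key
FROM current_state_events
WHERE room_id = '!example:example.org'
  AND type = 'm.room.member'
  AND state_key = '@user:example.org';
```

If the query returns more than one row with different ctid values, the duplicates are separate heap tuples. The xmin values show which transactions inserted them.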


The index appears to consider itself valid:

------------+----------+----------+-------------+--------------+----------------+--------------+----------------+------------+--------------+------------+-----------+----------------+--------+--------------+----------------+-----------+----------+---------

 17023 | 16456 | 3 | t | f | f | t | f | t | f | t | t | f | 2 3 4 | 100 100 100 | 3126 3126 3126 | 0 0 0 | |

(1 row)
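The query that produced the row above is not shown. A query of roughly this shape against the pg_index catalog would return one row for the constraint's index; treat it as an assumption about how the output was obtained, not as the original command.

```sql
-- indisvalid and indisready are the columns that indicate whether
-- the index is usable for queries and maintained on writes.
SELECT *
FROM pg_index
WHERE indexrelid = 'current_state_events_room_id_type_state_key_key'::regclass;
```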

So, question: what could we be doing wrong to get ourselves into this situation?


Some other datapoints which may be relevant:

* this has been reported by one user on postgres 9.6.15 and one on 10.10, though it's hard to be certain of the version that was running when the duplication occurred

* the constraint is added when the table is first created (before any data is added)

* At least one user reports that he has recently migrated his database from one server to another via a `pg_dump -C` and later piping into psql.
