[HACKERS] Improving deadlock error messages

Neil Conway Thu, 19 Apr 2007 23:36:38 -0700

The error message we currently produce when a deadlock occurs is pretty
unfriendly:


    ERROR:  deadlock detected
    DETAIL:  Process 32068 waits for AccessExclusiveLock on relation
    16413 of database 16384; blocked by process 32064.
    Process 32064 waits for AccessExclusiveLock on relation 16415
    of database 16384; blocked by process 32068.

Users encounter this message relatively frequently -- and they often
depend on the information in the errdetail to track down the source of
the deadlock. Presenting numeric OIDs effectively forces the DBA to
resolve the OIDs to the appropriate relation and database names
manually.

I whipped up a quick patch to use names as well as OIDs for the
identifiers in the message, but on reflection the simple approach to
doing this is problematic: when we do syscache lookups to lookup the
identifier names, we might need to acquire an AccessShareLock on various
system catalogs (pg_class, pg_namespace, pg_database). This could fail
(e.g. because of a deadlock involving a system catalog), causing the
deadlock detector to infinitely recurse (albeit slowly).

We could fix this by first conditionally acquiring AccessShareLocks on
the necessary system catalogs. If any of those lock acquisitions fails,
we can just print the numeric OID instead. Since failing to acquire an
AccessShareLock on the system catalogs should not occur frequently in
practise, most users should see a more friendly error message.

Comments? Anyone see a better approach?

-Neil



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

[HACKERS] Improving deadlock error messages

Reply via email to