Kaiting Chen:
I'd like to propose a change to PostgreSQL to allow the creation of a foreign
key constraint referencing a superset of uniquely constrained columns.

+1

Tom Lane:
TBH, I think this is a fundamentally bad idea and should be rejected
outright.  It fuzzes the semantics of the FK relationship, and I'm
not convinced that there are legitimate use-cases.  Your example
schema could easily be dismissed as bad design that should be done
some other way.

I had to add quite a few unique constraints on a superset of already uniquely constrained columns in the past, just to be able to support FKs to those columns. I think those cases most often come up when dealing with slightly denormalized schemas, e.g. for efficiency.

One other use-case I had recently was along the following lines, in abstract terms:

CREATE TABLE classes (class INT PRIMARY KEY, ...);

CREATE TABLE instances (
  instance INT PRIMARY KEY,
  class INT REFERENCES classes,
  ...
);

Think about classes and instances as in OOP. So the table classes contains some definitions for different types of object and the table instances realizes them into concrete objects.

Now, assume you have some property of a class that is best modeled as a table like this:

CREATE TABLE classes_prop (
  property INT PRIMARY KEY,
  class INT REFERENCES classes,
  ...
);

Now, assume you need to store data for each of those classes_prop rows for each instance. You'd do the following:

CREATE TABLE instances_prop (
  instance INT REFERENCES instances,
  property INT REFERENCES classes_prop,
  ...
);

However, this does not ensure that the instance and the property you're referencing in instances_prop are actually from the same class, so you add a class column:

CREATE TABLE instances_prop (
  instance INT,
  class INT,
  property INT,
  FOREIGN KEY (instance, class) REFERENCES instances (instance, class),
  FOREIGN KEY (property, class) REFERENCES classes_prop (property, class),
  ...
);

But this won't work without first creating UNIQUE constraints on those supersets of the PK columns.
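
For completeness, this is what the workaround looks like today; a sketch against the schema above (the extra indexes these create are exactly the overhead I'd like to avoid):

-- Widen the unique constraints so the composite FKs have something to reference.
-- Each of these builds an additional index that is largely redundant with the PK.
ALTER TABLE instances ADD UNIQUE (instance, class);
ALTER TABLE classes_prop ADD UNIQUE (property, class);
-- With these in place, the two FOREIGN KEY clauses above are accepted as written.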

For one example of where the semantics get fuzzy, it's not
very clear how the extra-baggage columns ought to participate in
CASCADE updates.  Currently, if we have
    CREATE TABLE foo (a integer PRIMARY KEY, b integer);
then an update that changes only foo.b doesn't need to update
referencing tables, and I think we even have optimizations that
assume that if no unique-key columns are touched then RI checks
need not be made.  But if you did
    CREATE TABLE bar (x integer, y integer,
                      FOREIGN KEY (x, y) REFERENCES foo(a, b) ON UPDATE CASCADE);
then perhaps you expect bar.y to be updated ... or maybe you don't?

In all use-cases I have had so far, I would expect bar.y to be updated, too.

I think it would not even be possible to NOT update bar.y, because the FK would then not match anymore. foo.a is the PK, so the value in bar.x already forces bar.y to be the same as foo.b at all times.

bar.y is a little bit like a generated value in that sense: it should always match foo.b. I think it would be great if we could actually go a step further, too: on an update of bar.x to a new value, if a row with foo.a = bar.x exists, I would like bar.y to be set automatically to that row's foo.b. Otherwise those kinds of updates always have to either query foo beforehand, or add a trigger to do the same.
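
Today I end up doing that with a trigger. A minimal sketch for the foo/bar example above (function and trigger names are made up):

CREATE FUNCTION bar_sync_y() RETURNS trigger AS $$
BEGIN
  -- Copy the current foo.b for the new bar.x into bar.y.
  -- If there is no matching foo row, y is set to NULL here and the FK
  -- constraint decides what happens next.
  SELECT b INTO NEW.y FROM foo WHERE a = NEW.x;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER bar_sync_y
  BEFORE INSERT OR UPDATE OF x ON bar
  FOR EACH ROW
  EXECUTE FUNCTION bar_sync_y();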

In the classes/instances example above, when updating instances_prop.property to a new value, instances_prop.class would be updated automatically to match classes_prop.class. This would fail when the class is different from the class required by the FK to instances, though, providing exactly the safeguard that this constraint was supposed to provide, without incurring additional overhead in update statements.

In the foo/bar example above, which is just a bit of denormalization, this automatic update would also be helpful - because rejecting the update on the grounds that the columns don't match doesn't make sense here.

Another example is that I think the idea is only well-defined when
the subset column(s) are a primary key, or at least all marked NOT NULL.
Otherwise they're not as unique as you're claiming.

I fail to see why. My understanding is that rows with NULL values in the referenced table can't participate in FK matches anyway, and that both MATCH SIMPLE and MATCH FULL don't require a match when any/all of the columns in the referencing table are NULL. MATCH PARTIAL is not implemented, so I can't tell whether the semantics would be different there.

I'm not sure whether a FK on a superset of unique columns would be useful with MATCH SIMPLE. Maybe it could be forced to be MATCH FULL, if MATCH SIMPLE is indeed not well-defined.
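
To illustrate the referencing-side behavior I mean, a small sketch (foo2/bar2 and the explicit UNIQUE (a, b) are just for the demonstration):

CREATE TABLE foo2 (a integer, b integer, UNIQUE (a, b));

CREATE TABLE bar2 (
  x integer,
  y integer,
  FOREIGN KEY (x, y) REFERENCES foo2 (a, b) MATCH FULL
);

-- MATCH FULL: inserting (1, NULL) into bar2 is rejected, while (NULL, NULL) is
-- allowed without a matching foo2 row.
-- With the default MATCH SIMPLE, both rows would be allowed without consulting foo2.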

It's also unclear to me how this ought to interact with the
information_schema views concerning foreign keys.  We generally
feel that we don't want to present any non-SQL-compatible data
in information_schema, for fear that it will confuse applications
that expect to see SQL-spec behavior there.  So do we leave such
FKs out of the views altogether, or show only the columns involving
the associated unique constraint?  Neither answer seems pleasant.

Instead of tweaking FKs, maybe it would be possible to define a UNIQUE constraint re-using an existing index that already guarantees uniqueness on a subset of the columns? This would allow creating those FK relationships by declaring another unique constraint - without the overhead of creating yet another index.

So roughly something like this:

ALTER TABLE foo ADD UNIQUE (a, b) USING INDEX foo_pk;

This should also give consistent output in the information_schema views?
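
For comparison, the USING INDEX form that exists today only accepts a unique index on exactly the constrained columns, so it doesn't help here (index and constraint names are just for illustration):

CREATE UNIQUE INDEX foo_a_b_idx ON foo (a, b);
ALTER TABLE foo ADD CONSTRAINT foo_a_b_key UNIQUE USING INDEX foo_a_b_idx;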

Best

Wolfgang

