Would not the following snippet be useful to return the record ID if the feature set already exists?
SELECT rec.id FROM records rec WHERE 1 = (SELECT SUM(id_f IN (21, 22, 23)) = 3 AND COUNT(*) = 3 FROM records_features rf WHERE rf.id_r = rec.id) You would have to have some logic in your calling language to deal with things appropriately... I don't know how this could be coded into a DB constraint since SQLite lacks FOR EACH STATEMENT triggers, and to insert a superset of an existing set you may at some point insert the existing set. On 17 April 2016 at 17:32, James K. Lowden <jklowden at schemamania.org> wrote: > On Sat, 16 Apr 2016 01:20:55 +0200 > Ketil Froyn <ketil at froyn.name> wrote: > > > I have two tables and a join table, in principle like this: > > > > CREATE TABLE records (id INTEGER PRIMARY KEY, data TEXT); > > CREATE TABLE features (id INTEGER PRIMARY KEY, data TEXT UNIQUE); > > CREATE TABLE records_features (id_r INTEGER, id_f INTEGER, ord > > INTEGER); > > (As a style suggestion, consider not reduplicating plurals. It's a > one-man band, a 3-star general, and a 5-finger discount. What you have > is a set of record_features.) > > > Later, if I come across a new record that has exactly features > > "feature1", "feature2" and "feature3" (IDs 20,21 and 22), I don't want > > to add a new record for this. So given a new set of feature IDs, how > > can I best check if there's another record that has these exact > > features before I insert it? > > > > To be clear, a record with features "feature1", "feature2", "feature4" > > would be ok. So would "feature1", "feature2". Subsets are ok, but not > > exact duplicates. > > This is an application of relational division: you want to know if two > sets are equal. Well, almost. You really want to know if the "new" set > is a subset of an existing one. > > You're also a little stuck for lack of syntax. There are two sets in > question: the extant one and the one to be inserted. You can insert a > set in SQLite, > > insert into T values (a), (b), (c); > > but because that set of rows (a, b, and c) doesn't have a name, you > can't refer to them again in the same query. So you have to dance a > little jig, something like: > > begin transation > insert into record features ... > delete from record features > id = new_id > where exists ( > select 1 from record features as rf > where rf.id_r = record_features.id_r > and rf.id_f = record_features.id_f > group by id > having count(*) = ( > select count(*) > from record_features where id_r = new_id > ) > ); > commit transaction > > That puts the rows into the database -- where they can be examined -- > and deletes them if they turn out to be a duplicate. You could also > apply the same logic in an insert & update trigger, and probably should > if the rule applies to the *data*, and not just the application's > expectations. > > --jkl > > > _______________________________________________ > sqlite-users mailing list > sqlite-users at mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >