On Sat, 16 Apr 2016 01:20:55 +0200 Ketil Froyn <ketil at froyn.name> wrote:
> I have two tables and a join table, in principle like this: > > CREATE TABLE records (id INTEGER PRIMARY KEY, data TEXT); > CREATE TABLE features (id INTEGER PRIMARY KEY, data TEXT UNIQUE); > CREATE TABLE records_features (id_r INTEGER, id_f INTEGER, ord > INTEGER); (As a style suggestion, consider not reduplicating plurals. It's a one-man band, a 3-star general, and a 5-finger discount. What you have is a set of record_features.) > Later, if I come across a new record that has exactly features > "feature1", "feature2" and "feature3" (IDs 20,21 and 22), I don't want > to add a new record for this. So given a new set of feature IDs, how > can I best check if there's another record that has these exact > features before I insert it? > > To be clear, a record with features "feature1", "feature2", "feature4" > would be ok. So would "feature1", "feature2". Subsets are ok, but not > exact duplicates. This is an application of relational division: you want to know if two sets are equal. Well, almost. You really want to know if the "new" set is a subset of an existing one. You're also a little stuck for lack of syntax. There are two sets in question: the extant one and the one to be inserted. You can insert a set in SQLite, insert into T values (a), (b), (c); but because that set of rows (a, b, and c) doesn't have a name, you can't refer to them again in the same query. So you have to dance a little jig, something like: begin transation insert into record features ... delete from record features id = new_id where exists ( select 1 from record features as rf where rf.id_r = record_features.id_r and rf.id_f = record_features.id_f group by id having count(*) = ( select count(*) from record_features where id_r = new_id ) ); commit transaction That puts the rows into the database -- where they can be examined -- and deletes them if they turn out to be a duplicate. You could also apply the same logic in an insert & update trigger, and probably should if the rule applies to the *data*, and not just the application's expectations. --jkl