Quoth gc <gc1...@gmail.com>, on 2010-10-16 03:37:15 -0400: > CREATE TABLE Code_Units ("code", "unit", PRIMARY KEY ("code")); > CREATE TABLE Base_Data_1 ("partner", "code", "unit", "value", PRIMARY KEY > ("partner", "code")); [...]
I might consider doing something like this (untested): CREATE TEMPORARY TABLE Code_Units_staging ("code", "unit", PRIMARY KEY ("code", "unit")); -- Note the difference in schema for the temporary table. INSERT OR IGNORE INTO Code_Units_staging ("code", "unit") SELECT DISTINCT "code", "unit" FROM Base_Data_1; INSERT OR IGNORE INTO Code_Units_staging ("code", "unit") SELECT DISTINCT "code", "unit" FROM Base_Data_2; [...] INSERT OR IGNORE INTO Code_Units_staging ("code", "unit") SELECT DISTINCT "code", "unit" FROM Base_Data_N; -- Now the rows with equal code/unit pairs have been merged, but -- any with the same code and different units have not, and will -- cause an error on this INSERT. INSERT INTO Code_Units ("code", "unit") SELECT "code", "unit" FROM Code_Units_staging; (You don't strictly need the DISTINCT in this case; it might be faster or slower though; I'd try it both ways if feasible.) This is going to be very slow if errors are frequent, though, since it won't catch them until you've iterated the entire set of data. On the plus side, it does make it easier with some modification to detect where the discrepancies were (add extra columns for source data set to the staging table, say). How frequent are these invalid combinations going to be, and how frequently does the Code_Units table need to be updated? Another approach would be to create a view and then use an INSTEAD OF trigger on insertion into the view that ignores fully-identical rows, raises an abort on same-code/different-unit rows, and does the insert on new-code rows. Maybe something like (again, untested): CREATE VIEW Code_Units_for_insert AS SELECT * FROM Code_Units; CREATE TRIGGER slightly_different_insert INSTEAD OF INSERT ON Code_Units FOR EACH ROW BEGIN INSERT OR IGNORE INTO Code_Units ("code", "unit") VALUES (NEW."code", NEW."unit"); SELECT CASE WHEN (SELECT "unit" = NEW."unit" FROM Code_Units WHERE "code" = NEW."code") THEN 1 ELSE RAISE(ABORT, "Code associated with multiple units") END; END; ---> Drake Wilson _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users