On 2016/02/19 8:00 AM, admin at shuling.net wrote: > Hi, > > > > I create a table as follows: > > > > CREATE TABLE MyTable (F1 INTEGER, F2 INTEGER); > > > > Then add the following records: > > > > INSERT INTO MyTable (F1, F2) Values (1, 2); > > INSERT INTO MyTable (F1, F2) Values (1, 3); > > INSERT INTO MyTable (F1, F2) Values (2, 4); > > INSERT INTO MyTable (F1, F2) Values (2, 5); > > INSERT INTO MyTable (F1, F2) Values (3, 6); > > INSERT INTO MyTable (F1, F2) Values (3, 7); > > INSERT INTO MyTable (F1, F2) Values (4, 2); > > > > Now if two records have the same value of F1, then I will define them as > conflict records. > > > > Now I need to perform the following tasks: > > > > 1. For all conflict records, get the total count of distinct F1 values. > In the above sample, record 1, 2, 3, 4, 5, 6 are conflict records, but the > distinct values are only 1, 2, 3 so the total count should be 3. > 2. Get the total count of all the conflict records. In the above sample, > it should be 6. > 3. Set the F2 value of all the conflict records to 9. Keep all other > records intact. > > > > How to do that? Can task 2 and 3 be implemented in one SQL query to improve > the performance?
I think you meant to use the word "Duplicate" where you said "Distinct". You need the number of Duplicate F1 records, which is 3. Then you need the total count of records that might be duplicated for F1, which is 6 in the above case. This all is easy and Hick's solution will work. As to the question of doing 2 and 3 together, how can that ever be? 3 does an update, and 2 expects a return value. There is no mix and match updates and selects (Though Postgres has a great way of doing a bit of both) - but apart from some convenience to the query creator, it has no real efficiency advantage, which is why it's not really done. Best efficiency would be a temp table to avoid multiple walking of the original table and be very fast for even large datasets - I would advise this way perhaps: ------------------------------------------------ -- Processing SQL in: E:\Documents\SQLiteAutoScript.sql -- SQLite version 3.9.2 [ Release: 2015-11-02 ] on SQLitespeed version 2.0.2.4. -- Script Items: 10 Parameter Count: 0 -- 2016-02-19 11:35:58.596 | [Info] Script Initialized, Started executing... -- ================================================================================================ CREATE TABLE MyTable (F1 INTEGER, F2 INTEGER); INSERT INTO MyTable (F1, F2) Values (1, 2) ,(1, 3) ,(2, 4) ,(2, 5) ,(3, 6) ,(3, 7) ,(4, 2) ; CREATE TEMP TABLE F1Dups (F1_ID INTEGER PRIMARY KEY, F1_Count INT); INSERT INTO F1Dups SELECT DISTINCT B.F1, 0 FROM MyTable AS A INNER JOIN MyTable AS B ON B.F1 = A.F1 WHERE B.rowid <> A.rowid; UPDATE F1Dups SET F1_Count = (SELECT COUNT(*) FROM MyTable WHERE MyTable.F1=F1Dups.F1_ID); SELECT COUNT(*) AS Dist_Confl FROM F1Dups; -- Dist_Confl -- ------------ -- 3 SELECT SUM(F1_Count) AS All_Conf FROM F1Dups; -- All_Conf -- ------------ -- 6 UPDATE MyTable SET F2=9 WHERE F1 IN (SELECT F1_ID FROM F1Dups); SELECT * FROM MyTable; -- F1 | F2 -- --- | --- -- 1 | 9 -- 1 | 9 -- 2 | 9 -- 2 | 9 -- 3 | 9 -- 3 | 9 -- 4 | 2 DROP TABLE F1Dups; -- Script Stats: Total Script Execution Time: 0d 00h 00m and 00.031s -- Total Script Query Time: -- --- --- --- --.---- -- Total Database Rows Changed: 19 -- Total Virtual-Machine Steps: 717 -- Last executed Item Index: 10 -- Last Script Error: -- ------------------------------------------------------------------------------------------------