One of the things others have mentioned, but have not really expounded upon... What constitutes a duplicate? For an individual field this can be complex, for multiple fields...
For some data types (numerics) a duplicate is easy to define 2 = 2 = 2 = 2 (integer/long integer) For others (text) it is not simple. A = A But does A = a ? or A = <any extended character set of A>? (such as A[or a] + umlaut) Does Mike MacKenzie = MacKenzie, Mike ? If you have 3 fields which you are looking to locate duplication it becomes more of an issue (assuming each field is text). It also could be an issue depending on how the 3 fields are concatenated. ex. Name + longint ID + street address How is the name constructed? is it consistent across all data (Mike MacKenzie vs MacKenzie Mike vs Michael MacKenzie) How is the longint ID converted to text? String(ID), or String(ID;<some zero padding format>), is the padding long enough? How is the street Name constructed? Number Street Name Designation (123 East West Ave) Is the street name construct consistent across all data? (123 East West Ave vs 123 East West vs 123 East West Avenue) Assuming (a BIG assumption), that the data is 'clean', i.e. all the data is formatted similarly) a couple of approaches: - might be to use the function DISTINCT VALUES (with counting option On - if available in your version of 4D) on all 3 fields independently then use one of the arrays to loop over the data looking for duplicate values in the other fields. - if you can modify the data (add a field to the table), and the amount of data is not too large, or you have very little execution time constraint, you could add a field which would hold the concatenated 3 field values, then use DISTINCT VALUES. - I have not really worked with this aspect of 4D, so I am not sure, but I know that it is possible to build a composite index (2 [or more?] fields to make a single index. Whether you can use the composite index with DISTINCT VALUES I am not sure. I'm sure there are ways in which the approach can be made efficient, but it will definitely depend on the data, and what constitutes a duplicate. On Tue, 31 Oct 2017 14:40:50 -0600, David Witton via 4D_Tech wrote: > I'm looking for a strategy to identify duplicate records in a table - that > is, records for which 3 fields are identical across two or more records - > or in another case, where a single field is not unique. > > Does anyone have a suggestion on how to proceed? > > -- > David Witton > ********************************************************************** > 4D Internet Users Group (4D iNUG) > FAQ: http://lists.4d.com/faqnug.html > Archive: http://lists.4d.com/archives.html > Options: http://lists.4d.com/mailman/options/4d_tech > Unsub: mailto:4d_tech-unsubscr...@lists.4d.com > ********************************************************************** --------------- Gas is for washing parts Alcohol is for drinkin' Nitromethane is for racing ********************************************************************** 4D Internet Users Group (4D iNUG) FAQ: http://lists.4d.com/faqnug.html Archive: http://lists.4d.com/archives.html Options: http://lists.4d.com/mailman/options/4d_tech Unsub: mailto:4d_tech-unsubscr...@lists.4d.com **********************************************************************