Re: [SQL] identifying duplicates in table with redundancies

2010-09-29 Thread Oliveiros d'Azevedo Cristina
Hallo Andreas, I reduced the problem to the innermost query: 1) SELECT DISTINCT trainer_id, trainer_name FROM student This results in a sequential table scan. Execution time 7500ms. 2) I created an INDEX ON (trainer_id, trainer_name). Then I had an index scan instead, which still cost 7000ms.

Re: [SQL] identifying duplicates in table with redundancies

2010-09-29 Thread Tarlika Elisabeth Schmitz
;>> From: "Tarlika Elisabeth Schmitz" >>> To: >>> Sent: Monday, September 27, 2010 5:54 PM >>> Subject: Re: [SQL] identifying duplicates in table with redundancies >>> >>> >>>> On Fri, 24 Sep 2010 18:12:18 +0100 >>>>

Re: [SQL] identifying duplicates in table with redundancies

2010-09-29 Thread Andreas Schmitz
On 09/28/2010 10:36 PM, Tarlika Elisabeth Schmitz wrote: On Tue, 28 Sep 2010 11:34:31 +0100 "Oliveiros d'Azevedo Cristina" wrote: - Original Message - From: "Tarlika Elisabeth Schmitz" To: Sent: Monday, September 27, 2010 5:54 PM Subject: Re: [SQL] identi

Re: [SQL] identifying duplicates in table with redundancies

2010-09-28 Thread Tarlika Elisabeth Schmitz
On Tue, 28 Sep 2010 11:34:31 +0100 "Oliveiros d'Azevedo Cristina" wrote: >- Original Message - >From: "Tarlika Elisabeth Schmitz" >To: >Sent: Monday, September 27, 2010 5:54 PM >Subject: Re: [SQL] identifying duplicates in table with redundan

Re: [SQL] identifying duplicates in table with redundancies

2010-09-28 Thread Oliveiros d'Azevedo Cristina
answer. There are many people on this list that can help you better Best, Oliveiros - Original Message - From: "Tarlika Elisabeth Schmitz" To: Sent: Monday, September 27, 2010 5:54 PM Subject: Re: [SQL] identifying duplicates in table with redundancies On Fri, 24 Sep 2

Re: [SQL] identifying duplicates in table with redundancies

2010-09-27 Thread Tarlika Elisabeth Schmitz
On Fri, 24 Sep 2010 18:12:18 +0100 Oliver d'Azevedo Christina wrote: >>> SELECT DISTINCT trainer_id,trainer_name >>> FROM ( >>> SELECT trainer_name -- The field you want to test for duplicates >>> FROM ( >>> SELECT DISTINCT "trainer_id","trainer_name" >>> FROM student >>> ) x >>> GROUP BY "trai

Re: [SQL] identifying duplicates in table with redundancies

2010-09-24 Thread Tarlika Elisabeth Schmitz
Hello Oliveiros, On Fri, 24 Sep 2010 18:12:18 +0100 Oliver d'Azevedo Christina wrote: >Hey,Tarlika. >I tried to reproduce your test case through a series of inserts and >It seems that the lower case "d" went unnoticed. I can assure you people up here (in Scotland) get quite upset about it! It

Re: [SQL] identifying duplicates in table with redundancies

2010-09-24 Thread Oliver d'Azevedo Christina
t: Thursday, September 23, 2010 10:39 PM Subject: [SQL] identifying duplicates in table with redundancies [...] I want to check for duplicates: 1) multiples trainer names for same trainer id 2) multiple trainer ids for same trainer name I cobbled together the SQL and it does the job but it see

Re: [SQL] identifying duplicates in table with redundancies

2010-09-24 Thread Tarlika Elisabeth Schmitz
Dear Oliveiros, Thank you for taking the time to help. On Fri, 24 Sep 2010 11:22:21 +0100 "Oliveiros d'Azevedo Cristina" wrote: >- Original Message - >From: "Tarlika Elisabeth Schmitz" >To: >Sent: Thursday, September 23, 2010 10:39 PM >Subjec

Re: [SQL] identifying duplicates in table with redundancies

2010-09-24 Thread Oliveiros d'Azevedo Cristina
er_id on the commented places. But on the example data you provided you don't have the 1) situation, am I right? So it will output an empty list. Best, Oliver - Original Message - From: "Tarlika Elisabeth Schmitz" To: Sent: Thursday, September 23, 2010 10:39 PM Subj

[SQL] identifying duplicates in table with redundancies

2010-09-23 Thread Tarlika Elisabeth Schmitz
I loaded data from a spread into a interim table so I can analyze the quality of the data. The table contains an entry for every student (250K records) and his trainer. Eventually, I want to extract a unique list of trainers from it. But first of all I want to check for duplicates: 1) multiples t