Re: Duplicated records

Peter Brawley Tue, 05 Apr 2005 10:30:52 -0700

Scott,

To find dupe batch values

SELECT batch, COUNT(batch) AS Cnt
FROM qa
GROUP BY batch
HAVING Cnt > 1;

Peter Brawley
http://www.artfulsoftware.com

-----

Scott Hamm wrote:

Sorry for the confusion.  I don't mean a duplicated records, but to
find duplicated Batch where two different operators worked on a single
batch (one started off, then another one to finish the batch) and a
single reviewer to review a batch. I need a list that lists duplicated
Batches with different operators that worked on that batch.


For example: (for clarification)
SELECT ID, Batch, KeyDate, OperatorID, QAID, NTID 
FROM QA 
WHERE Batch='439584414';

ID         Batch              KeyDate                      OperatorID
QAID      NTID
90577    439584414	2004-10-03 00:00:00	2661	      5334	JulieAnt	
90575    439584414	2004-10-03 00:00:00	5657	      5334	JulieAnt	


On Apr 5, 2005 10:54 AM, Rhino <[EMAIL PROTECTED]> wrote:

----- Original Message -----
From: "Scott Hamm" <[EMAIL PROTECTED]>
To: "Mysql" <[email protected]>
Sent: Tuesday, April 05, 2005 10:05 AM
Subject: Duplicated records

Here is my novice question:
I need to find duplicates in "Batch" category when I issued this query:

SELECT * FROM QA WHERE Batch=439584414;

Result:
90577
1 2661 5334 JulieAnt 25 5 5 439584414 2004-10-03 00:00:00 2004-10-03
00:00:00 0 90575
1 5657 5334 JulieAnt 25 5 5 439584414 2004-10-03 00:00:00 2004-10-03

00:00:00 0

How do I issue a query that finds duplicated Batch number?

First of all, I think you made a mistake when putting this data into your
note, a mistake which is compounded by the wrapping that the email program
did. I *think* you meant to display two rows with the '90577' value at the
end of the second row, not before the first row. This makes the example a
bit confusing and hard to follow.

The two rows you provide in your example are not duplicates of each other.
Even assuming that the '90577' actually belongs at the end of the second row
rather than the start of the first row, the two rows are different: the
first row ends in '90575', not '90577' and the second value in each row is
also different: '2661' and '5657'. Therefore, you're not really trying to
find duplicate records because a duplicate row would be one that is
identical in every column, not just identical in several columns. I hope
that doesn't sound like I am splitting hairs; maybe I am. ;-)

Anyway, the standard method for finding duplicates of rows goes like this
and can be adapted for your situation. Let's say that you have a table
called Foo with columns foo1, foo2, and foo3 and you want to find all of the
duplicate rows in the table, i.e. any row whose foo1, foo2, and foo3 value
is identical to the foo1, foo2, and foo3 value of another table. You would
write this query:

    select foo1, foo2, foo3, count(*) as num
    from Foo1
    group by foo1, foo2, foo3
    having num > 1
    order by num;

Your result might look like this:

    foo1    foo2    foo3    num
    Daffy    D        Duck    1
    Bugs    B        Bunny    1
    Elmer    A        Fudd    3

This result means that Daffy D Duck and Bugs B Bunny each occur once in the
table but Elmer A Fudd occurs three times.

In the case you mentioned in your email, where the rows aren't complete
duplicates of one another, just write the query so that you omit the columns
which are allowed to be different. In your case, that would appear to be all
but the second and last columns.

However, I think if you get to the point of having duplicate records in a
table when you don't want them, finding them is like shutting the barn door
after the horse is already gone. I think you need to reconsider your design,
specifically your primary key, and change that primary key to *PREVENT* the
duplicate keys in the first place.

For instance, in the case of the Foo1 table, I can prevent duplicate rows by
choosing a good primary key; in this case I would choose the combination of
all three columns in the table; that would ensure that I never stored more
than 1 person named Elmer A Fudd. [Actually, that's not a great example! It
might be perfectly valid to have two people named Elmer A Fudd - or John A
Smith - so I probably need to add additional columns to my table to ensure
that the rows are unique; something like Social Security Number which is
(supposed to be) unique would be ideal for this purpose.]

Rhino

--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005

No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.9.3 - Release Date: 4/5/2005

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Re: Duplicated records

Reply via email to