Re: [GENERAL] how to remove the duplicate records from a table

2008-10-13 Thread Albe Laurenz
Robert Treat wrote:
 I have a table contains some duplicate records, and this table create
 without oids, for example:
  id | temp_id
 +-
  10 |   1
  10 |   1
  10 |   1
  20 |   4
  20 |   4
  30 |   5
  30 |   5
 I want get the duplicated records removed and only one is reserved, so
 the results is:
 10 1
 20 4
 30 5

 I know create a temp table will resolve this problem, but I don't want
 this way:)

 DELETE FROM t t1 USING t t2
 WHERE t1.id = t2.id AND t1.temp_id = t2.temp_id AND t1.ctid t2.ctid;
 
 note that one problem the delete from approaches have that the temp table 
 solutions dont is that you can end up with a lot of dead tuples if there were 
 a lot of duplicates... so if you can afford the locks, its not a bad idea to 
 do begin; lock table t1 in access exclsuive mode; create temp table x as 
 select ... from t1; truncate t1; insert into t1 select * from x; create 
 unique index ui1 on t1(...); commit;  this way you're now unique table will 
 be nice and compacted, and wont get any more duplicate rows.  

Very true; an alternative way to achieve that is to
VACUUM FULL t
after deleting the duplicate rows.

As for the UNIQUE index, that's of course the right thing to do, but
I wasn't sure if Yi Zhao wanted to change the database design.

At any rate, I had thought that a unique constraint was preferrable to
a unique index because - while doing the same thing - the former will
also show up in pg_catalog.pg_constraint.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how to remove the duplicate records from a table

2008-10-10 Thread Robert Treat
On Tuesday 07 October 2008 05:48:01 Albe Laurenz wrote:
 Yi Zhao wrote:
  I have a table contains some duplicate records, and this table create
  without oids, for example:
   id | temp_id
  +-
   10 |   1
   10 |   1
   10 |   1
   20 |   4
   20 |   4
   30 |   5
   30 |   5
  I want get the duplicated records removed and only one is reserved, so
  the results is:
  10 1
  20 4
  30 5
 
  I know create a temp table will resolve this problem, but I don't want
  this way:)
 
  can someone tell me a simple methold?

 Don't know if you'd call that simple, but if the table is
 called t, you could do

 DELETE FROM t t1 USING t t2
 WHERE t1.id = t2.id AND t1.temp_id = t2.temp_id AND t1.ctid  t2.ctid;


note that one problem the delete from approaches have that the temp table 
solutions dont is that you can end up with a lot of dead tuples if there were 
a lot of duplicates... so if you can afford the locks, its not a bad idea to 
do begin; lock table t1 in access exclsuive mode; create temp table x as 
select ... from t1; truncate t1; insert into t1 select * from x; create 
unique index ui1 on t1(...); commit;  this way you're now unique table will 
be nice and compacted, and wont get any more duplicate rows.  

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] how to remove the duplicate records from a table

2008-10-07 Thread Yi Zhao
I have a table contains some duplicate records, and this table create
without oids, for example:
 id | temp_id 
+-
 10 |   1
 10 |   1
 10 |   1
 20 |   4
 20 |   4
 30 |   5
 30 |   5
I want get the duplicated records removed and only one is reserved, so
the results is:
10 1
20 4
30 5

I know create a temp table will resolve this problem, but I don't want
this way:)

can someone tell me a simple methold?

any help is appreciated,

thanks,



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how to remove the duplicate records from a table

2008-10-07 Thread Peter Childs
2008/10/7 Yi Zhao [EMAIL PROTECTED]:
 I have a table contains some duplicate records, and this table create
 without oids, for example:
  id | temp_id
 +-
  10 |   1
  10 |   1
  10 |   1
  20 |   4
  20 |   4
  30 |   5
  30 |   5
 I want get the duplicated records removed and only one is reserved, so
 the results is:
 10 1
 20 4
 30 5

 I know create a temp table will resolve this problem, but I don't want
 this way:)

 can someone tell me a simple methold?

 any help is appreciated,

 thanks,



I would not say this is easier

1. alter table t add key serial;
2. delete from table where key not in (select max(key) from table
group on id,temp_id);

The truth is this is not any less work then using a temporary table
(whole table still needs rewriting). Which method you select really
depends on why these duplicate records exist in the first place.

Regards

Peter

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] how to remove the duplicate records from a table

2008-10-07 Thread Albe Laurenz
Yi Zhao wrote:
 I have a table contains some duplicate records, and this table create
 without oids, for example:
  id | temp_id 
 +-
  10 |   1
  10 |   1
  10 |   1
  20 |   4
  20 |   4
  30 |   5
  30 |   5
 I want get the duplicated records removed and only one is reserved, so
 the results is:
 10 1
 20 4
 30 5
 
 I know create a temp table will resolve this problem, but I don't want
 this way:)
 
 can someone tell me a simple methold?

Don't know if you'd call that simple, but if the table is
called t, you could do

DELETE FROM t t1 USING t t2
WHERE t1.id = t2.id AND t1.temp_id = t2.temp_id AND t1.ctid  t2.ctid;

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general