On Montag, 14. April 2008 Tom Lane wrote:
> What PG version is this?

I've looked a little more into history, and PostgreSQL was 8.1.5 
originally from CD, then 8.1.9 and now 8.1.11.

I'm currently investigating a way to delete duplicates from the table. 
Is there any chance to do that? A 
SELECT * from bayes_token;
shows no dups, as there's a primary index... at least it works here.

This is the bayes_token table and its index:
CREATE TABLE bayes_token (
    id integer DEFAULT 0 NOT NULL,
    token bytea DEFAULT ''::bytea NOT NULL,
    spam_count integer DEFAULT 0 NOT NULL,
    ham_count integer DEFAULT 0 NOT NULL,
    atime integer DEFAULT 0 NOT NULL
);
ALTER TABLE ONLY bayes_token
    ADD CONSTRAINT bayes_token_pkey PRIMARY KEY (id, token);

Could I create a new db without indices, restore the data there, and 
then run a "delete from.. where duplicates"? Sometimes there's 3x the 
same content in the primary index, but I'd need to delete all except 
the one with the highest spam_count.

I just found I even have a duplicate in bayes_vars:

COPY bayes_vars (id, username, spam_count, ham_count, token_count, 
last_expire, last_atime_delta, last_expire_reduce, oldest_token_age, 
newest_token_age) FROM stdin;
1       vscan   194393  517531  2602114 1206030039      2764800 15304   
1203265204      1206057801
1       vscan   194398  517535  2602799 1206030039      2764800 15304   
1203265204      1206064729
4       vscan   18305   25403   2042983 1208143427      1382400 13268   
1206501543      1208216540
\.

Definition:
CREATE TABLE bayes_vars (
    id serial NOT NULL,
    username character varying(200) DEFAULT ''::character varying NOT 
NULL,
    spam_count integer DEFAULT 0 NOT NULL,
    ham_count integer DEFAULT 0 NOT NULL,
    token_count integer DEFAULT 0 NOT NULL,
    last_expire integer DEFAULT 0 NOT NULL,
    last_atime_delta integer DEFAULT 0 NOT NULL,
    last_expire_reduce integer DEFAULT 0 NOT NULL,
    oldest_token_age integer DEFAULT 2147483647 NOT NULL,
    newest_token_age integer DEFAULT 0 NOT NULL
);
ALTER TABLE ONLY bayes_vars
    ADD CONSTRAINT bayes_vars_pkey PRIMARY KEY (id);

But here, I could see the dups with SELECT, and even delete one record:

# select * from bayes_vars;
 id | username | spam_count | ham_count | token_count | last_expire | 
last_atime_delta | last_expire_reduce | oldest_token_age | 
newest_token_age
----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------
  1 | vscan    |     194393 |    517531 |     2602114 |  1206030039 |          
2764800 |              15304 |       1203265204 |       1206057801
  1 | vscan    |     194398 |    517535 |     2602799 |  1206030039 |          
2764800 |              15304 |       1203265204 |       1206064729
  4 | vscan    |      18375 |     25828 |     2050196 |  1208229525 |          
1382400 |              13268 |       1206501543 |       1208240610
(3 Zeilen)

bayes_pg_v1=# delete from bayes_vars where spam_count =194393;
DELETE 1
bayes_pg_v1=# select * from bayes_vars;
 id | username | spam_count | ham_count | token_count | last_expire | 
last_atime_delta | last_expire_reduce | oldest_token_age | 
newest_token_age
----+----------+------------+-----------+-------------+-------------+------------------+--------------------+------------------+------------------
  1 | vscan    |     194398 |    517535 |     2602799 |  1206030039 |          
2764800 |              15304 |       1203265204 |       1206064729
  4 | vscan    |      18375 |     25829 |     2050215 |  1208229525 |          
1382400 |              13268 |       1206501543 |       1208240637
(2 Zeilen)

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0676/846 914 666                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: www.keyserver.net                   Key-ID: 1C1209B4

Attachment: signature.asc
Description: This is a digitally signed message part.

Reply via email to