Re [GENERAL] Deleting 173000 records takes forever

2013-02-01 Thread hamann . w

Alexander Farber wrote:
delete from pref_users
where id not in (select distinct id from pref_money);

Hi,

try instead
delete from pref_users
where id in (select id from pref_users except select id from pref_money);

Regards
Wolfgang Hamann


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deleting 173000 records takes forever, blocks async queries for unrelated records

2013-02-01 Thread Albe Laurenz
Alexander Farber wrote:
 in a Facebook game running on
 PostgreSQL 8.4.13 and having so many players:
 
 #  select count(*) from pref_users;
  count
 
  223964
 
 I am trying to get rid of inactive users,
 who just visited the canvas page, but
 never played (I'm sure, Facebook has
 a clever-sounding name for them):
 
 # select count(*) from pref_users
 where id not in (select distinct id from pref_money);
  count
 
  173936
 (1 row)
 
 So I call:
 
 # delete from pref_users
 where id not in (select distinct id from pref_money);
 
 but that query lasts forever and
 what's more troubling me - it blocks
 the async queries of my game daemon
 (the Perl function pg_ready starts returning
 false all the time and my game accumulates
 thousands of yet-to-be-executed SQL queries).
 
 The good news is, that my quad server
 doesn't hang - I just see 1 postmaster
 process at 90-100% CPU but total load is 20%.

I hope it's a backend and not the postmaster.

 Also my game daemon in Perl recovers
 and executes the thousands of queued
 up async queries, when I interrupt the
 above DELETE query with CTRL-C at
 the pgsql prompt - i.e. my game is not buggy.
 
 My question is how handle this?
 
 Why does deleting takes so long,
 is it because of CASCADES?
 
 And why does it make the pg_ready
 calls of my game daemon return false?
 The users I'm deleting aren't active,
 they shouldn't intersect with the
 async queries of my game daemon.
 
 Below are the both SQL tables involved,
 thank you for any insights.

It would help if you send EXPLAIN output for the DELETE statement.

 #  \d pref_money
 Table public.pref_money
  Column | Type  |Modifiers
 +---+-
  id | character varying(32) |
  money  | integer   | not null
  yw | character(7)  | default to_char(now(), 'IYYY-IW'::text)
 Indexes:
 pref_money_money_index btree (money DESC)
 pref_money_yw_index btree (yw)
 Foreign-key constraints:
 pref_money_id_fkey FOREIGN KEY (id) REFERENCES pref_users(id) ON
 DELETE CASCADE

The problem is very likely that you have no index on the id column.

That means that for each id deleted from pref_users, the cascading
delete has to perform a full table scan on pref_money to find the
corresponding rows.  This is very likely the cause of your problem.

Indeed, this table does not have a primary key.  That is usually a bad
idea.  How about PRIMARY KEY (id, yw)?
That should speed up the query.

 #  \d pref_users
Table public.pref_users
Column   |Type | Modifiers
 +-+
  id | character varying(32)   | not null
[...]
 Indexes:
 pref_users_pkey PRIMARY KEY, btree (id)
 Referenced by:
[many tables]

While you are at it, check all the other tables referencing pref_users
and make sure that they have an index on the referencing column.

Yours,
Laurenz Albe


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deleting 173000 records takes forever, blocks async queries for unrelated records

2013-02-01 Thread Michal Politowski
On Fri,  1 Feb 2013 10:38:26 +0100, Alexander Farber wrote:
 Hello,
 
 in a Facebook game running on
 PostgreSQL 8.4.13 and having so many players:
 
 #  select count(*) from pref_users;
  count
 
  223964
 
 I am trying to get rid of inactive users,
 who just visited the canvas page, but
 never played (I'm sure, Facebook has
 a clever-sounding name for them):
 
 # select count(*) from pref_users
 where id not in (select distinct id from pref_money);
  count
 
  173936
 (1 row)
 
 So I call:
 
 # delete from pref_users
 where id not in (select distinct id from pref_money);
 
 but that query lasts forever and
 what's more troubling me - it blocks
 the async queries of my game daemon
 (the Perl function pg_ready starts returning
 false all the time and my game accumulates
 thousands of yet-to-be-executed SQL queries).
 
 The good news is, that my quad server
 doesn't hang - I just see 1 postmaster
 process at 90-100% CPU but total load is 20%.
 
 Also my game daemon in Perl recovers
 and executes the thousands of queued
 up async queries, when I interrupt the
 above DELETE query with CTRL-C at
 the pgsql prompt - i.e. my game is not buggy.
 
 My question is how handle this?
 
 Why does deleting takes so long,
 is it because of CASCADES?
 
 And why does it make the pg_ready
 calls of my game daemon return false?
 The users I'm deleting aren't active,
 they shouldn't intersect with the
 async queries of my game daemon.
 
 Below are the both SQL tables involved,
 thank you for any insights.
 
 Regards
 Alex
 
 #  \d pref_money
 Table public.pref_money
  Column | Type  |Modifiers
 +---+-
  id | character varying(32) |
  money  | integer   | not null
  yw | character(7)  | default to_char(now(), 'IYYY-IW'::text)
 Indexes:
 pref_money_money_index btree (money DESC)
 pref_money_yw_index btree (yw)
 Foreign-key constraints:
 pref_money_id_fkey FOREIGN KEY (id) REFERENCES pref_users(id) ON
 DELETE CASCADE

I believe an index on pref_money.id could help.
Without it Postgres is, if I understand correctly, making 173936 table scans on 
pref_money
to try (and fail) to find for each deleted row the referencing row in that 
table.
 
 #  \d pref_users
[...]

-- 
Michał Politowski


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deleting 173000 records takes forever, blocks async queries for unrelated records

2013-02-01 Thread Bèrto ëd Sèra
Hi,

What Albe said about PKs. I'm also not very fond of people using text
in keys (even if it's a varchar or a char field). Test the same thing
with a numeric key and you are likely to see a difference.

Apart from making sure your design is ok, you might want to keep this
stuff well in the background, firing it as a frequent (and small) job.
Sort of your own private vacuum job.

Something like

delete from pref_users
where
   id not in ( select distinct id
  from pref_money )
limit 100;

Where 100 may be any number of records you find to be compatible with
a smooth performance. Keep calling this stuff at a suitable interval
(X secs in between each call), and it will silently do the cleaning
without creating giant transactions.

You definitely want to make a good design BEFORE doing this, though.

Cheers
Bèrto

On 1 February 2013 09:38, Alexander Farber alexander.far...@gmail.com wrote:
 Hello,

 in a Facebook game running on
 PostgreSQL 8.4.13 and having so many players:

 #  select count(*) from pref_users;
  count
 
  223964

 I am trying to get rid of inactive users,
 who just visited the canvas page, but
 never played (I'm sure, Facebook has
 a clever-sounding name for them):

 # select count(*) from pref_users
 where id not in (select distinct id from pref_money);
  count
 
  173936
 (1 row)

 So I call:

 # delete from pref_users
 where id not in (select distinct id from pref_money);

 but that query lasts forever and
 what's more troubling me - it blocks
 the async queries of my game daemon
 (the Perl function pg_ready starts returning
 false all the time and my game accumulates
 thousands of yet-to-be-executed SQL queries).

 The good news is, that my quad server
 doesn't hang - I just see 1 postmaster
 process at 90-100% CPU but total load is 20%.

 Also my game daemon in Perl recovers
 and executes the thousands of queued
 up async queries, when I interrupt the
 above DELETE query with CTRL-C at
 the pgsql prompt - i.e. my game is not buggy.

 My question is how handle this?

 Why does deleting takes so long,
 is it because of CASCADES?

 And why does it make the pg_ready
 calls of my game daemon return false?
 The users I'm deleting aren't active,
 they shouldn't intersect with the
 async queries of my game daemon.

 Below are the both SQL tables involved,
 thank you for any insights.

 Regards
 Alex

 #  \d pref_money
 Table public.pref_money
  Column | Type  |Modifiers
 +---+-
  id | character varying(32) |
  money  | integer   | not null
  yw | character(7)  | default to_char(now(), 'IYYY-IW'::text)
 Indexes:
 pref_money_money_index btree (money DESC)
 pref_money_yw_index btree (yw)
 Foreign-key constraints:
 pref_money_id_fkey FOREIGN KEY (id) REFERENCES pref_users(id) ON
 DELETE CASCADE

 #  \d pref_users
Table public.pref_users
Column   |Type | Modifiers
 +-+
  id | character varying(32)   | not null
  first_name | character varying(64)   |
  last_name  | character varying(64)   |
  female | boolean |
  avatar | character varying(128)  |
  city   | character varying(64)   |
  login  | timestamp without time zone | default now()
  last_ip| inet|
  logout | timestamp without time zone |
  vip| timestamp without time zone |
  mail   | character varying(254)  |
  medals | integer | not null default 0
 Indexes:
 pref_users_pkey PRIMARY KEY, btree (id)
 Referenced by:
 TABLE pref_cards CONSTRAINT pref_cards_id_fkey FOREIGN KEY
 (id) REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_catch CONSTRAINT pref_catch_id_fkey FOREIGN KEY
 (id) REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_chat CONSTRAINT pref_chat_id_fkey FOREIGN KEY (id)
 REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_discuss CONSTRAINT pref_discuss_id_fkey FOREIGN KEY
 (id) REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_game CONSTRAINT pref_game_id_fkey FOREIGN KEY (id)
 REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_luck CONSTRAINT pref_luck_id_fkey FOREIGN KEY (id)
 REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_match CONSTRAINT pref_match_id_fkey FOREIGN KEY
 (id) REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_misere CONSTRAINT pref_misere_id_fkey FOREIGN KEY
 (id) REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_money CONSTRAINT pref_money_id_fkey FOREIGN KEY
 (id) REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_pass CONSTRAINT pref_pass_id_fkey FOREIGN KEY (id)
 REFERENCES pref_users(id) ON DELETE CASCADE
 TABLE pref_payment CONSTRAINT pref_payment_id_fkey FOREIGN KEY
 

Re: [GENERAL] Deleting 173000 records takes forever, blocks async queries for unrelated records

2013-02-01 Thread Bèrto ëd Sèra
righto. You need a CTE to do that

create table deletable (
  id bigint not null primary key);

create table condition ( id bigint not null primary key);

insert into deletable select generate_series(1,500);

insert into condition select generate_series(1,50);


WITH target AS (
select id
from deletable d
where
  d.id not in (select id from condition)
limit 10 )
delete
from deletable
where
  id in (select id from target);

This will open a hole from 51 to 60, next time from 61 to 70 etc...

Cheers
Bèrto

On 1 February 2013 11:30, Alexander Farber alexander.far...@gmail.com wrote:
 Thanks, I will add an index(id, yw) and check.

 The limit suggestion (yes, I already have a vacuum
 cronjob to purge spam-users from drupal_users etc.)
 doesn't work:

 # delete from pref_users
 where id not in (select distinct id from pref_money) limit 10;
 ERROR:  syntax error at or near limit
 LINE 2: ...ere id not in (select distinct id from pref_money) limit 10;


 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



-- 
==
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deleting 173000 records takes forever, blocks async queries for unrelated records

2013-02-01 Thread Carlo Stonebanks
# delete from pref_users
where id not in (select distinct id from pref_money) limit 10;
ERROR:  syntax error at or near limit
LINE 2: ...ere id not in (select distinct id from pref_money) limit 10;

Or this?

DELETE FROM pref_users
WHERE id IN (
SELECT id
FROM pref_users
WHERE id NOT IN (
SELECT DISTINCT id
FROM pref_money
)
LIMIT 10
);

Saw elsewhere another suggestion with EXCEPT, which is also sexy.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general