Re: [PERFORM] Slow Delete : Seq scan instead of index scan

2012-10-16 Thread Sylvain CAILLET
Hi Craig, 

Here are the outputs : 

flows=# explain analyze delete from agg_t377_incoming_a40_dst_net_f5
        where start_date < 1346487911000;
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on agg_t377_incoming_a40_dst_net_f5  (cost=0.00..34448.96 rows=657622 width=6) (actual time=3429.058..7135.901 rows=143 loops=1)
   Filter: (start_date < 1346487911000::bigint)
 Total runtime: 7136.191 ms
(3 rows)

flows=# \d agg_t377_incoming_a40_dst_net_f5
Table "public.agg_t377_incoming_a40_dst_net_f5"
   Column    |  Type  | Modifiers
-------------+--------+-----------
 end_date    | bigint |
 dst_network | inet   |
 total_pkts  | bigint |
 total_bytes | bigint |
 start_date  | bigint |
 total_flows | bigint |
Indexes:
    "agg_t377_incoming_a40_dst_net_f5_end_date" btree (end_date)
    "agg_t377_incoming_a40_dst_net_f5_start_date" btree (start_date)

Thanks for your help, 

Sylvain 
----- Original Message -----

> On 10/16/2012 03:50 PM, Sylvain CAILLET wrote:
> > Hi to all,
> >
> > I'm having trouble with some DELETE statements. My db contains a
> > little more than 100 000 tables and runs on a dedicated server
> > (Debian 6 - dual quad-core - 16 GB RAM - SAS disks in RAID 0). Most
> > of the tables contain between 2 and 3 million rows, and no foreign
> > keys exist between them. Each is indexed (btree) on the start_date
> > and end_date fields (bigint). The PostgreSQL server has been tuned
> > (I can give the modified values if needed).
> >
> > I perform recurrent DELETEs on a subset of the tables (~1900 tables)
> > and each time I delete a few rows (between 0 and 1200). It usually
> > takes between 10 s and more than 2 minutes, which seems to me a huge
> > amount of time! An EXPLAIN ANALYZE on a DELETE shows me that the
> > planner uses a Seq Scan instead of an Index Scan.

> Can you post that (or paste to explain.depesz.com and link to it here)
> along with a "\d tablename" from psql?

> --
> Craig Ringer


Re: [PERFORM] Slow Delete : Seq scan instead of index scan

2012-10-16 Thread Sylvain CAILLET
Hi Sékine, 

You're right: my question is why the planner doesn't use the index! My DELETE
statements have a WHERE clause like start_date<134648610. They are executed
to delete rows that are too old.
My PostgreSQL version is 8.4. Below is an example of a table (they all have the
same structure):

CREATE TABLE agg_t100_outgoing_a39_src_net_f5 
( 
total_pkts bigint, 
end_date bigint, 
src_network inet, 
start_date bigint, 
total_flows bigint, 
total_bytes bigint 
) 
WITH ( 
OIDS=FALSE 
); 

CREATE INDEX agg_t100_outgoing_a39_src_net_f5_end_date 
ON agg_t100_outgoing_a39_src_net_f5 
USING btree 
(end_date); 

CREATE INDEX agg_t100_outgoing_a39_src_net_f5_start_date 
ON agg_t100_outgoing_a39_src_net_f5 
USING btree 
(start_date); 

I have investigated the pg_stat_all_tables view and it seems autovacuum /
autoanalyze don't do their job: many tables have no last_autovacuum /
last_autoanalyze dates! So the planner doesn't have fresh stats to decide with.
Don't you think that could be a good reason for the slow DELETEs? In that case,
the trouble would come from the autovacuum configuration.
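
For reference, this is roughly the query I used to spot tables that have never
been auto-analyzed (a sketch; the restriction to the public schema is an
assumption):

SELECT schemaname, relname, last_autovacuum, last_autoanalyze
  FROM pg_stat_all_tables
 WHERE schemaname = 'public'
   AND last_autoanalyze IS NULL
 ORDER BY relname;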

Regards, 

Sylvain 
----- Original Message -----

> Hi Sylvain,

> This might sound like a nasty question, and gurus will correct me if
> I'm wrong, but the first thing to investigate is why the index is not
> used:
> - You have 2-3 million rows per table, so the planner should use the
> index. A seq scan is preferred only for small tables.
> - Maybe the WHERE clause of your DELETE statement doesn't make use of
> your start and end date columns? If so, in which order?

> Please provide your Pg version and the table setup with the index.

> Regards,

> Sekine

> 2012/10/16 Sylvain CAILLET < scail...@alaloop.com >

> > Hi to all,
> 

> > I'm having trouble with some DELETE statements. My db contains a
> > little more than 100 000 tables and runs on a dedicated server
> > (Debian 6 - dual quad-core - 16 GB RAM - SAS disks in RAID 0). Most
> > of the tables contain between 2 and 3 million rows, and no foreign
> > keys exist between them. Each is indexed (btree) on the start_date
> > and end_date fields (bigint). The PostgreSQL server has been tuned
> > (I can give the modified values if needed).
> 

> > I perform recurrent DELETEs on a subset of the tables (~1900 tables)
> > and each time I delete a few rows (between 0 and 1200). It usually
> > takes between 10 s and more than 2 minutes, which seems to me a huge
> > amount of time! An EXPLAIN ANALYZE on a DELETE shows me that the
> > planner uses a Seq Scan instead of an Index Scan. Autovacuum is on
> > and I expect the db stats to be updated in real time (the pg_stats
> > file is stored on a /dev/shm RAM disk for quick access).
> 

> > Do you have any idea about this trouble?
> 

> > Sylvain Caillet
> 
> > Bureau : + 33 5 59 41 51 10
> 
> > scail...@alaloop.com
> 

> > ALALOOP S.A.S. - Technopole Izarbel - 64210 Bidart
> 
> > www.alaloop.com
> 


Re: [PERFORM] Create tables performance

2012-07-09 Thread Sylvain CAILLET
Yes, you're right! The process checks whether all these tables exist before
creating them, so it might be the SELECTs that take the time. To check
existence, I use the following query:
select * from pg_tables where tablename='the_table';
Maybe it's not the best way. And I launch one query per table! Not good at all.
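
One round trip that fetches every existing table name would avoid the per-table
queries; the process can then check membership in memory. A minimal sketch
(assuming the tables all live in the public schema):

SELECT tablename
  FROM pg_tables
 WHERE schemaname = 'public';

Or, limited to the names the process is about to create (the table names here
are just the examples from this thread):

SELECT tablename
  FROM pg_tables
 WHERE schemaname = 'public'
   AND tablename IN ('agg_t100_outgoing_a39_src_net_f5',
                     'agg_t377_incoming_a40_dst_net_f5');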

Thank you all, I will optimize this. 

Sylvain 

----- Original Message -----

> On Sun, Jul 8, 2012 at 11:49 PM, Sylvain CAILLET
>  wrote:
> > Hi,
> >
> > Thank you all for your help.
> >
> > @Jeff: my daemon creates these tables at start time, so it doesn't
> > do anything else at the same time. The CPU is loaded between 20%
> > and 25%.

> How does it decide which tables to create? Is it querying the
> existing tables to figure out what new ones to make? Is the rest of
> the time going to IO wait?

> Cheers,

> Jeff


Re: [PERFORM] Create tables performance

2012-07-08 Thread Sylvain CAILLET
Hi, 


Thank you all for your help. 


@Jeff: my daemon creates these tables at start time, so it doesn't do anything
else at the same time. The CPU is loaded between 20% and 25%.
@Richard: Sure, the number of tables in the DB is quite big, and sure, most of
them have the same structure, but it's very hard to change that now, so I have
to deal with it for a while!
@Craig: I can't run any of those queries. For example, "CLUSTER
pg_class_oid_index ON pg_catalog.pg_class;" throws an "ERROR: "pg_class" is a
system catalog" error. But using VACUUM FULL it's done in less than a second.
Autovacuum is on but not tuned in the postgresql configuration file.
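
Since autovacuum isn't tuned at all, a starting point could be something like
the following in postgresql.conf. This is only a sketch: the values are
assumptions meant to help autovacuum keep up with ~100 000 tables, not settings
taken from this thread.

autovacuum = on
autovacuum_max_workers = 5              # 8.3 default is 3
autovacuum_naptime = 10s                # visit each database more often
autovacuum_vacuum_scale_factor = 0.05   # default 0.2: vacuum sooner
autovacuum_analyze_scale_factor = 0.05  # default 0.1: analyze sooner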


Sylvain Caillet 
----- Original Message -----



On 07/06/2012 11:15 PM, Sylvain CAILLET wrote: 




Hi to all, 


I run PostgreSQL 8.3.9 on a dedicated server running Debian 5.0.4, a strong
dual quad-core with 16 GB of RAM. My biggest db contains at least 100 000
tables. Last time I started the Java process I use to make some changes on it,
it created 170 new tables and it took one full minute. That is a very long time
for such a process on such a server!


If you create and drop a lot of tables, you need to make sure you're vacuuming
the pg_catalog tables frequently. Newer versions mostly take care of this for
you, but on 8.3 you'll at minimum have to turn autovacuum right up.

See what happens if you run in psql, as a Pg superuser (usually the "postgres" 
account): 

CLUSTER pg_class_oid_index ON pg_catalog.pg_class; 
CLUSTER pg_type_oid_index ON pg_catalog.pg_type; 
CLUSTER pg_attribute_relid_attnam_index ON pg_catalog.pg_attribute; 
CLUSTER pg_index_indexrelid_index ON pg_catalog.pg_index; 

I'm guessing you have severe table bloat in your catalogs, in which case this
may help. I use CLUSTER instead of VACUUM FULL because on old versions like
8.3 it'll run faster and sort the indexes for you too.
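
A rough way to see whether those catalogs really are bloated (a sketch, not
part of the original advice) is to check their on-disk size before and after:

SELECT c.relname, pg_size_pretty(pg_relation_size(c.oid)) AS size
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE n.nspname = 'pg_catalog'
   AND c.relname IN ('pg_class', 'pg_type', 'pg_attribute', 'pg_index');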





Do you think there is some configuration tuning to do to improve the
performance of table creation?
Or do I have to use tablespaces because 100 000 files in a single folder is
too many for the OS?



That won't be a problem unless your OS and file system are truly crap. 

-- 
Craig Ringer 






[PERFORM] Create tables performance

2012-07-06 Thread Sylvain CAILLET

Hi to all, 


I run PostgreSQL 8.3.9 on a dedicated server running Debian 5.0.4, a strong
dual quad-core with 16 GB of RAM. My biggest db contains at least 100 000
tables. Last time I started the Java process I use to make some changes on it,
it created 170 new tables and it took one full minute. That is a very long time
for such a process on such a server!
Do you think there is some configuration tuning to do to improve the
performance of table creation?
Or do I have to use tablespaces because 100 000 files in a single folder is
too many for the OS?
It's possible to migrate the DB to version 9.1. Do you think that would solve
the trouble?


Thank you all for your advice,


Best regards 


Sylvain