[PERFORM] Any way to optimize GROUP BY queries?

2005-12-19 Thread Cristian Prieto








I have the following table:

 

CREATE TABLE mytmp (

    Adv
integer,

    Pub
integer,

    Web
integer,

    Tiempo
timestamp,

    Num
integer,

    Country
varchar(2)

);

 

CREATE INDEX idx_mytmp ON mytmp(adv, pub, web);

 

And with 16M rows this query:

 

SELECT adv, pub, web, country, date_trunc(‘hour’,
tiempo), sum(num)

FROM mytmp GROUP BY adv, pub, web, country, date_trunc(‘hour’,
tiempo)

 

I’ve tried to create index in different columns but it
seems that the group by clause doesn’t use the index in any way.

 

Is around there any stuff to accelerate the group by kind of
clauses?

 

Thanks a lot…

 

 

 








[PERFORM] PgPool and Postgresql sessions...

2005-11-16 Thread Cristian Prieto
Hi, I just have a little question, does PgPool keeps the same session
between different connections? I say it cuz I have a server with the
following specifications:

P4 3.2 ghz
80 gig sata drives x 2
1 gb ram
5 ips
1200 gb bandwidth
100 mbit/s port speed.

I am running a PgSQL 8.1 server with 100 max connection, pgpool with
num_init_children = 25 and max_pool = 4. I do the same queries all the time
(just a bunch of sps, but they are always the same). Using explain analyze I
get the fact that the sps are using a lot of time the first time they
execute (I guess preparing the plan and the sps I wrote en plpgsql) so I
would like to reuse the session the most possible. I need to serve 10M of
connection per day. Is this possible? (the client is a webapplication, I
repeat again, the queries are always the same).

Thanks a lot for your help...


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Performance difference between sql and pgsql function...

2005-11-02 Thread Cristian Prieto
There any performance differences between a SQL function written in SQL
language or PL/psSQL language? For example:

Create or replace function sp_getfreq(
Var1 integer
) returns Boolean as
$$
Declare
Myval Boolean;
Begin
Select var1 in (select var3 from table1) into myval;
Return myval;
End;
$$
Language ‘plpgsql’ stable; 

And with:

Create or replace function sp_getfreq(
Var1 integer
) returns boolean as
$$
Select $1 in (select var3 from table1);
$$
Language ‘sql’ stable;


I know the function is really simple, but in theory which of the three would
run faster?



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Text/Varchar performance...

2005-10-05 Thread Cristian Prieto
Hello, just a little question, It's preferable to use Text Fields or
varchar(255) fields in a table? Are there any performance differences in the
use of any of them?

Thanks a lot for your answer!


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] [GENERAL] Index use in BETWEEN statement...

2005-09-26 Thread Cristian Prieto

mydb=# explain analyze select locid from geoip_block where
'216.230.158.50'::inet between start_block and end_block;
  QUERY PLAN

--
 Seq Scan on geoip_block  (cost=0.00..78033.96 rows=230141 width=8) (actual
time=13015.538..13508.708 rows=1 loops=1)
   Filter: (('216.230.158.50'::inet >= start_block) AND
('216.230.158.50'::inet <= end_block))
 Total runtime: 13508.905 ms
(3 rows)

mydb=# alter table geoip_block add constraint pkey_geoip_block primary key
(start_block, end_block);
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pkey_geoip_block" for table "geoip_block"
ALTER TABLE

mydb=# vacuum analyze geoip_block; 

mydb=# explain analyze select locid from geoip_block where
'216.230.158.50'::inet between start_block and end_block;
  QUERY PLAN

---
 Seq Scan on geoip_block  (cost=0.00..101121.01 rows=308324 width=8) (actual
time=12128.190..12631.550 rows=1 loops=1)
   Filter: (('216.230.158.50'::inet >= start_block) AND
('216.230.158.50'::inet <= end_block))
 Total runtime: 12631.679 ms
(3 rows)

mydb=#


As you see it still using a sequential scan in the table and ignores the
index, any other suggestion?

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Sean Davis
Sent: Lunes, 26 de Septiembre de 2005 10:24 a.m.
To: Cristian Prieto; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Index use in BETWEEN statement...

On 9/26/05 11:26 AM, "Cristian Prieto" <[EMAIL PROTECTED]> wrote:

> 
> Hello pals, I have the following table in Postgresql 8.0.1
> 
> Mydb# \d geoip_block
> Table "public.geoip_block"
>  Column|  Type  | Modifiers
> -++---
> locid   | bigint |
> start_block | inet   |
> end_block   | inet   |
> 
> mydb# explain analyze select locid from geoip_block where
> '216.230.158.50'::inet between start_block and end_block;
> QUERY PLAN
>

> ---
> Seq Scan on geoip_block  (cost=0.00..142772.86 rows=709688 width=8)
(actual
> time=14045.384..14706.927 rows=1 loops=1)
>  Filter: (('216.230.158.50'::inet >= start_block) AND
> ('216.230.158.50'::inet <= end_block))
> Total runtime: 14707.038 ms
> 
> Ok, now I decided to create a index to "speed" a little the query
> 
> Mydb# create index idx_ipblocks on geoip_block(start_block, end_block);
> CREATE INDEX
> 
> clickad=# explain analyze select locid from geoip_block where
> '216.230.158.50'::inet between start_block and end_block;
> QUERY PLAN
>

> --
> Seq Scan on geoip_block  (cost=0.00..78033.96 rows=230141 width=8) (actual
> time=12107.919..12610.199 rows=1 loops=1)
>  Filter: (('216.230.158.50'::inet >= start_block) AND
> ('216.230.158.50'::inet <= end_block))
> Total runtime: 12610.329 ms
> (3 rows)
> 
> I guess the planner is doing a sequential scan in the table, why not use
the
> compound index? Do you have any idea in how to speed up this query?

Did you vacuum analyze the table after creating the index?

Sean


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Index use in BETWEEN statement...

2005-09-23 Thread Cristian Prieto
Hello pals, I have the following table in Postgresql 8.0.1

Mydb# \d geoip_block
Table "public.geoip_block"
   Column|  Type  | Modifiers
-++---
 locid   | bigint |
 start_block | inet   |
 end_block   | inet   |

mydb# explain analyze select locid from geoip_block where
'216.230.158.50'::inet between start_block and end_block;
  QUERY PLAN

---
 Seq Scan on geoip_block  (cost=0.00..142772.86 rows=709688 width=8) (actual
time=14045.384..14706.927 rows=1 loops=1)
   Filter: (('216.230.158.50'::inet >= start_block) AND
('216.230.158.50'::inet <= end_block))
 Total runtime: 14707.038 ms

Ok, now I decided to create a index to "speed" a little the query

Mydb# create index idx_ipblocks on geoip_block(start_block, end_block);
CREATE INDEX

clickad=# explain analyze select locid from geoip_block where
'216.230.158.50'::inet between start_block and end_block;
  QUERY PLAN

--
 Seq Scan on geoip_block  (cost=0.00..78033.96 rows=230141 width=8) (actual
time=12107.919..12610.199 rows=1 loops=1)
   Filter: (('216.230.158.50'::inet >= start_block) AND
('216.230.158.50'::inet <= end_block))
 Total runtime: 12610.329 ms
(3 rows)

I guess the planner is doing a sequential scan in the table, why not use the
compound index? Do you have any idea in how to speed up this query?

Thanks a lot!


---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Any other idea for better performance?

2005-09-13 Thread Cristian Prieto



Hi, I've reading around there about some way to 
help pgsql with the data caching using memcached inside the sps in the database 
(not in the application), does anybody have success with that?
 
Thanks a lot!


Re: [PERFORM] About method of PostgreSQL's Optimizer

2005-09-13 Thread Cristian Prieto



I know you almost had read this, but I think it is 
a good paper to start with... 
 
http://lca2005.linux.org.au/Papers/Neil%20Conway/Inside%20the%20PostgreSQL%20Query%20Optimizer/pg_query_optimizer.pdf
 
Anyway, do you know where could I get more info and 
theory about database optimizer plan? (in general) I like that topic, thanks a 
lot man!

  - Original Message - 
  From: 
  Pryscila B Guttoski 
  To: pgsql-performance@postgresql.org 
  
  Sent: Tuesday, September 13, 2005 4:50 
  PM
  Subject: [PERFORM] About method of 
  PostgreSQL's Optimizer
  Hello all!On my master course, I'm studying the 
  PostgreSQL's optimizer.I don't know if anyone in this list have been 
  participated from the PostgreSQL's Optimizer development, but maybe someone 
  can help me on this question.PostgreSQL generates all possible plans of 
  executing the query (using an almost exhaustive search), then gives a cost to 
  each plan and finally the cheapest one is selected for execution.There are 
  other methods for query optimization, one of them is based on plan 
  transformations (for example, using A-Star algorithm) instead of plan 
  constructions used by PostgreSQL. Does anyone know why this method was 
  choosen? Are there any papers or researches about it?Thank's a 
  lot,Pryscila.