Re: [PERFORM] select on 1milion register = 6s

2007-07-28 Thread Scott Marlowe
On 7/28/07, Bruno Rodrigues Siqueira <[EMAIL PROTECTED]> wrote:
>
> Ok.
> QUERY PLAN
> Sort  (cost=11449.37..11449.40 rows=119 width=8) (actual
> time=14431.537..14431.538 rows=2 loops=1)
>   Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
>   ->  HashAggregate  (cost=11448.79..11448.96 rows=119 width=8) (actual
> time=14431.521..14431.523 rows=2 loops=1)
> ->  Index Scan using detalhamento_bas_idx3003 on detalhamento_bas
> (cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155
> rows=2335819 loops=1)

See the row mismatch there?  It expects about 11k rows, gets back 2.3
million.  That's a pretty big misestimate.  Have you run analyze
recently on this table?

Is there a reason you're doing this:


to_char(data_encerramento, 'yyyy-mm')
between '2006-12' and '2007-01'

when you should be able to just do:

data_encerramento between   '2006-12-01' and  '2007-01-31'
?  That should be able to use good estimates from analyze.  My guess
is the planner is making a bad estimate because of the way you're
handling the dates.
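
Roughly, that rewrite would look like the sketch below.  The plain index
on data_encerramento is an assumption (the existing expression index on
to_char() can't be used for a direct range predicate), and the index name
is made up:

CREATE INDEX detalhamento_bas_data_enc_idx
    ON detalhamento_bas (data_encerramento);
ANALYZE detalhamento_bas;

EXPLAIN ANALYZE
SELECT to_char(data_encerramento, 'mm/yyyy') AS opcoes_mes,
       to_char(data_encerramento, 'yyyy-mm') AS ordem
FROM detalhamento_bas
WHERE data_encerramento BETWEEN '2006-12-01' AND '2007-01-31'
GROUP BY opcoes_mes, ordem
ORDER BY ordem DESC;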

> SERVER
>   DELL PowerEdge 2950
>   XEON Quad-Core 3.0Ghz
>   4Gb RAM
>   Linux CentOS 5.0 64-bits
>  Postgres 8.1.4

>  Postgresql.conf
> # - Memory -
>
> shared_buffers = 50000  # min 16 or max_connections*2, 8KB each

400 Meg is kind of low for a server with 4 GB of RAM.  25% is more
reasonable (i.e. 125000 buffers).

> work_mem = 3145728  # min 64, size in KB
> maintenance_work_mem = 4194304  # min 1024, size in KB

Whoa nellie!  That's ~3 GB of work_mem and 4 GB of maintenance_work_mem.
On a machine with 4 GB of RAM, that's a recipe for disaster.

Something more reasonable would be 128000 (~125 MB) for each; since
you've limited your machine to 10 connections you should be OK.
Setting work_mem too high can run your machine out of memory and into
a swap storm that will kill performance.
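
As a rough sketch, the values being suggested here would look like this in
postgresql.conf (8.1 units: shared_buffers in 8 KB pages, the others in KB;
starting points, not gospel):

shared_buffers = 125000         # ~1 GB, about 25% of the 4 GB of RAM
work_mem = 128000               # ~125 MB per sort/hash, per backend
maintenance_work_mem = 128000   # ~125 MB for VACUUM, CREATE INDEX, etc.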

> fsync = off # turns forced synchronization on or
> off

So, the data in this database isn't important?  Because that's what
fsync = off says to me.  Better to buy yourself a nice battery-backed
caching RAID controller than to turn off fsync.

> effective_cache_size = 41943040 # typically 8KB each

And your machine has 343,604,830,208 bytes of memory available for
caching?  Seems a little high to me.

> random_page_cost = 1# units are one sequential page
> fetch

Seldom if ever is it a good idea to bonk the planner on the head with
random_page_cost = 1.  Setting it to 1.2 or 1.4 is low enough, but 1.4
to 2.0 is more realistic.
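
A sketch of what that might look like on this 4 GB box (the
effective_cache_size figure is my guess at the OS cache, in 8 KB pages,
not a number from this thread):

effective_cache_size = 375000   # ~3 GB of OS cache assumed available
random_page_cost = 2.0          # 1.4-2.0 is more realistic than 1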

> stats_start_collector = off
> #stats_command_string = off
> #stats_block_level = off
> #stats_row_level = off
> #stats_reset_on_server_start = off

I think you need stats_row_level on for autovacuum, but I'm not 100% sure.
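
For what it's worth, the 8.1 docs do say autovacuum needs both of these
turned on:

stats_start_collector = on
stats_row_level = on            # autovacuum in 8.1 requires both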

Let us know what happens after fixing these settings, running analyze,
and re-running explain analyze, with possible changes to the query.



RES: [PERFORM] select on 1milion register = 6s

2007-07-28 Thread Bruno Rodrigues Siqueira
Yes, I do.





-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On behalf of Hervé Piedvache
Sent: Saturday, July 28, 2007 4:57 PM
To: pgsql-performance@postgresql.org
Cc: Bruno Rodrigues Siqueira
Subject: Re: [PERFORM] select on 1milion register = 6s

Have you analyzed your table before running this?

On Saturday, July 28, 2007, Bruno Rodrigues Siqueira wrote:
> Friends,
>
> Who can help me? My SELECT on a table with 1 million rows,
> using an expression index, takes 6 seconds.
>
> Please, I don't know how to make it better.
>
> Thanks



-- 
Hervé Piedvache





RES: [PERFORM] select on 1milion register = 6s

2007-07-28 Thread Bruno Rodrigues Siqueira
Yes.



Look at this... and please, tell me if you can help me...


Thanks

Query

EXPLAIN
ANALYZE
select
   to_char(data_encerramento, 'mm/yyyy') as opcoes_mes,
   to_char(data_encerramento, 'yyyy-mm') as ordem
from detalhamento_bas
where

to_char(data_encerramento, 'yyyy-mm')
between '2006-12' and '2007-01'

GROUP BY opcoes_mes, ordem
ORDER BY ordem DESC





QUERY RESULT

QUERY PLAN
Sort  (cost=11449.37..11449.40 rows=119 width=8) (actual
time=14431.537..14431.538 rows=2 loops=1)
  Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
  ->  HashAggregate  (cost=11448.79..11448.96 rows=119 width=8) (actual
time=14431.521..14431.523 rows=2 loops=1)
->  Index Scan using detalhamento_bas_idx3003 on detalhamento_bas
(cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155
rows=2335819 loops=1)
  Index Cond: ((to_char(data_encerramento, 'yyyy-mm'::text) >=
'2006-12'::text) AND (to_char(data_encerramento, 'yyyy-mm'::text) <=
'2007-01'::text))
Total runtime: 14431.605 ms 








SERVER
  DELL PowerEdge 2950
  XEON Quad-Core 3.0Ghz
  4Gb RAM
  Linux CentOS 5.0 64-bits





 Postgres 8.1.4




 Postgresql.conf


# -
# PostgreSQL configuration file
# -
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload". Some settings, such as listen_addresses, require
# a postmaster shutdown and restart to take effect.


#---
# FILE LOCATIONS
#---

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'   # use data in another directory
#hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file
#ident_file = 'ConfigDir/pg_ident.conf' # IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
#external_pid_file = '(none)'   # write an extra pid file


#---
# CONNECTIONS AND AUTHENTICATION
#---

# - Connection Settings -

listen_addresses = '*'  # what IP address(es) to listen on; 
# comma-separated list of addresses;
# defaults to 'localhost', '*' = all
#port = 5432
max_connections = 10
# note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).  You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#bonjour_name = ''  # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60# 1-600, in seconds
#ssl = off
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''
#krb_srvname = 'postgres'
#krb_server_hostname = ''   # empty string matches any keytab
entry
#krb_caseins_users = off

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0# TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0# TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0   # TCP_KEEPCNT;
# 0 selects the system default


#---
# RESOURCE USAGE (except WAL)
#---

# - Memory -

shared_buffers = 50000  # min 16 or max_connections*2, 8KB each
temp_buffers = 1000 # min 1

RES: RES: [PERFORM] select on 1milion register = 6s

2007-07-28 Thread Bruno Rodrigues Siqueira
Data_encerramento is a timestamp column


I will try your tip.
Thanks


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On behalf of Ragnar
Sent: Saturday, July 28, 2007 7:36 PM
To: Bruno Rodrigues Siqueira
Cc: pgsql-performance@postgresql.org
Subject: Re: RES: [PERFORM] select on 1milion register = 6s

On Sat, 2007-07-28 at 17:12 -0300, Bruno Rodrigues Siqueira wrote:

> where
> 
> to_char(data_encerramento, 'yyyy-mm')
> between   '2006-12' and  '2007-01'

assuming data_encerramento is a date column, try:
WHERE data_encerramento between   '2006-12-01' and  '2007-01-31'
  
gnari







Re: RES: [PERFORM] select on 1milion register = 6s

2007-07-28 Thread Ragnar
On Sat, 2007-07-28 at 17:12 -0300, Bruno Rodrigues Siqueira wrote:

> where
> 
> to_char(data_encerramento, 'yyyy-mm')
> between   '2006-12' and  '2007-01'

assuming data_encerramento is a date column, try:
WHERE data_encerramento between   '2006-12-01' and  '2007-01-31'
  
gnari





Re: [PERFORM] Slow query with backwards index scan

2007-07-28 Thread Tilmann Singer
* Craig James <[EMAIL PROTECTED]> [20070728 22:00]:
> >>SELECT * FROM (
> >> (SELECT * FROM large_table lt
> >> WHERE lt.user_id = 12345
> >> ORDER BY created_at DESC LIMIT 10) AS q1
> >> UNION
> >> (SELECT * FROM large_table lt
> >> WHERE user_id IN (SELECT contact_id FROM relationships WHERE 
> >> user_id=12345)
> >> ORDER BY created_at DESC LIMIT 10) AS q2
> >>ORDER BY created_at DESC LIMIT 10;
> >
> >It's not possible to use ORDER BY or LIMIT within unioned queries.
> >
> >http://www.postgresql.org/docs/8.2/static/sql-select.html#SQL-UNION
> 
> If I'm reading this documentation correctly, it *is* possible, as long as 
> they're inside of a sub-select, as in this case.

I completely overlooked that obvious note in the documentation,
sorry. I had tried it only with the aliases, which fooled me into
thinking that it doesn't work at all:

testdb=# (select 1 limit 1) as q1 union (select 2) as q2;
ERROR:  syntax error at or near "as"
LINE 1: (select 1 limit 1) as q1 union (select 2) as q2;
   ^
but this works:

testdb=# (select 1 limit 1) union (select 2);
 ?column?
--
1
2


Great - that works!

What I didn't realize in the original post is that the problem
actually seems to be how to retrieve the rows from large_table for the
correlated relationships in an efficient way - the second of the two
queries that could be UNIONed.

Using a subselect is efficient for the user with few relationships and
matched rows at the end of the sorted large_table:

testdb=# EXPLAIN ANALYZE SELECT * FROM large_table lt
 WHERE user_id IN (SELECT contact_id FROM relationships WHERE
 user_id=12345)
 ORDER BY created_at DESC LIMIT 10;

QUERY PLAN
---
 Limit  (cost=6963.94..6963.96 rows=10 width=621) (actual time=94.598..94.629 
rows=4 loops=1)
   ->  Sort  (cost=6963.94..6966.96 rows=1211 width=621) (actual 
time=94.592..94.602 rows=4 loops=1)
 Sort Key: lt.created_at
 ->  Nested Loop  (cost=39.52..6901.92 rows=1211 width=621) (actual 
time=85.670..94.547 rows=4 loops=1)
   ->  HashAggregate  (cost=39.52..39.53 rows=1 width=4) (actual 
time=23.549..23.552 rows=1 loops=1)
 ->  Bitmap Heap Scan on relationships  (cost=4.33..39.49 
rows=10 width=4) (actual time=23.526..23.530 rows=1 loops=1)
   Recheck Cond: (user_id = 12345)
   ->  Bitmap Index Scan on 
relationships_user_id_contact_id_index  (cost=0.00..4.33 rows=10 width=0) 
(actual time=0.027..0.027 rows=1 loops=1)
 Index Cond: (user_id = 12345)
   ->  Index Scan using large_user_id_started_at_index on 
large_table lt  (cost=0.00..6834.04 rows=2268 width=621) (actual 
time=62.108..70.952 rows=4 loops=1)
 Index Cond: (lt.user_id = relationships.contact_id)
 Total runtime: 94.875 ms


But the subselect is not fast for the user with many relationships and
matched rows at the beginning of the sorted large_table:

testdb=# EXPLAIN ANALYZE SELECT * FROM large_table lt
 WHERE user_id IN (SELECT contact_id FROM relationships WHERE
 user_id=5)
 ORDER BY created_at DESC LIMIT 10;

   QUERY PLAN 
-
 Limit  (cost=6963.94..6963.96 rows=10 width=621) (actual 
time=53187.349..53187.424 rows=10 loops=1)
   ->  Sort  (cost=6963.94..6966.96 rows=1211 width=621) (actual 
time=53187.341..53187.360 rows=10 loops=1)
 Sort Key: lt.created_at
 ->  Nested Loop  (cost=39.52..6901.92 rows=1211 width=621) (actual 
time=201.728..52673.800 rows=69018 loops=1)
   ->  HashAggregate  (cost=39.52..39.53 rows=1 width=4) (actual 
time=178.777..178.966 rows=40 loops=1)
 ->  Bitmap Heap Scan on relationships  (cost=4.33..39.49 
rows=10 width=4) (actual time=47.049..178.560 rows=40 loops=1)
   Recheck Cond: (user_id = 5)
   ->  Bitmap Index Scan on 
relationships_user_id_contact_id_index  (cost=0.00..4.33 rows=10 width=0) 
(actual time=28.721..28.721 rows=40 loops=1)
 Index Cond: (user_id = 5)
   ->  Index Scan using large_user_id_started_

Re: [PERFORM] Slow query with backwards index scan

2007-07-28 Thread andrew
As other posters have pointed out, you can overcome the ORDER BY/LIMIT 
restriction on UNIONs with parentheses. I think I misbalanced the parentheses 
in my original post, which would have caused an error if you just copied and 
pasted.

I don't think the limitation has to do with planning--just parsing the query.
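
A tiny illustration of that parsing point (just a sketch, not from the
original posts):

-- without parentheses, ORDER BY/LIMIT apply to the whole UNION result
select 1 as x union select 2 order by x limit 1;

-- with parentheses, each branch keeps its own ORDER BY/LIMIT, and the
-- outer query can still add one of its own
(select 1 as x order by x limit 1)
union
(select 2 as x order by x limit 1)
order by x limit 1;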



RES: [PERFORM] select on 1milion register = 6s

2007-07-28 Thread Bruno Rodrigues Siqueira

Ok.


Query

EXPLAIN 
ANALYZE
select 
   to_char(data_encerramento, 'mm/yyyy') as opcoes_mes,
   to_char(data_encerramento, 'yyyy-mm') as ordem
from detalhamento_bas
where

to_char(data_encerramento, 'yyyy-mm')
between '2006-12' and '2007-01'

GROUP BY opcoes_mes, ordem
ORDER BY ordem DESC





QUERY RESULT

QUERY PLAN
Sort  (cost=11449.37..11449.40 rows=119 width=8) (actual
time=14431.537..14431.538 rows=2 loops=1)
  Sort Key: to_char(data_encerramento, 'yyyy-mm'::text)
  ->  HashAggregate  (cost=11448.79..11448.96 rows=119 width=8) (actual
time=14431.521..14431.523 rows=2 loops=1)
->  Index Scan using detalhamento_bas_idx3003 on detalhamento_bas
(cost=0.00..11442.95 rows=11679 width=8) (actual time=0.135..12719.155
rows=2335819 loops=1)
  Index Cond: ((to_char(data_encerramento, 'yyyy-mm'::text) >=
'2006-12'::text) AND (to_char(data_encerramento, 'yyyy-mm'::text) <=
'2007-01'::text))
Total runtime: 14431.605 ms 








SERVER
  DELL PowerEdge 2950
  XEON Quad-Core 3.0Ghz
  4Gb RAM
  Linux CentOS 5.0 64-bits





 Postgres 8.1.4




 Postgresql.conf


# -
# PostgreSQL configuration file
# -
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have 
# to SIGHUP the postmaster for the changes to take effect, or use 
# "pg_ctl reload". Some settings, such as listen_addresses, require
# a postmaster shutdown and restart to take effect.


#---
# FILE LOCATIONS
#---

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'   # use data in another directory
#hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file
#ident_file = 'ConfigDir/pg_ident.conf' # IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
#external_pid_file = '(none)'   # write an extra pid file


#---
# CONNECTIONS AND AUTHENTICATION
#---

# - Connection Settings -

listen_addresses = '*'  # what IP address(es) to listen on; 
# comma-separated list of addresses;
# defaults to 'localhost', '*' = all
#port = 5432
max_connections = 10
# note: increasing max_connections costs ~400 bytes of shared memory per 
# connection slot, plus lock space (see max_locks_per_transaction).  You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#bonjour_name = ''  # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60# 1-600, in seconds
#ssl = off
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''
#krb_srvname = 'postgres'
#krb_server_hostname = ''   # empty string matches any keytab
entry
#krb_caseins_users = off

# - TCP Keepalives -
# see 'man 7 tcp' for details

#tcp_keepalives_idle = 0# TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0# TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0   # TCP_KEEPCNT;
# 0 selects the system default


#---
# RESOURCE USAGE (except WAL)
#---

# - Memory -

shared_buffers = 50000  # min 16 or max_connections*2, 8KB each
temp_buffers = 1000 # min 1

Re: [PERFORM] Slow query with backwards index scan

2007-07-28 Thread Jeremy Harris

Tilmann Singer wrote:

* [EMAIL PROTECTED] <[EMAIL PROTECTED]> [20070728 21:05]:

Let's try putting the sort/limit in each piece of the UNION to speed them up 
separately.

SELECT * FROM (
 (SELECT * FROM large_table lt
 WHERE lt.user_id = 12345
 ORDER BY created_at DESC LIMIT 10) AS q1
 UNION
 (SELECT * FROM large_table lt
 WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
 ORDER BY created_at DESC LIMIT 10) AS q2
ORDER BY created_at DESC LIMIT 10;


It's not possible to use ORDER BY or LIMIT within unioned queries.

http://www.postgresql.org/docs/8.2/static/sql-select.html#SQL-UNION


"ORDER BY and LIMIT can be attached to a subexpression if it is enclosed in 
parentheses"



Re: [PERFORM] Slow query with backwards index scan

2007-07-28 Thread Craig James

Tilmann Singer wrote:

* [EMAIL PROTECTED] <[EMAIL PROTECTED]> [20070728 21:05]:

Let's try putting the sort/limit in each piece of the UNION to speed them up 
separately.

SELECT * FROM (
 (SELECT * FROM large_table lt
 WHERE lt.user_id = 12345
 ORDER BY created_at DESC LIMIT 10) AS q1
 UNION
 (SELECT * FROM large_table lt
 WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
 ORDER BY created_at DESC LIMIT 10) AS q2
ORDER BY created_at DESC LIMIT 10;


It's not possible to use ORDER BY or LIMIT within unioned queries.

http://www.postgresql.org/docs/8.2/static/sql-select.html#SQL-UNION


If I'm reading this documentation correctly, it *is* possible, as long as 
they're inside of a sub-select, as in this case.

Craig



Re: [PERFORM] select on 1milion register = 6s

2007-07-28 Thread Hervé Piedvache
Have you analyzed your table before running this?

On Saturday, July 28, 2007, Bruno Rodrigues Siqueira wrote:
> Friends,
>
> Who can help me? My SELECT on a table with 1 million rows,
> using an expression index, takes 6 seconds.
>
> Please, I don't know how to make it better.
>
> Thanks



-- 
Hervé Piedvache



Re: [PERFORM] select on 1milion register = 6s

2007-07-28 Thread Craig James

Bruno Rodrigues Siqueira wrote:
Who can help me? My SELECT on a table with 1 million rows,
using an expression index, takes 6 seconds…


Run your query using 


  EXPLAIN ANALYZE SELECT ... your query ...

and then post the results to this newsgroup.  Nobody can help until they see 
the results of EXPLAIN ANALYZE.  Also, include all other relevant information, 
such as Postgres version, operating system, amount of memory, and any changes 
you have made to the Postgres configuration file.

Craig





[PERFORM] select on 1milion register = 6s

2007-07-28 Thread Bruno Rodrigues Siqueira
Friends,

Who can help me? My SELECT on a table with 1 million rows,
using an expression index, takes 6 seconds.

Please, I don't know how to make it better.

Thanks



Re: [PERFORM] Slow query with backwards index scan

2007-07-28 Thread Tilmann Singer
* [EMAIL PROTECTED] <[EMAIL PROTECTED]> [20070728 21:05]:
> Let's try putting the sort/limit in each piece of the UNION to speed them up 
> separately.
> 
> SELECT * FROM (
>  (SELECT * FROM large_table lt
>  WHERE lt.user_id = 12345
>  ORDER BY created_at DESC LIMIT 10) AS q1
>  UNION
>  (SELECT * FROM large_table lt
>  WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
>  ORDER BY created_at DESC LIMIT 10) AS q2
> ORDER BY created_at DESC LIMIT 10;

It's not possible to use ORDER BY or LIMIT within unioned queries.

http://www.postgresql.org/docs/8.2/static/sql-select.html#SQL-UNION

Would that make sense at all given the way the postgresql planner
works?  Does that work in other DB's?


Til



Re: [PERFORM] Slow query with backwards index scan

2007-07-28 Thread andrew
Tilmann Singer <[EMAIL PROTECTED]> wrote ..
> * Nis Jørgensen <[EMAIL PROTECTED]> [20070727 20:31]:
> > How does the "obvious" UNION query do - ie:
> > 
> > SELECT * FROM (
> > SELECT * FROM large_table lt
> > WHERE lt.user_id = 12345
> > 
> > UNION
> > 
> > SELECT * FROM large_table lt
> > WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
> > ) q
> > 
> > ORDER BY created_at DESC LIMIT 10;

Let's try putting the sort/limit in each piece of the UNION to speed them up 
separately.

SELECT * FROM (
 (SELECT * FROM large_table lt
 WHERE lt.user_id = 12345
 ORDER BY created_at DESC LIMIT 10) AS q1
 UNION
 (SELECT * FROM large_table lt
 WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
 ORDER BY created_at DESC LIMIT 10) AS q2
ORDER BY created_at DESC LIMIT 10;



Re: [PERFORM] Vacuum looping?

2007-07-28 Thread Jim C. Nasby
On Fri, Jul 27, 2007 at 05:32:11PM -0400, Steven Flatt wrote:
> weren't convinced was doing anything), then start a manual vacuum with a
> higher vacuum_cost_limit to get things cleaned up quicker.

What are your vacuum_cost_* settings? If you set those too aggressively
you'll be in big trouble.

The second pass on the vacuum means that maintenance_work_mem isn't
large enough.
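
For reference, these are the knobs in question, shown here with 8.1's
stock defaults (not a recommendation):

vacuum_cost_delay = 0           # 0 disables cost-based vacuum delay
vacuum_cost_limit = 200         # work budget before vacuum naps
maintenance_work_mem = 16384    # KB; too small forces extra index passes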
-- 
Jim Nasby  [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PERFORM] Vacuum looping?

2007-07-28 Thread Tom Lane
"Steven Flatt" <[EMAIL PROTECTED]> writes:
> The vacuum then just sat there.  What I can't understand is why it went back
> for a second pass of the pkey index?  There was nothing writing to the table
> once the vacuum began.  Is this behaviour expected?

Yes (hint: the numbers tell me what your maintenance_work_mem setting is).
You should have left it alone, probably, though there seems to be
something funny about your foo_1 index --- why was that so much slower
than the others for the first pass?
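
If the goal is to avoid the second index pass, one option (a sketch, with
a hypothetical table name) is to bump maintenance_work_mem for the session
before the manual vacuum:

SET maintenance_work_mem = 524288;   -- KB in 8.1, i.e. 512 MB
VACUUM VERBOSE foo;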

regards, tom lane



Re: [PERFORM] Slow query with backwards index scan

2007-07-28 Thread Tilmann Singer
* Nis Jørgensen <[EMAIL PROTECTED]> [20070727 20:31]:
> How does the "obvious" UNION query do - ie:
> 
> SELECT * FROM (
> SELECT * FROM large_table lt
> WHERE lt.user_id = 12345
> 
> UNION
> 
> SELECT * FROM large_table lt
> WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
> ) q
> 
> ORDER BY created_at DESC LIMIT 10;

Great for the user with little data:

testdb=# EXPLAIN ANALYZE SELECT * FROM (
SELECT * FROM large_table lt
WHERE lt.user_id = 12345
UNION
SELECT * FROM large_table lt
WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345)
) q
ORDER BY created_at DESC LIMIT 10;

 QUERY PLAN   

 Limit  (cost=14673.77..14673.80 rows=10 width=3140) (actual 
time=133.877..133.946 rows=10 loops=1)
   ->  Sort  (cost=14673.77..14682.22 rows=3378 width=3140) (actual 
time=133.870..133.894 rows=10 loops=1)
 Sort Key: q.created_at
 ->  Unique  (cost=14315.34..14442.01 rows=3378 width=622) (actual 
time=133.344..133.705 rows=38 loops=1)
   ->  Sort  (cost=14315.34..14323.78 rows=3378 width=622) (actual 
time=133.337..133.432 rows=38 loops=1)
 Sort Key: id, user_id, plaze_id, device, started_at, 
updated_at, status, "type", duration, permission, created_at, mac_address, 
subnet, msc
 ->  Append  (cost=0.00..14117.35 rows=3378 width=622) 
(actual time=39.144..133.143 rows=38 loops=1)
   ->  Index Scan using large_user_id_started_at_index 
on large_table lt  (cost=0.00..7243.59 rows=2158 width=622) (actual 
time=39.138..109.831 rows=34 loops=1)
 Index Cond: (user_id = 12345)
   ->  Nested Loop  (cost=42.78..6839.98 rows=1220 
width=622) (actual time=14.859..23.112 rows=4 loops=1)
 ->  HashAggregate  (cost=42.78..42.79 rows=1 
width=4) (actual time=8.092..8.095 rows=1 loops=1)
   ->  Bitmap Heap Scan on relationships  
(cost=4.34..42.75 rows=11 width=4) (actual time=8.067..8.070 rows=1 loops=1)
 Recheck Cond: (user_id = 12345)
 ->  Bitmap Index Scan on 
relationships_user_id_index  (cost=0.00..4.34 rows=11 width=0) (actual 
time=8.057..8.057 rows=1 loops=1)
   Index Cond: (user_id = 12345)
 ->  Index Scan using 
large_user_id_started_at_index on large_table lt  (cost=0.00..6768.48 rows=2297 
width=622) (actual time=6.751..14.970 rows=4 loops=1)
   Index Cond: (lt.user_id = 
relationships.contact_id)
 Total runtime: 134.220 ms


Not so great for the user with many early matches:

testdb=# EXPLAIN ANALYZE SELECT * FROM (
SELECT * FROM large_table lt
WHERE lt.user_id = 5
UNION
SELECT * FROM large_table lt
WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=5)
) q
ORDER BY created_at DESC LIMIT 10;

   QUERY PLAN 

 Limit  (cost=14673.77..14673.80 rows=10 width=3140) (actual 
time=3326.304..3326.367 rows=10 loops=1)
   ->  Sort  (cost=14673.77..14682.22 rows=3378 width=3140) (actual 
time=3326.297..3326.318 rows=10 loops=1)
 Sort Key: q.created_at
 ->  Unique  (cost=14315.34..14442.01 rows=3378 width=622) (actual 
time=2413.070..3019.385 rows=69757 loops=1)
   ->  Sort  (cost=14315.34..14323.78 rows=3378 width=622) (actual 
time=2413.062..2590.354 rows=69757 loops=1)
 Sort Key: id, user_id, plaze_id, device, started_at, 
updated_at, status, "type", duration, permission, created_at, mac_address, 
subnet, msc
 ->  Append  (cost=0.00..14117.35 rows=3378 width=622) 
(actual time=0.067..1911.626 rows=69757 loops=1)
   ->  Index Scan using large_user_id_started_at_index 
on large_table lt  (cost=0.00..7243.59 rows=2158 width=622) (actual 
time=0.062..3.440 rows=739 loops=1)
 Index Cond: (user_id = 5)
   ->  Nested Loop  (cost=42.78..6839.98 rows=1220 
width=622) (actual time=0.451..1557.901 rows=69018 loops=1)
 ->  HashAggregate  (cost=42.78..42.79 rows=1 
width=4) (actual time=0.404..0.580 rows=40 loop