Re: [PERFORM] Slow Query

2005-07-14 Thread Ragnar Hafstað
On Thu, 2005-07-14 at 10:06 +1000, Marc McIntyre wrote:

> I'm having a problem with a query that performs a sequential scan on a 
> table when it should be performing an index scan. The interesting thing 
> is, when we dumped the database on another server, it performed an index 
> scan on that server.
...
> The EXPLAIN ANALYZE from the system performing an sequential scan:
> 
> QUERY PLAN
> Sort  (cost=30079.79..30079.89 rows=42 width=113) (actual 
> time=39889.989..39890.346 rows=260 loops=1)
...
> The EXPLAIN ANALYZE from the system performing an index scan:
> Sort  (cost=16873.64..16873.74 rows=40 width=113) (actual 
> time=2169.905..2169.912 rows=13 loops=1)

looks like the first query is returning 260 rows,
but the second one only 13.

this may not be your problem, but are you sure you are running the same
query on the same data here?

gnari





Re: [PERFORM] Planner issue

2005-03-22 Thread Ragnar Hafstað
On Tue, 2005-03-22 at 14:36 -0500, Alex Turner wrote:

> I will use an index 220-300, but not 200-300.
> ...
>  Seq Scan on propmain  (cost=0.00..15517.56 rows=6842 width=4) (actual
> time=0.039..239.760 rows=6847 loops=1)
> ...
>  Index Scan using propmain_listprice_i on propmain 
> (cost=0.00..22395.95 rows=6842 width=4) (actual time=0.084..25.751
> rows=6847 loops=1)

the row estimates are accurate, so it is not a question of statistics
anymore.

first make sure effective_cache_size is correctly set, and then,
if that is not enough, you might try to lower random_page_cost a bit
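for example (illustrative values only; the right numbers depend on your
RAM and workload):

  SET effective_cache_size = 50000;  -- unit is 8KB pages, so ~400MB
  SET random_page_cost = 3;          -- default is 4; lower favours index scans

once you have found values that help, put them in postgresql.conf.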


gnari





Re: [PERFORM] How to read query plan

2005-03-13 Thread Ragnar Hafstað
On Sun, 2005-03-13 at 16:32 +0100, Miroslav Šulc wrote:
> Hi all,
> 
> I am new to PostgreSQL and query optimizations. We have recently moved 
> our project from MySQL to PostgreSQL and we are having performance 
> problem with one of our most often used queries. On MySQL the speed was 
> sufficient but PostgreSQL chooses a time-expensive query plan. I would 
> like to optimize it somehow but the query plan from EXPLAIN ANALYZE is 
> a little bit cryptic to me.
> 

[snip output of EXPLAIN ANALYZE]

for those of us who have not yet reached the level where one can
infer it from the query plan, how about showing us the actual
query too?

but as an example of what to look for, consider the first few lines
(reformatted): 

> Merge Right Join  (cost=9868.84..9997.74 rows=6364 width=815) 
>   (actual time=9982.022..10801.216 rows=6364 loops=1)
>   Merge Cond: ("outer".idpk = "inner".cadastralunitidfk)
>   ->  Index Scan using cadastralunits_pkey on cadastralunits  
>   (cost=0.00..314.72 rows=13027 width=31)
>   (actual time=0.457..0.552 rows=63 loops=1)
>   ->  Sort  (cost=9868.84..9884.75 rows=6364 width=788)
> (actual time=9981.405..10013.708 rows=6364 loops=1)

notice that the index scan is expected to return 13027 rows, but
actually returns only 63. this might influence the choice of plan.
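if the statistics are simply stale, something as simple as

  ANALYZE cadastralunits;

might bring that estimate closer to reality.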

gnari






Re: [PERFORM] Help with tuning this query

2005-03-02 Thread Ragnar Hafstað
On Wed, 2005-03-02 at 13:28 -0500, Ken Egervari wrote:
> >> select s.*
> >> from shipment s
> >> inner join carrier_code cc on s.carrier_code_id = cc.id
> >> inner join carrier c on cc.carrier_id = c.id
> >> inner join carrier_to_person ctp on ctp.carrier_id = c.id
> >> inner join person p on p.id = ctp.person_id
> >> inner join shipment_status cs on s.current_status_id = cs.id
> >> inner join release_code rc on cs.release_code_id = rc.id
> >> left join shipment_status ss on ss.shipment_id = s.id
> >> where
> >> p.id = :personId and
> >> s.is_purged = false and
> >> rc.number = '9' and
> >> cs is not null and
> >> cs.date >= current_date - 31
> >> order by cs.date desc
> >
> > I may be missing something, but it looks like the second join
> > on shipment_status (the left join) is not adding anything to your
> > results, except more work. ss is not used for output, nor in the where
> > clause, so what is its purpose?
> ...  The second 
> left join is for eager loading so that I don't have to run a separate query 
> to fetch the children for each shipment.  This really does improve 
> performance because otherwise you'll have to make N+1 queries to the 
> database, and that's just too much overhead.

are you saying that you are actually doing a
  select s.*,ss.* ...
?

> > if cs.date has an upper limit, it might be helpful to change the
> > condition to a BETWEEN
> 
> Well, I could create an upper limit. It would be the current date.  Would 
> adding in this redundant condition improve performance?

it might help the planner estimate the number of cs rows
affected more accurately. whether this improves performance depends on
whether the best plans are sensitive to this.

an EXPLAIN ANALYSE might reduce the guessing.

gnari





Re: [PERFORM] Help with tuning this query

2005-03-02 Thread Ragnar Hafstað
On Wed, 2005-03-02 at 01:51 -0500, Ken Egervari wrote:
>  
> select s.*
> from shipment s
> inner join carrier_code cc on s.carrier_code_id = cc.id
> inner join carrier c on cc.carrier_id = c.id
> inner join carrier_to_person ctp on ctp.carrier_id = c.id
> inner join person p on p.id = ctp.person_id
> inner join shipment_status cs on s.current_status_id = cs.id
> inner join release_code rc on cs.release_code_id = rc.id
> left join shipment_status ss on ss.shipment_id = s.id
> where
> p.id = :personId and
> s.is_purged = false and
> rc.number = '9' and
> cs is not null and
> cs.date >= current_date - 31
> order by cs.date desc
> ... 
> shipment contains 40,000 rows
> shipment_status contains 80,000 rows

I may be missing something, but it looks like the second join
on shipment_status (the left join) is not adding anything to your
results, except more work. ss is not used for output, nor in the where
clause, so what is its purpose?

if cs.date has an upper limit, it might be helpful to change the
condition to a BETWEEN
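something like (assuming the current date is the natural upper bound):

  cs.date BETWEEN current_date - 31 AND current_date

a closed range can be easier for the planner to estimate than an
open-ended one.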

in any case, i would think you might need an index on
  shipment(carrier_code_id)
  shipment(current_status_id)
  shipment_status(id)
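for example (index names are only placeholders, and shipment_status.id
is probably covered by its primary key already):

  CREATE INDEX shipment_carrier_code_id_idx ON shipment (carrier_code_id);
  CREATE INDEX shipment_current_status_id_idx ON shipment (current_status_id);
  CREATE INDEX shipment_status_id_idx ON shipment_status (id);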

gnari






Re: [PERFORM] PgPool changes WAS: PostgreSQL clustering VS MySQL

2005-01-24 Thread Ragnar Hafstað
On Mon, 2005-01-24 at 09:52 -0800, Josh Berkus wrote:
> [about keeping connections open in web context]
> Ah, clarity problem here. I'm talking about connection pooling tools from 
> the client (webserver) side, such as Apache::DBI, PHP's pg_pconnect, 
> Jakarta's connection pools, etc.   Not pooling on the database server side, 
> which is what pgPool provides.

note that these sometimes do not provide connection pooling as such,
just persistent connections (Apache::DBI)

> Most of these tools allocate a database connection to an HTTP/middleware 
> client, and only release it after a specific period of inactivity. This 
> means that you *could* count on "web-user==connection" for purposes of 
> switching back and forth to the master -- as long as the connection-recycling 
> timeout were set higher than the pgPool switch-off period.

no. you can only count on web-server-process==connection, but not
web-user==connection, unless you can guarantee that the same user
client always connects to the same web-server process.

am i missing something?

gnari





Re: [PERFORM] [SQL] OFFSET impact on Performance???

2005-01-20 Thread Ragnar Hafstað
On Thu, 2005-01-20 at 19:12 +, Ragnar Hafstað wrote:
> On Thu, 2005-01-20 at 11:59 -0500, Greg Stark wrote:
> 
> > The best way to do paging is not to use offset or cursors but to use an
> > index. This only works if you can enumerate all the sort orders the
> > application might be using and can have an index on each of them.
> > 
> > To do this the query would look something like:
> > 
> > SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50
> > 
> > Then you take note of the last value used on a given page and if the user
> > selects "next" you pass that as the starting point for the next page.
> 
> this will only work unchanged if the index is unique. imagine, for
> example, if you have more than 50 rows with the same value of col.
> 
> one way to fix this is to use ORDER BY col,oid

and a slightly more complex WHERE clause as well, of course
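something like this, perhaps (an untested sketch, reusing Greg's example):

  SELECT * FROM tab
  WHERE col > ?
     OR (col = ? AND oid > ?)
  ORDER BY col, oid
  LIMIT 50;

where both col placeholders get the last col value of the previous
page, and the oid placeholder gets that row's oid.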

gnari





Re: [PERFORM] index scan of whole table, can't see why

2005-01-20 Thread Ragnar Hafstað
On Wed, 2005-01-19 at 20:37 -0500, Dan Langille wrote:
> Hi folks,
> 
> Running on 7.4.2, recently vacuum analysed the three tables in 
> question.
> 
> The query plan in question changes dramatically when a WHERE clause 
> changes from ports.broken to ports.deprecated.  I don't see why.  
> Well, I do see why: a sequential scan of 130,000 rows.  The query 
> goes from 13ms to 1100ms because of this.  The full plans are at 
> http://rafb.net/paste/results/v8ccvQ54.html
> 
> I have tried some tuning by:
> 
>   set effective_cache_size to 4000, was 1000
>   set random_page_cost to 1, was 4
> 
> The resulting plan changes, but no speed improvement, are at 
> http://rafb.net/paste/results/rV8khJ18.html
> 

this just confirms that an index scan is not always better than a
table scan. by setting random_page_cost to 1, you are deceiving the
planner into thinking that the index scan is almost as effective
as a table scan.

> Any suggestions please?  

did you try to increase sort_mem?
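for example (in 7.4 the value is in KB; pick something suited to the
query and your RAM):

  SET sort_mem = 16384;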

gnari





Re: [PERFORM] index scan of whole table, can't see why

2005-01-20 Thread Ragnar Hafstað
On Wed, 2005-01-19 at 21:00 -0800, [EMAIL PROTECTED] wrote:
> Let's see if I have been paying enough attention to the SQL gurus. 
> The planner is making a different estimate of how many deprecated<>'' versus 
> how many broken <> ''. 
> I would try SET STATISTICS to a larger number on the ports table, and 
> re-analyze.

that should not help, as the estimate is accurate, according to the
explain analyze.

gnari





Re: [PERFORM] Postgres Optimizer is not smart enough?

2005-01-12 Thread Ragnar Hafstað
On Thu, 2005-01-13 at 12:14 +1300, Mark Kirkwood wrote:

[snip some explains]

> 
> I have random_page_cost = 0.8 in my postgresql.conf. Setting it back to
> the default (4) results in a plan using test_id1.

it is not rational to have random_page_cost < 1.

if you see improvement with such a setting, it is as likely that
something else is wrong, such as higher statistics targets being needed,
or a much too low effective_cache_size setting.
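raising the statistics target would look something like this (table and
column names are only placeholders):

  ALTER TABLE test ALTER COLUMN some_column SET STATISTICS 100;
  ANALYZE test;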

gnari





Re: [PERFORM] Howto Increased performace ?

2004-12-27 Thread Ragnar Hafstað
On Mon, 2004-12-27 at 22:31 +0700, Amrit Angsusingh wrote:
>  [ [EMAIL PROTECTED] ]
> >
> > These are some settings that I am planning to start with for a 4GB RAM
> > dual
> > opteron system with a maximum of 100 connections:
> >
> >
> > shared_buffers 8192 (=67MB RAM)
> > sort_mem 4096 (=400MB RAM for 100 connections)
> > effective_cache_size 380000 (@8KB = 3.04GB RAM)
> > vacuum_mem 32768 KB
> > wal_buffers 64
> > checkpoint_segments 8
> >
> > In theory, effective cache size is the amount of memory left over for the
> > OS
> > to cache the filesystem after running all programs and having 100 users
> > connected, plus a little slack.

> I reduced the connections to 160 and configured as below; there is some
> improvement in speed.
> shared_buffers = 27853 [Should I reduce it to nearly what you use, and what
> will happen?]

at some point, more shared buffers will do less good than leaving the
memory to the OS to use as disk buffers. you might want to experiment
a bit with different values to find what suits your real-life conditions.

> sort_mem = 8192
> vacuum_mem = 16384
> effective_cache_size = 81920 [Should I increase it to more than 200000?]
as Iain wrote, this value is an indication of how much memory will be
available to the OS for disk cache.
when all other settings have been made, try to see how much memory your
OS has left under normal conditions, and adjust your setting
accordingly if it differs significantly.
I have seen cases where an incorrect (too low) value influenced the
planner to use sequential scans instead of better index scans,
presumably because it made the planner assume a lower ratio of cache
hits for the index pages.
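on linux, for example, you could run

  $ free -k

and look at the 'buffers' and 'cached' columns; a starting point for
effective_cache_size would be roughly their sum divided by 8 (the
setting is in 8KB pages).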

> Thanks for any comment again.
> 
> NB. There has been a huge disaster in my country, the tsunami, and all the
> people of the country, including me, have fallen into deep sorrow.

my condolences.

> Amrit Angsusingh
> Thailand

gnari





Re: [PERFORM] Speed in V8.0

2004-12-24 Thread Ragnar Hafstað
On Wed, 2004-12-22 at 00:03 +0100, Thomas Wegner wrote:
> Hello, I have a problem between V7.4.3 Cygwin and
> V8.0RC2 W2K. I have 2 systems:
> 
> 1. Production Machine
> - Dual P4 3000MHz
> - 2 GB RAM
> - W2K
> - PostgreSQL 7.4.3 under Cygwin
> - i connect to it over a DSL Line
> 2. Develop Machine
> - P4 1800MHz
> - 760 MB RAM
> - PostgreSQL Native Windows
> - local connection 100MB/FD
> 
> Both systems use the default postgresql.conf. Now the problem.
> I have an (unoptimized, dynamic) query which executes on the
> production machine over DSL in 2 seconds and on my development
> machine, connected over local LAN, in 119 seconds!

has the development database been ANALYZED?
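e.g. a plain

  ANALYZE;

(or VACUUM ANALYZE;) run in the development database.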
 
gnari





Re: [PERFORM] Howto Increased performace ?

2004-12-24 Thread Ragnar Hafstað
On Tue, 2004-12-21 at 16:31 +0700, Amrit Angsusingh wrote:
> I used postgresql 7.3.2-1 with RH 9 on a mechine of 2 Xeon 3.0 Ghz and ram
> of 4 Gb. Since 1 1/2 yr. when I started to use the database server after
> optimizing the postgresql.conf everything went fine until a couple of
> weeks ago , my database grew up to 3.5 Gb and there were more than 140
> concurent connections.
...
> shared_buffers = 250000
this is much higher than usually advised on this list.
try to reduce this to 25000
 
> effective_cache_size = 5000
and increase this instead, to say, 50000
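i.e. in postgresql.conf, something like:

  shared_buffers = 25000        # ~200MB at 8KB pages
  effective_cache_size = 50000  # ~400MB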


gnari





Re: [PERFORM] Tips for a system with _extremely_ slow IO?

2004-12-21 Thread Ragnar Hafstað
On Fri, 2004-12-17 at 23:51 -0800, Ron Mayer wrote:
> Any advice for settings for extremely IO constrained systems?
> 
> A demo I've set up for sales seems to be spending much of its time in 
> disk wait states.
> 
> 
> The particular system I'm working with is:
> Ext3 on Debian inside Microsoft VirtualPC on NTFS
> on WindowsXP on laptops of our sales team.

As this is only for demo purposes, you might consider turning fsync off,
although I have no idea if it would have any effect on your setup.
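i.e. in postgresql.conf:

  fsync = false

(followed by a server restart, and of course never on a system where
you care about the data surviving a crash)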

gnari





Re: [PERFORM] [NOVICE] \d output to a file

2004-12-15 Thread Ragnar Hafstað
On Wed, 2004-12-15 at 11:50 -0500, Tom Lane wrote:
> Geoffrey <[EMAIL PROTECTED]> writes:
> > sarlav kumar wrote:
> >> I would like to write the output of the \d command on all tables in a
> >> database to an output file.
> 
> > What is the OS?  On any UNIX variant you can do:
> > echo '\d' | psql > outputfile
> 
> Or use \o:
> 
> regression=# \o zzz1
> regression=# \d
or:
=# \d *
to get all tables, as the OP wanted

> regression=# \o
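i.e. the whole session would be:

  =# \o outputfile
  =# \d *
  =# \o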

gnari


