[PERFORM] Two different execution plan for the same request

2010-07-07 Thread JOUANIN Nicolas (44)




Hi,

I'm having trouble with some SQL requests which get different execution plans
when run on two different servers. One server is the development
environment; the other one is the pre-production environment.
Both servers run PostgreSQL 8.3.0 on Linux, and:
- both databases contain the same data (pg_dump/pg_restore between
servers)
- instances have the same configuration parameters
- vacuum and analyze are run every day.
The only difference I can see is the hardware. The pre-production env.
has more RAM, more CPU and a RAID5 disk array through LVM where the data
are stored.
Performance should be better on pre-production but unfortunately
this is not the case.
Below are the execution plans on the development env and on pre-production. As
you can see, pre-production performance is poor, far slower than the dev.
env!
For information, enable_seqscan is turned off (on a DBA's advice).
Also, I can get the same execution plan on both environments by turning
off enable_mergejoin and enable_hashjoin on pre-production. Then the
execution plans match and performance is much better.
Could anyone help me figure out why both servers don't produce the same
execution plan? Can this be due to the hardware difference?

Let me know if you need more detailed information on these
configurations.

Thanks.
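(For reference, the session-level experiment described above can be sketched as follows; the SETs are session-local, so this is a diagnostic sketch, not a recommendation to disable planner methods globally:)

```sql
-- Session-local experiment: these SETs affect only the current session,
-- so they are safe for comparing plans between environments.
SET enable_mergejoin = off;
SET enable_hashjoin = off;

-- ... re-run EXPLAIN ANALYZE on the query under test here ...

-- Restore the server defaults for this session afterwards.
RESET enable_mergejoin;
RESET enable_hashjoin;
```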

Dev. environment :
EXPLAIN analyze SELECT DISTINCT
ConstantesTableNBienService.id,ConstantesTableNBienService.code,ConstantesTableNBienService.lib_code
FROM T_DEMANDE ConstantesTableDemande
LEFT OUTER JOIN T_OPERATION ConstantesTableOperation
 ON ConstantesTableDemande.id_tech =
ConstantesTableOperation.id_demande
LEFT OUTER JOIN T_BIEN_SERVICE ConstantesTableBienService
 ON ConstantesTableBienService.id_operation =
ConstantesTableOperation.id_tech
LEFT OUTER JOIN N_BIEN_SERVICE ConstantesTableNBienService
 ON ConstantesTableBienService.bs_code =
ConstantesTableNBienService.id
WHERE
 ConstantesTableDemande.id_tech = 'y+3eRapRQjW8mtL4wHd4/A=='
 AND ConstantesTableOperation.type_operation = 'acq'
 AND ConstantesTableNBienService.parent is null
ORDER BY ConstantesTableNBienService.code ASC;

QUERY PLAN 

Unique (cost=3586307.73..3586341.94 rows=205 width=123) (actual time=440.626..440.875 rows=1 loops=1)
  -> Sort (cost=3586307.73..3586316.28 rows=3421 width=123) (actual time=440.624..440.723 rows=187 loops=1)
       Sort Key: constantestablenbienservice.code, constantestablenbienservice.id, constantestablenbienservice.lib_code
       Sort Method: quicksort Memory: 24kB
       -> Nested Loop Left Join (cost=40.38..3586106.91 rows=3421 width=123) (actual time=71.696..440.240 rows=187 loops=1)
            Filter: (constantestablenbienservice.parent IS NULL)
            -> Nested Loop Left Join (cost=40.38..3554085.80 rows=6842 width=4) (actual time=66.576..433.797 rows=187 loops=1)
                 -> Nested Loop (cost=0.00..5041.46 rows=1246 width=25) (actual time=22.923..23.054 rows=30 loops=1)
                      -> Index Scan using t_demande_pkey on t_demande constantestabledemande (cost=0.00..8.32 rows=1 width=25) (actual time=5.534..5.537 rows=1 loops=1)
                           Index Cond: ((id_tech)::text = 'y+3eRapRQjW8mtL4wHd4/A=='::text)
                      -> Index Scan using idx_operation_demande on t_operation constantestableoperation (cost=0.00..5020.68 rows=1246 width=50) (actual time=17.382..17.460 rows=30 loops=1)
                           Index Cond: ((constantestableoperation.id_demande)::text = 'y+3eRapRQjW8mtL4wHd4/A=='::text)
                           Filter: ((constantestableoperation.type_operation)::text = 'acq'::text)
                 -> Bitmap Heap Scan on t_bien_service constantestablebienservice (cost=40.38..2836.96 rows=911 width=29) (actual time=13.511..13.677 rows=6 loops=30)
                      Recheck Cond: ((constantestablebienservice.id_operation)::text = (constantestableoperation.id_tech)::text)
                      -> Bitmap Index Scan on idx_bien_service_operation (cost=0.00..40.15 rows=911 width=0) (actual time=13.144..13.144 rows=6 loops=30)
                           Index Cond: ((constantestablebienservice.id_operation)::text = (constantestableoperation.id_tech)::text)
            -> Index Scan using n_bien_service_pkey on n_bien_service constantestablenbienservice (cost=0.00..4.67 rows=1 width=127) (actual time=0.030..0.031 rows=1 loops=187)
                 Index Cond: (constantestablebienservice.bs_code = constantestablenbienservice.id)
Total runtime: 2.558 ms
(20 rows)


Pre-production:
EXPLAIN analyze
SELECT DISTINCT
ConstantesTableNBienService.id,ConstantesTableNBienService.code,ConstantesTableNBienService.lib_code
FROM T_DEMANDE ConstantesTableDemande
LEFT OUTER JOIN T_OPERATION ConstantesTableOperation
 ON ConstantesTableDemande.id_tech =
ConstantesTableOperation.id_demande
LEFT OUTER JOIN T_BIEN_SERVICE ConstantesTableBienService
 ON ConstantesTableBienService.id_operation =
ConstantesTableOperation.id_tech
LEFT OUTER JOIN N_BIEN_SERVICE ConstantesTableNBienService
 ON ConstantesTableBienService.bs_code =
ConstantesTableNBienService.id
WHERE
 ConstantesTableDemande.id_tech = 

Re: [PERFORM] Two different execution plan for the same request

2010-07-07 Thread Yeb Havinga

JOUANIN Nicolas (44) wrote:

Hi,

I'm having trouble with some SQL requests which get different execution 
plans when run on two different servers. One server is the development 
environment; the other one is the pre-production environment.

Both servers run PostgreSQL 8.3.0 on Linux, and:
 - both databases contain the same data (pg_dump/pg_restore between 
servers)

 - instances have the same configuration parameters
 - vacuum and analyze are run every day.
The only difference I can see is the hardware. The pre-production 
env. has more RAM, more CPU and a RAID5 disk array through LVM where 
the data are stored.

Hello Jouanin,

Could you give some more information following the guidelines from 
http://wiki.postgresql.org/wiki/SlowQueryQuestions ?


Essential are the contents of both conf files (comments may be removed).

regards,
Yeb Havinga


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Two different execution plan for the same request

2010-07-07 Thread Yeb Havinga

JOUANIN Nicolas (44) wrote:


The strange thing is that this morning explain analyze now gives a 
much better duration:


There were no modifications made to the database except a restart 
yesterday evening and a vacuumdb --analyze run at night.

If the earlier bad query was run on a freshly imported database that was 
not ANALYZEd, that would explain the different, and likely bad, plan. If you 
want to know for sure this is the cause, instead of e.g. faulty 
hardware, you could verify by redoing the import + query without ANALYZE.


regards,
Yeb Havinga




Re: [PERFORM] [Slony1-general] WAL partition overloaded--by autovacuum?

2010-07-07 Thread Mark Kirkwood

On 07/07/10 13:10, Richard Yen wrote:


This leads me to believe that there was a sudden flurry of write activity that 
occurred, and the process that would flush WAL files to /db/data/ couldn't keep 
up, thereby filling up the disk.  I'm wondering if anyone else out there might 
be able to give me some insight or comments to my assessment--is it accurate?  
Any input would be helpful, and I'll try to make necessary architectural 
changes to keep this from happening again.



Do you have WAL archiving enabled? (If so, let's see your archive_command.)

Cheers

Mark



Re: [PERFORM] Two different execution plan for the same request

2010-07-07 Thread Guillaume Smet
Hi Nicolas,

On Wed, Jul 7, 2010 at 10:47 AM, JOUANIN Nicolas (44)
nicolas.joua...@dgfip.finances.gouv.fr wrote:
 There were no modification made on the database except a restart yesterday 
 evening and a vacuumdb --analyse ran at night.

It's not really surprising considering you probably kept
default_statistics_target at 10 (it's the default in 8.3).

Consider raising it to 100 in your postgresql.conf (100 is the default
in newer versions), then reload, and run a new ANALYZE.

You might need to set it higher on specific columns if you have a lot
of data and your data distribution is weird.

And, btw, please upgrade to the latest 8.3.x.

HTH
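(A sketch of the steps above; the session-level SET is just for testing the effect, while the permanent change goes in postgresql.conf:)

```sql
-- Test the effect in the current session before editing postgresql.conf:
SET default_statistics_target = 100;

-- Re-collect statistics so the planner sees the larger sample:
ANALYZE;
```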

-- 
Guillaume



Re: [PERFORM] Two different execution plan for the same request

2010-07-07 Thread JOUANIN Nicolas (44)




It seems to work fine (same execution plan and shorter duration) after:
- setting default_statistics_target to 100
- a full vacuum with analyze
- reindexdb

Thanks.





Re: [PERFORM] WAL partition overloaded--by autovacuum?

2010-07-07 Thread Kevin Grittner
Richard Yen wrote:
 
 the pg_xlog partition filled up (33GB)
 
 checkpoint_segments=16
 
 a sudden flurry of write activity
 
Was this database bulk-loaded in some way (like by restoring the
output of pg_dump, for example)?  If so, all rows inserted into all
tables would have the same (or very nearly the same) xmin value. At
some later time, virtually all tuples would need to be rewritten to
freeze them. This would be a logged operation, so WAL files would be
getting created rapidly.  If you have WAL archiving turned on, and
you can't copy the files out as fast as they're being created, this
might happen.
 
To avoid such a crushing mass freeze at an unpredictable time, we
always run VACUUM FREEZE ANALYZE on a bulk-loaded database before
turning on WAL archiving.
 
Of course, if this database wasn't bulk-loaded, this can't be your
problem.
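(The precaution described above can be sketched as follows; the pg_class query is just one way to inspect how much unfrozen work remains:)

```sql
-- Right after a bulk load, freeze all newly inserted tuples in one
-- controlled pass, so autovacuum never has to rewrite everything at
-- an unpredictable later time:
VACUUM FREEZE ANALYZE;

-- Optional sanity check: tables with a large age(relfrozenxid) still
-- carry old XIDs and will eventually be frozen by autovacuum.
SELECT relname, age(relfrozenxid)
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 10;
```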
 
-Kevin




Re: [PERFORM] Two different execution plan for the same request

2010-07-07 Thread Matthew Wakeling

On Wed, 7 Jul 2010, JOUANIN Nicolas (44) wrote:

It seems to work fine (same execution plan and less duration) after :
 - setting default_statistics_target to 100
 - full vacuum with analyze


Don't do VACUUM FULL.

Matthew

--
I suppose some of you have done a Continuous Maths course. Yes? Continuous
Maths? menacing stares from audience Whoah, it was like that, was it!
   -- Computer Science Lecturer


Re: [PERFORM] Two equivalent WITH RECURSIVE queries, one of them slow.

2010-07-07 Thread Merlin Moncure
On Mon, Jul 5, 2010 at 2:07 AM, Octavio Alvarez
alvar...@alvarezp.ods.org wrote:
 Hello.

 I have a tree-like table with a three-field PK (name, date, id) and one
 parent field.
 It has 5k to 6k records as of now, but it will hold about 1 million records.

 I am trying the following WITH RECURSIVE query:

 WITH RECURSIVE t AS (
                 SELECT par.id AS tid, par.name, par.date, par.id, par.text,
 par.h_title, par.h_name, par.parent
                   FROM _books.par
        UNION
                 SELECT t.tid AS pid, p.name, p.date, p.id, p.text,
 p.h_title, p.h_name, p.parent
                   FROM t, _books.par p
                  WHERE p.name = t.name AND p.date = t.date AND t.id =
 p.parent
        )
  SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';

 ... which takes 2547.503 ms

 However, if I try the same query but adding the same WHERE clause to the
 non-recursive term, I get much better results.


 WITH RECURSIVE t AS (
                 SELECT par.id AS tid, par.name, par.date, par.id, par.text,
 par.h_title, par.h_name, par.parent
                   FROM _books.par WHERE name = 'cfx' AND date = '2009-08-19'
 AND par.id = '28340'
        UNION
                 SELECT t.tid AS pid, p.name, p.date, p.id, p.text,
 p.h_title, p.h_name, p.parent
                   FROM t, _books.par p
                  WHERE p.name = t.name AND p.date = t.date AND t.id =
 p.parent
        )
  SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
   FROM t WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';

 ... which takes 0.221 ms

If you want the fast plan, you might want to consider reworking your
query into a set returning function.  It's pretty easy to do:


create or replace function f(arg int) returns setof something as
$$
  with recursive foo as
  (
select * from bar where id = $1
  union all
[...]
  )
  select * from foo
$$ language sql;

Obviously, a pure view approach would be nicer but it just isn't going
to happen at present.  CTEs are currently problematic in general when you
need quals in the 'with' term, especially in the case of recursive
CTEs.
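(A more complete sketch of this approach, using a hypothetical single-key tree table; the names are invented for illustration, and the thread's table has a three-column key, but the idea is the same:)

```sql
-- Hypothetical self-referencing tree table.
CREATE TABLE bar (
    id     int PRIMARY KEY,
    parent int REFERENCES bar(id),
    label  text
);

-- Wrapping the recursive CTE in a function lets the qual ($1) reach the
-- non-recursive term, so the recursion starts at one row instead of
-- materializing the whole tree and filtering afterwards.
CREATE OR REPLACE FUNCTION subtree(root int) RETURNS SETOF bar AS
$$
  WITH RECURSIVE foo AS (
      SELECT * FROM bar WHERE id = $1       -- seed: just the requested node
    UNION ALL
      SELECT b.*
      FROM bar b
      JOIN foo f ON b.parent = f.id         -- walk down one level at a time
  )
  SELECT * FROM foo
$$ LANGUAGE sql STABLE;

-- Usage:
SELECT * FROM subtree(28340);
```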

merlin



Re: [PERFORM] big data - slow select (speech search)

2010-07-07 Thread Michal Fapso
Hi Robert,

thank you for your help. I tried to cluster the table on
hyps_wordid_index and the query execution time dropped from 4.43 to
0.19 seconds which is not that far from Lucene's performance of 0.10
second.

Thanks a lot!
Miso Fapso

On 6 July 2010 02:25, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Jul 1, 2010 at 6:34 PM, Michal Fapso michal.fa...@gmail.com wrote:
 It took about 4.5 seconds. If I rerun it, it takes
 less than 2 milliseconds, but that is because of the cache. I need to
 optimize the first run.

 laptop ASUS, CPU dual core T2300 1.66GHz, 1.5G RAM

 EXPLAIN ANALYZE SELECT h1.docid
 FROM hyps AS h1
 WHERE h1.wordid=65658;

  Bitmap Heap Scan on hyps h1  (cost=10.97..677.09 rows=171 width=4)
 (actual time=62.106..4416.864 rows=343 loops=1)
    Recheck Cond: (wordid = 65658)
    ->  Bitmap Index Scan on hyps_wordid_index  (cost=0.00..10.92
 rows=171 width=0) (actual time=42.969..42.969 rows=343 loops=1)
          Index Cond: (wordid = 65658)
  Total runtime: 4432.015 ms

 If I run the same query in Lucene search engine, it takes 0.105
 seconds on the same data which is quite a huge difference.

 So PostgreSQL is reading 343 rows from disk in 4432 ms, or about 12
 ms/row.  I'm not an expert on seek times, but that might not really be
 that unreasonable, considering that those rows may be scattered all
 over the index and thus it may be basically random I/O.  Have you
 tried clustering hyps on hyps_wordid_index?  If you had a more
 sophisticated disk subsystem you could try increasing
 effective_io_concurrency but that's not going to help with only one
 spindle.

 If you run the same query in Lucene and it takes only 0.105 s, then
 Lucene is obviously doing a lot less I/O.  I doubt that any amount of
 tuning of your existing schema is going to produce that kind of result
 on PostgreSQL.  Using the full-text search stuff, or a gin index of
 some kind, might get you closer, but it's hard to beat a
 special-purpose engine that implements exactly the right algorithm for
 your use case.
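(For reference, the clustering step Michal tried can be sketched as below; the table and index names are taken from the thread. Note that CLUSTER takes an exclusive lock and rewrites the table:)

```sql
-- Physically reorder the heap so rows sharing a wordid are adjacent,
-- turning scattered random reads into mostly sequential I/O.
CLUSTER hyps USING hyps_wordid_index;

-- CLUSTER rewrites the table, so refresh the statistics afterwards.
ANALYZE hyps;
```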

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company




Re: [PERFORM] Weird XFS WAL problem

2010-07-07 Thread Bruce Momjian
Greg Smith wrote:
 Kevin Grittner wrote:
  I don't know at the protocol level; I just know that write barriers
  do *something* which causes our controllers to wait for actual disk
  platter persistence, while fsync does not
 
 It's in the docs now:  
 http://www.postgresql.org/docs/9.0/static/wal-reliability.html
 
 FLUSH CACHE EXT is the ATAPI-6 call that filesystems use to enforce 
 barriers on that type of drive.  Here's what the relevant portion of the 
 ATAPI spec says:
 
 This command is used by the host to request the device to flush the 
 write cache. If there is data in the write
 cache, that data shall be written to the media. The BSY bit shall remain 
 set to one until all data has been
 successfully written or an error occurs.
 
 SAS systems have a similar call named SYNCHRONIZE CACHE.
 
 The improvement I actually expect to arrive here first is a reliable 
 implementation of O_SYNC/O_DSYNC writes.  Both SAS and SATA drives that 
 are capable of doing Native Command Queueing support a write type called 
 Force Unit Access, which is essentially just like a direct write that 
 cannot be cached.  When we get more kernels with reliable sync writing 
 that maps under the hood to FUA, and can change wal_sync_method to use 
 them, the need to constantly call fsync for every write to the WAL will 
 go away.  Then the "blow out the RAID cache when barriers are on" 
 behavior will only show up during checkpoint fsyncs, which will make 
 things a lot better (albeit still not ideal).

Great information!  I have added the attached documentation patch to
explain the write-barrier/BBU interaction.  This will appear in the 9.0
documentation.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: doc/src/sgml/wal.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/wal.sgml,v
retrieving revision 1.66
diff -c -c -r1.66 wal.sgml
*** doc/src/sgml/wal.sgml	13 Apr 2010 14:15:25 -	1.66
--- doc/src/sgml/wal.sgml	7 Jul 2010 13:55:58 -
***
*** 48,68 
 some later time. Such caches can be a reliability hazard because the
 memory in the disk controller cache is volatile, and will lose its
 contents in a power failure.  Better controller cards have
!firsttermbattery-backed/ caches, meaning the card has a battery that
 maintains power to the cache in case of system power loss.  After power
 is restored the data will be written to the disk drives.
/para
  
para
 And finally, most disk drives have caches. Some are write-through
!while some are write-back, and the
!same concerns about data loss exist for write-back drive caches as
!exist for disk controller caches.  Consumer-grade IDE and SATA drives are
!particularly likely to have write-back caches that will not survive a
!power failure, though acronymATAPI-6/ introduced a drive cache
!flush command (FLUSH CACHE EXT) that some file systems use, e.g. acronymZFS/.
!Many solid-state drives (SSD) also have volatile write-back
!caches, and many do not honor cache flush commands by default.
 To check write caching on productnameLinux/ use
 commandhdparm -I/;  it is enabled if there is a literal*/ next
 to literalWrite cache/; commandhdparm -W/ to turn off
--- 48,74 
 some later time. Such caches can be a reliability hazard because the
 memory in the disk controller cache is volatile, and will lose its
 contents in a power failure.  Better controller cards have
!firsttermbattery-backed unit/ (acronymBBU/) caches, meaning
!the card has a battery that
 maintains power to the cache in case of system power loss.  After power
 is restored the data will be written to the disk drives.
/para
  
para
 And finally, most disk drives have caches. Some are write-through
!while some are write-back, and the same concerns about data loss
!exist for write-back drive caches as exist for disk controller
!caches.  Consumer-grade IDE and SATA drives are particularly likely
!to have write-back caches that will not survive a power failure,
!though acronymATAPI-6/ introduced a drive cache flush command
!(commandFLUSH CACHE EXT/) that some file systems use, e.g.
!acronymZFS/, acronymext4/.  (The SCSI command
!commandSYNCHRONIZE CACHE/ has long been available.) Many
!solid-state drives (SSD) also have volatile write-back caches, and
!many do not honor cache flush commands by default.
!   /para
! 
!   para
 To check write caching on productnameLinux/ use
 commandhdparm -I/;  it is enabled if there is a literal*/ next
 to literalWrite cache/; commandhdparm -W/ to turn off
***
*** 83,88 
--- 89,113 
/para
  
para
+Many file systems that use write barriers (e.g.  acronymZFS/,
+acronymext4/) internally 

Re: [PERFORM] [Slony1-general] WAL partition overloaded--by autovacuum?

2010-07-07 Thread Richard Yen
Sorry, I forgot to mention that archive_mode is off and commented out, and 
archive command is '' and commented out. 

Thanks for following up!
-- Richard

On Jul 7, 2010, at 1:58, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:

 On 07/07/10 13:10, Richard Yen wrote:
 
 This leads me to believe that there was a sudden flurry of write activity 
 that occurred, and the process that would flush WAL files to /db/data/ 
 couldn't keep up, thereby filling up the disk.  I'm wondering if anyone else 
 out there might be able to give me some insight or comments to my 
 assessment--is it accurate?  Any input would be helpful, and I'll try to 
 make necessary architectural changes to keep this from happening again.
 
 
 Do you have WAL archiving enabled? (If so, let's see your archive_command.)
 
 Cheers
 
 Mark
 



Re: [PERFORM] Highly Efficient Custom Sorting

2010-07-07 Thread Eliot Gable
Thanks again for all the input and suggestions from people. I have this
sorting algorithm re-implemented in C now, and it takes somewhere around 2 ms
to run, though it is difficult to get a more accurate measure. There may be
some additional optimizations I can come up with, but for now, this will
work very well compared to the alternative methods.

On Tue, Jul 6, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Eliot Gable egable+pgsql-performa...@gmail.com
 writes:
  Do I need to somehow force the server to unload and then re-load this .so
  file each time I build a new version of it? If so, how do I do that?

 Start a new database session.

regards, tom lane




-- 
Eliot Gable

We do not inherit the Earth from our ancestors: we borrow it from our
children. ~David Brower

I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime. ~David Brower

Esse oportet ut vivas, non vivere ut edas. (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero


Re: [Slony1-general] [PERFORM] WAL partition overloaded--by autovacuum?

2010-07-07 Thread Richard Yen
On Jul 6, 2010, at 8:25 PM, Scott Marlowe wrote:

 Tell us what you can about your hardware setup.

Sorry, I made the bad assumption that the hardware setup would be 
irrelevant--dunno why I thought that.

My hardware setup is 2 FusionIO 160GB drives in a RAID-1 configuration, running 
on an HP DL360 G5

I think I figured out the problem:

-- I figured that pg_xlog and data/base could both be on the FusionIO drive, 
since there would be no latency when there are no spindles.
-- However, I didn't take into account the fact that pg_xlog might grow in size 
when autovacuum does its work to prevent XID wraparound.  I *just* discovered 
this when one of my other replication nodes decided to die on me and fill up 
its disk.
-- Unfortunately, my db is 114GB (including indexes) or 60GB (without indexes), 
leaving ~37GB for pg_xlog (since they share a partition).  So I'm 
guessing what happened was that when autovacuum ran to prevent XID wraparound, 
it took each table and changed the XIDs, and that got recorded in WAL, causing 
WAL to bloat.  Is this the correct understanding?

The question for now is, the documentation says:
 There will always be at least one WAL segment file, and will normally not be 
 more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 files. 
 Each segment file is normally 16 MB (though this size can be altered when 
 building the server). You can use this to estimate space requirements for 
 WAL. Ordinarily, when old log segment files are no longer needed, they are 
 recycled (renamed to become the next segments in the numbered sequence). If, 
 due to a short-term peak of log output rate, there are more than 3 * 
 checkpoint_segments + 1 segment files, the unneeded segment files will be 
 deleted instead of recycled until the system gets back under this limit.

This means my pg_xlog partition should normally hold at most (2 + 
checkpoint_completion_target) * checkpoint_segments + 1 = 41 files, or 656MB.  
Then, if there are more than 49 files, unneeded segment files will be deleted, 
but in this case all segment files were needed, so they never got deleted.  
Perhaps we should add in the docs that pg_xlog should be the size of the DB or 
larger?
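(For reference, the numbers above can be reproduced directly from the formula, with checkpoint_completion_target = 0.5, checkpoint_segments = 16, and 16 MB per segment:)

```sql
-- Normal steady-state bound on WAL segment files, and its size in MB:
SELECT (2 + 0.5) * 16 + 1          AS normal_max_files,   -- 41
       ((2 + 0.5) * 16 + 1) * 16   AS normal_max_mb;      -- 656

-- Threshold above which unneeded segments are deleted instead of recycled:
SELECT 3 * 16 + 1 AS delete_threshold_files;              -- 49
```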

--Richard






Re: [PERFORM] Highly Efficient Custom Sorting

2010-07-07 Thread Kenneth Marshall
Hi Eliot,

Would you mind posting your code for reference? It is nice to
have working examples when trying to figure out how it all fits
together.

Regards,
Ken

On Wed, Jul 07, 2010 at 03:23:12PM -0400, Eliot Gable wrote:
 Thanks again for all the input and suggestions from people. I have this
 sorting algorithm re-implemented in C now, and it takes somewhere around 2 ms
 to run, though it is difficult to get a more accurate measure. There may be
 some additional optimizations I can come up with, but for now, this will
 work very well compared to the alternative methods.
 
 On Tue, Jul 6, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  Eliot Gable egable+pgsql-performa...@gmail.com
  writes:
   Do I need to somehow force the server to unload and then re-load this .so
   file each time I build a new version of it? If so, how do I do that?
 
  Start a new database session.
 
 regards, tom lane
 
 
 
 
 -- 
 Eliot Gable
 
 We do not inherit the Earth from our ancestors: we borrow it from our
 children. ~David Brower
 
 I decided the words were too conservative for me. We're not borrowing from
 our children, we're stealing from them--and it's not even considered to be a
 crime. ~David Brower
 
 Esse oportet ut vivas, non vivere ut edas. (Thou shouldst eat to live; not
 live to eat.) ~Marcus Tullius Cicero



Re: [PERFORM] big data - slow select (speech search)

2010-07-07 Thread Robert Haas
On Wed, Jul 7, 2010 at 9:31 AM, Michal Fapso michal.fa...@gmail.com wrote:
 thank you for your help. I tried to cluster the table on
 hyps_wordid_index and the query execution time dropped from 4.43 to
 0.19 seconds which is not that far from Lucene's performance of 0.10
 second.

Dang.  Nice!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[PERFORM] Logarithmic data frequency distributions and the query planner

2010-07-07 Thread Jerry Gamache
On 8.1, I have a very interesting database where the distribution of 
some values in a multi-million-row table is logarithmic (i.e. the most 
frequent value is an order of magnitude more frequent than the next 
ones). If I analyze the table, the statistics become extremely skewed 
towards the most frequent values, and this prevents the planner from 
giving any good results on queries that do not target these entries.


In a recent case, the planner estimated that the number of returned rows 
would be ~13% of the table size, and from this bad assumption generated a 
very slow conservative plan that executed in days. If I set the 
statistics at zero for that table, the planner uses a hardcoded ratio 
(seemingly 0.5%) for the number of returned rows, and this helps 
generate a plan that executes in 3 minutes (still sub-optimal, but not 
as bad).


Generating partial indexes for the less frequent cases helps, but this 
solution is not flexible enough for our needs as the number of complex 
queries grows. We are mostly left with pre-generating a lot of temporary 
tables whenever the planner over-estimates the number of rows generated 
by a subquery (query execution was trimmed from 3 minutes to 30 seconds 
using this technique) or using the OFFSET 0 tweak, but it would be nice 
if the planner could handle this on its own.


Am I missing something obvious? Setting the statistics for this table to 
zero seems awkward even if it gives good results.

Jerry.





Re: [PERFORM] Logarithmic data frequency distributions and the query planner

2010-07-07 Thread Tom Lane
Jerry Gamache jerry.gama...@idilia.com writes:
 On 8.1, I have a very interesting database where the distribution of 
 some values in a multi-million-row table is logarithmic (i.e. the most 
 frequent value is an order of magnitude more frequent than the next 
 ones). If I analyze the table, the statistics become extremely skewed 
 towards the most frequent values and this prevents the planner from 
 giving any good results on queries that do not target these entries.

Highly skewed distributions are hardly unusual, and I'm not aware that
the planner is totally incapable of dealing with them.  You do need a
large enough stats target to get down into the tail of the
distribution (the default target for 8.1 is probably too small for you).
It might be that there have been some other relevant improvements since
8.1, too ...
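(One 8.1-compatible way to get down into the tail is to raise the target only on the skewed columns; the table and column names below are hypothetical:)

```sql
-- Collect a larger most-common-values list and histogram for just the
-- skewed column, so the planner can see past the dominant values:
ALTER TABLE big_table ALTER COLUMN skewed_col SET STATISTICS 400;
ANALYZE big_table;
```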

regards, tom lane



Re: [PERFORM] performance on new linux box

2010-07-07 Thread Tom Lane
Ryan Wexler r...@iridiumsuite.com writes:
 Postgresql was previously running on a single cpu linux machine with 2 gigs
 of memory and a single sata drive (v8.3).  Basically a desktop with linux on
 it.  I experienced slow performance.

 So, I finally moved it to a real server.  A dual-Xeon CentOS machine with
 6 gigs of memory and RAID 10, postgres 8.4.  But, I am now experiencing even
 worse performance issues.

I'm wondering if you moved to a kernel+filesystem version that actually
enforces fsync, from one that didn't.  If so, the apparently faster
performance on the old box was being obtained at the cost of (lack of)
crash safety.  That probably goes double for your windows-box comparison
point.

You could try test_fsync from the Postgres sources to confirm that
theory, or do some pgbench benchmarking to have more quantifiable
numbers.
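If building test_fsync from the source tree is inconvenient, a rough stand-in (my own sketch, not a Postgres tool) is to time repeated write+fsync cycles directly; a drive or controller that honestly flushes to platter will typically manage on the order of 100-200 of these per second, while a lying cache reports thousands:

```python
import os
import time

def fsync_rate(path="fsync_test.dat", iters=200, blocksize=8192):
    """Time write+fsync cycles on an 8 kB block, roughly what test_fsync measures."""
    buf = b"\0" * blocksize
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(iters):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each pass
            os.write(fd, buf)
            os.fsync(fd)                  # force it through every cache layer
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return iters / elapsed  # fsyncs per second

if __name__ == "__main__":
    print(f"{fsync_rate():.0f} fsyncs/sec")
```

Run it on both boxes; if the old machine reports an implausibly high rate for its spindle speed, that is the lying-cache scenario described above.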

See past discussions about write barriers in this list's archives for
more detail.

regards, tom lane



Re: [PERFORM] performance on new linux box

2010-07-07 Thread Rob Wultsch
On Wed, Jul 7, 2010 at 4:06 PM, Ryan Wexler r...@iridiumsuite.com wrote:
 Postgresql was previously running on a single-CPU Linux machine with 2 GB
 of memory and a single SATA drive (v8.3).  Basically a desktop with Linux on
 it.  I experienced slow performance.

 So, I finally moved it to a real server: a dual-Xeon CentOS machine with
 6 GB of memory and RAID 10, Postgres 8.4.  But I am now experiencing even
 worse performance issues.

 My system is consistently highly transactional.  However, there are also
 regular complex queries and occasional bulk loads.

 On the new system the bulk loads are far slower than on the previous
 machine, and so are the more complex queries.  The smaller transactional
 queries seem comparable, but I had expected an improvement.  Performing a db
 import via psql -d database -f dbfile illustrates this problem.  It takes 5
 hours to run this import.  By contrast, if I perform this same exact import
 on my crappy Windows box with only 2 GB of memory and default Postgres
 settings, it takes 1 hour.  Same deal with the old Linux machine.  How is
 this possible?

 Here are some of my key config settings:
 max_connections = 100
 shared_buffers = 768MB
 effective_cache_size = 2560MB
 work_mem = 16MB
 maintenance_work_mem = 128MB
 checkpoint_segments = 7
 checkpoint_timeout = 7min
 checkpoint_completion_target = 0.5

 I have tried varying the shared_buffers size from 128 MB all the way to
 1500 MB and got basically the same result.  Is there a setting change I
 should be considering?

 Does 8.4 have performance problems or is this unique to me?

 thanks



I think the most likely explanation is that the crappy box lied about
fsync'ing data and your server is not. Did you purchase a raid card
with a bbu? If so, can you set the write cache policy to write-back?

-- 
Rob Wultsch
wult...@gmail.com



Re: [PERFORM] performance on new linux box

2010-07-07 Thread Andy Colson

On 07/07/2010 06:06 PM, Ryan Wexler wrote:

Postgresql was previously running on a single-CPU Linux machine with 2 GB of 
memory and a single SATA drive (v8.3).  Basically a desktop with Linux on it.  
I experienced slow performance.

So, I finally moved it to a real server: a dual-Xeon CentOS machine with 6 
GB of memory and RAID 10, Postgres 8.4.  But I am now experiencing even 
worse performance issues.

My system is consistently highly transactional.  However, there are also regular 
complex queries and occasional bulk loads.

On the new system the bulk loads are far slower than on the previous 
machine, and so are the more complex queries.  The smaller transactional queries 
seem comparable, but I had expected an improvement.  Performing a db import via 
psql -d database -f dbfile illustrates this problem.  It takes 5 hours to run 
this import.  By contrast, if I perform this same exact import on my crappy 
Windows box with only 2 GB of memory and default Postgres settings it takes 1 
hour.  Same deal with the old Linux machine.  How is this possible?

Here are some of my key config settings:
max_connections = 100
shared_buffers = 768MB
effective_cache_size = 2560MB
work_mem = 16MB
maintenance_work_mem = 128MB
checkpoint_segments = 7
checkpoint_timeout = 7min
checkpoint_completion_target = 0.5

I have tried varying the shared_buffers size from 128 MB all the way to 1500 MB 
and got basically the same result.  Is there a setting change I should be 
considering?

Does 8.4 have performance problems or is this unique to me?

thanks



Yeah, I inherited a "server" (the quotes are sarcastic air quotes) with really bad disk 
IO... er.. really safe disk IO.  Try the dd test.  On my desktop I get 60-70 MB a second.  On this 
"server" (I laugh) I got about 20.  I had to go out of my way (way out) to enable the 
disk caching, and even then only got 50 MB a second.

http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm
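The dd test on that page boils down to timing a large sequential write that is forced to disk before the clock stops; a minimal sketch (file path and size are arbitrary choices of mine, and ideally the file should be at least 2x RAM so the OS cache cannot absorb it):

```shell
# Write 256 MB in 8 kB blocks (Postgres's page size); conv=fdatasync makes
# GNU dd flush to disk before reporting, so a write cache can't inflate
# the MB/s figure dd prints at the end.
dd if=/dev/zero of=/tmp/dd_test_file bs=8k count=32768 conv=fdatasync
rm /tmp/dd_test_file
```

Compare the reported throughput on both machines; a healthy single SATA drive should manage well over 20 MB/s sequential, so numbers in that range point at the controller configuration rather than Postgres.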

-Andy

