[PERFORM] Two different execution plan for the same request
Hi, I'm having trouble with some SQL queries that get different execution plans when run on two different servers. One server is the development environment; the other is the pre-production environment. Both servers run PostgreSQL 8.3.0 on Linux, and:
- both databases contain the same data (pg_dump/pg_restore between servers)
- both instances have the same configuration parameters
- vacuum and analyze are run every day.
The only difference I can see is the hardware. The pre-production environment has more RAM, more CPUs, and stores its data on a RAID5 disk array through LVM. Performance should be better on pre-production, but unfortunately this is not the case. Below are the execution plans for the development environment and pre-production. As you can see, pre-production performance is poor, far slower than the dev environment! For information, enable_seqscan is turned off (on some DBA's advice). Also, I can get the same execution plan on both environments by turning off enable_mergejoin and enable_hashjoin on pre-production; the plans then match and performance is much better. Could anyone help me work out why the two servers don't produce the same execution plans? Can this be due to the hardware difference? Let me know if you need more detailed information on these configurations. Thanks.

Dev.
environment:

EXPLAIN ANALYZE
SELECT DISTINCT ConstantesTableNBienService.id, ConstantesTableNBienService.code, ConstantesTableNBienService.lib_code
FROM T_DEMANDE ConstantesTableDemande
LEFT OUTER JOIN T_OPERATION ConstantesTableOperation ON ConstantesTableDemande.id_tech = ConstantesTableOperation.id_demande
LEFT OUTER JOIN T_BIEN_SERVICE ConstantesTableBienService ON ConstantesTableBienService.id_operation = ConstantesTableOperation.id_tech
LEFT OUTER JOIN N_BIEN_SERVICE ConstantesTableNBienService ON ConstantesTableBienService.bs_code = ConstantesTableNBienService.id
WHERE ConstantesTableDemande.id_tech = 'y+3eRapRQjW8mtL4wHd4/A=='
  AND ConstantesTableOperation.type_operation = 'acq'
  AND ConstantesTableNBienService.parent IS NULL
ORDER BY ConstantesTableNBienService.code ASC;

QUERY PLAN
Unique (cost=3586307.73..3586341.94 rows=205 width=123) (actual time=440.626..440.875 rows=1 loops=1)
  -> Sort (cost=3586307.73..3586316.28 rows=3421 width=123) (actual time=440.624..440.723 rows=187 loops=1)
        Sort Key: constantestablenbienservice.code, constantestablenbienservice.id, constantestablenbienservice.lib_code
        Sort Method: quicksort Memory: 24kB
        -> Nested Loop Left Join (cost=40.38..3586106.91 rows=3421 width=123) (actual time=71.696..440.240 rows=187 loops=1)
              Filter: (constantestablenbienservice.parent IS NULL)
              -> Nested Loop Left Join (cost=40.38..3554085.80 rows=6842 width=4) (actual time=66.576..433.797 rows=187 loops=1)
                    -> Nested Loop (cost=0.00..5041.46 rows=1246 width=25) (actual time=22.923..23.054 rows=30 loops=1)
                          -> Index Scan using t_demande_pkey on t_demande constantestabledemande (cost=0.00..8.32 rows=1 width=25) (actual time=5.534..5.537 rows=1 loops=1)
                                Index Cond: ((id_tech)::text = 'y+3eRapRQjW8mtL4wHd4/A=='::text)
                          -> Index Scan using idx_operation_demande on t_operation constantestableoperation (cost=0.00..5020.68 rows=1246 width=50) (actual time=17.382..17.460 rows=30 loops=1)
                                Index Cond: ((constantestableoperation.id_demande)::text = 'y+3eRapRQjW8mtL4wHd4/A=='::text)
                                Filter: ((constantestableoperation.type_operation)::text = 'acq'::text)
                    -> Bitmap Heap Scan on t_bien_service constantestablebienservice (cost=40.38..2836.96 rows=911 width=29) (actual time=13.511..13.677 rows=6 loops=30)
                          Recheck Cond: ((constantestablebienservice.id_operation)::text = (constantestableoperation.id_tech)::text)
                          -> Bitmap Index Scan on idx_bien_service_operation (cost=0.00..40.15 rows=911 width=0) (actual time=13.144..13.144 rows=6 loops=30)
                                Index Cond: ((constantestablebienservice.id_operation)::text = (constantestableoperation.id_tech)::text)
              -> Index Scan using n_bien_service_pkey on n_bien_service constantestablenbienservice (cost=0.00..4.67 rows=1 width=127) (actual time=0.030..0.031 rows=1 loops=187)
                    Index Cond: (constantestablebienservice.bs_code = constantestablenbienservice.id)
Total runtime: 2.558 ms
(20 rows)

Pre-production:

EXPLAIN ANALYZE
SELECT DISTINCT ConstantesTableNBienService.id, ConstantesTableNBienService.code, ConstantesTableNBienService.lib_code
FROM T_DEMANDE ConstantesTableDemande
LEFT OUTER JOIN T_OPERATION ConstantesTableOperation ON ConstantesTableDemande.id_tech = ConstantesTableOperation.id_demande
LEFT OUTER JOIN T_BIEN_SERVICE ConstantesTableBienService ON ConstantesTableBienService.id_operation = ConstantesTableOperation.id_tech
LEFT OUTER JOIN N_BIEN_SERVICE ConstantesTableNBienService ON ConstantesTableBienService.bs_code = ConstantesTableNBienService.id
WHERE ConstantesTableDemande.id_tech =
Re: [PERFORM] Two different execution plan for the same request
JOUANIN Nicolas (44) wrote:
> Hi, I'm having trouble with some SQL queries that get different execution plans when run on two different servers. One server is the development environment; the other is the pre-production environment. Both servers run PostgreSQL 8.3.0 on Linux, and: both databases contain the same data (pg_dump/pg_restore between servers); the instances have the same configuration parameters; vacuum and analyze are run every day. The only difference I can see is the hardware. The pre-production environment has more RAM, more CPUs, and a RAID5 disk array through LVM where the data is stored.

Hello Jouanin,

Could you give some more information following the guidelines from http://wiki.postgresql.org/wiki/SlowQueryQuestions ? Essential are the contents of both conf files (comments may be removed).

regards,
Yeb Havinga

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Two different execution plan for the same request
JOUANIN Nicolas (44) wrote:
> The strange thing is that this morning EXPLAIN ANALYZE now gives a much better duration. There were no modifications made on the database except a restart yesterday evening and a vacuumdb --analyze run overnight.

If the earlier bad query was run on a freshly imported database that had not been ANALYZEd, that would explain the different, and likely bad, plan. If you want to know for sure that this is the cause, rather than e.g. faulty hardware, you could verify by redoing the import + query without ANALYZE.

regards,
Yeb Havinga
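To make that check concrete: one way to see whether a freshly restored 8.3 database has ever been analyzed is to look at the statistics collector's per-table timestamps. This is only a sketch; it assumes stats collection is enabled, and a NULL timestamp simply means the operation has not been recorded since the restore.

```sql
-- When was each user table last vacuumed/analyzed (manually or by autovacuum)?
-- NULL in all four columns for every table suggests the restored database
-- has never been analyzed, which would explain a bad plan.
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY relname;
```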
Re: [PERFORM] [Slony1-general] WAL partition overloaded--by autovacuum?
On 07/07/10 13:10, Richard Yen wrote:
> This leads me to believe that there was a sudden flurry of write activity, and the process that would flush WAL files to /db/data/ couldn't keep up, thereby filling up the disk. I'm wondering if anyone else out there might be able to give me some insight or comments on my assessment: is it accurate? Any input would be helpful, and I'll try to make the necessary architectural changes to keep this from happening again.

Do you have WAL archiving enabled? (If so, let's see your archive_command.)

Cheers
Mark
Re: [PERFORM] Two different execution plan for the same request
Hi Nicolas,

On Wed, Jul 7, 2010 at 10:47 AM, JOUANIN Nicolas (44) nicolas.joua...@dgfip.finances.gouv.fr wrote:
> There were no modifications made on the database except a restart yesterday evening and a vacuumdb --analyze run overnight.

It's not really surprising, considering you probably kept default_statistics_target at 10 (the default in 8.3). Consider raising it to 100 in your postgresql.conf (100 is the default for newer versions), then reload and run a new ANALYZE. You might need to set it higher on specific columns if you have a lot of data and your data distribution is skewed.

And, btw, please upgrade to the latest 8.3.x.

HTH

--
Guillaume
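Guillaume's advice can be sketched as follows. The table and column names below are only examples, not taken from the original report; the per-column override is for the case where a single column needs a larger statistics sample than the global default.

```sql
-- After setting default_statistics_target = 100 in postgresql.conf
-- and reloading, check the effective value:
SHOW default_statistics_target;

-- Per-column override for a column with many distinct values
-- (example identifiers, chosen here for illustration):
ALTER TABLE t_operation ALTER COLUMN id_demande SET STATISTICS 200;

-- Rebuild the statistics so the planner actually sees the larger sample:
ANALYZE t_operation;
```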
Re: [PERFORM] Two different execution plan for the same request
It seems to work fine (same execution plan and shorter duration) after:
- setting default_statistics_target to 100
- a full vacuum with analyze
- reindexdb

Thanks.
Re: [PERFORM] WAL partition overloaded--by autovacuum?
Richard Yen wrote:
> the pg_xlog partition filled up (33GB)
> checkpoint_segments=16
> a sudden flurry of write activity

Was this database bulk-loaded in some way (for example, by restoring the output of pg_dump)? If so, all rows inserted into all tables would have the same (or very nearly the same) xmin value. At some later time, virtually all tuples would need to be rewritten to freeze them. This is a logged operation, so WAL files would be created rapidly. If you have WAL archiving turned on and you can't copy the files out as fast as they're being created, this might happen.

To avoid such a crushing mass freeze at an unpredictable time, we always run VACUUM FREEZE ANALYZE on a bulk-loaded database before turning on WAL archiving. Of course, if this database wasn't bulk-loaded, this can't be your problem.

-Kevin
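Kevin's scenario can be checked for directly. A sketch of how to spot a looming mass freeze and defuse it after a bulk load (the catalog query is generic; age(relfrozenxid) is available on 8.2 and later):

```sql
-- How old is the oldest unfrozen XID in each table? A bulk-loaded
-- database shows uniformly high, nearly identical ages across tables,
-- which means they will all hit the anti-wraparound freeze together.
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC;

-- Freeze proactively after the bulk load, before enabling WAL archiving,
-- so the rewrite happens at a time of your choosing:
VACUUM FREEZE ANALYZE;
```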
Re: [PERFORM] Two different execution plan for the same request
On Wed, 7 Jul 2010, JOUANIN Nicolas (44) wrote:
> It seems to work fine (same execution plan and less duration) after:
> - setting default_statistics_target to 100
> - full vacuum with analyze

Don't do VACUUM FULL.

Matthew

--
"I suppose some of you have done a Continuous Maths course. Yes? Continuous Maths? <menacing stares from audience> Whoah, it was like that, was it!" -- Computer Science Lecturer
Re: [PERFORM] Two equivalent WITH RECURSIVE queries, one of them slow.
On Mon, Jul 5, 2010 at 2:07 AM, Octavio Alvarez alvar...@alvarezp.ods.org wrote:
> Hello. I have a tree-like table with a three-field PK (name, date, id) and one parent field. It has 5k to 6k records as of now, but it will hold about 1 million records. I am trying the following WITH RECURSIVE query:
>
> WITH RECURSIVE t AS (
>     SELECT par.id AS tid, par.name, par.date, par.id, par.text, par.h_title, par.h_name, par.parent
>     FROM _books.par
>   UNION
>     SELECT t.tid AS pid, p.name, p.date, p.id, p.text, p.h_title, p.h_name, p.parent
>     FROM t, _books.par p
>     WHERE p.name = t.name AND p.date = t.date AND t.id = p.parent
> )
> SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
> FROM t
> WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';
>
> ... which takes 2547.503 ms.
>
> However, if I try the same query but add the same WHERE clause to the non-recursive term, I get much better results:
>
> WITH RECURSIVE t AS (
>     SELECT par.id AS tid, par.name, par.date, par.id, par.text, par.h_title, par.h_name, par.parent
>     FROM _books.par
>     WHERE name = 'cfx' AND date = '2009-08-19' AND par.id = '28340'
>   UNION
>     SELECT t.tid AS pid, p.name, p.date, p.id, p.text, p.h_title, p.h_name, p.parent
>     FROM t, _books.par p
>     WHERE p.name = t.name AND p.date = t.date AND t.id = p.parent
> )
> SELECT t.tid, t.name, t.date, t.id, t.text, t.h_title, t.h_name, t.parent
> FROM t
> WHERE name = 'cfx' AND date = '2009-08-19' AND tid = '28340';
>
> ... which takes 0.221 ms.

If you want the fast plan, you might want to consider reworking your query into a set-returning function. It's pretty easy to do:

create or replace function f(arg int) returns setof something as
$$
  with recursive foo as (
    select * from bar where id = $1
    union all
    [...]
  )
  select * from foo
$$ language sql;

Obviously, a pure view approach would be nicer, but it just isn't going to happen at present. CTEs are currently problematic in general when you need quals in the WITH term, especially in the case of a recursive CTE.
merlin
Re: [PERFORM] big data - slow select (speech search)
Hi Robert,

thank you for your help. I tried clustering the table on hyps_wordid_index and the query execution time dropped from 4.43 to 0.19 seconds, which is not that far from Lucene's performance of 0.10 seconds. Thanks a lot!

Miso Fapso

On 6 July 2010 02:25, Robert Haas robertmh...@gmail.com wrote:
> On Thu, Jul 1, 2010 at 6:34 PM, Michal Fapso michal.fa...@gmail.com wrote:
>> It took about 4.5 seconds. If I rerun it, it takes less than 2 milliseconds, but that is because of the cache. I need to optimize the first run.
>>
>> laptop ASUS, CPU dual core T2300 1.66GHz, 1.5G RAM
>>
>> EXPLAIN ANALYZE SELECT h1.docid FROM hyps AS h1 WHERE h1.wordid=65658;
>>
>> Bitmap Heap Scan on hyps h1 (cost=10.97..677.09 rows=171 width=4) (actual time=62.106..4416.864 rows=343 loops=1)
>>   Recheck Cond: (wordid = 65658)
>>   -> Bitmap Index Scan on hyps_wordid_index (cost=0.00..10.92 rows=171 width=0) (actual time=42.969..42.969 rows=343 loops=1)
>>         Index Cond: (wordid = 65658)
>> Total runtime: 4432.015 ms
>>
>> If I run the same query in the Lucene search engine, it takes 0.105 seconds on the same data, which is quite a huge difference.
>
> So PostgreSQL is reading 343 rows from disk in 4432 ms, or about 12 ms/row. I'm not an expert on seek times, but that might not really be that unreasonable, considering that those rows may be scattered all over the index and thus it may be basically random I/O. Have you tried clustering hyps on hyps_wordid_index? If you had a more sophisticated disk subsystem you could try increasing effective_io_concurrency, but that's not going to help with only one spindle.
>
> If you run the same query in Lucene and it takes only 0.105 s, then Lucene is obviously doing a lot less I/O. I doubt that any amount of tuning of your existing schema is going to produce that kind of result on PostgreSQL. Using the full-text search stuff, or a GIN index of some kind, might get you closer, but it's hard to beat a special-purpose engine that implements exactly the right algorithm for your use case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
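The clustering step the poster applied looks roughly like this. The table and index names are taken from the thread; note that the `CLUSTER ... USING` spelling is the 8.4+ syntax, while older releases spell it `CLUSTER indexname ON tablename`.

```sql
-- Physically reorder the table so that rows with equal wordid are adjacent
-- on disk, turning scattered random reads into mostly sequential ones:
CLUSTER hyps USING hyps_wordid_index;
-- (on PostgreSQL <= 8.3: CLUSTER hyps_wordid_index ON hyps;)

-- CLUSTER rewrites the table, so refresh the planner statistics afterwards:
ANALYZE hyps;
```

Note that CLUSTER is a one-time reordering: rows inserted or updated later are not kept in index order, so the operation may need to be repeated periodically.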
Re: [PERFORM] Weird XFS WAL problem
Greg Smith wrote:
> Kevin Grittner wrote:
>> I don't know at the protocol level; I just know that write barriers do *something* which causes our controllers to wait for actual disk platter persistence, while fsync does not.
>
> It's in the docs now: http://www.postgresql.org/docs/9.0/static/wal-reliability.html
>
> FLUSH CACHE EXT is the ATAPI-6 call that filesystems use to enforce barriers on that type of drive. Here's what the relevant portion of the ATAPI spec says: "This command is used by the host to request the device to flush the write cache. If there is data in the write cache, that data shall be written to the media. The BSY bit shall remain set to one until all data has been successfully written or an error occurs." SAS systems have a similar call named SYNCHRONIZE CACHE.
>
> The improvement I actually expect to arrive here first is a reliable implementation of O_SYNC/O_DSYNC writes. Both SAS and SATA drives that are capable of doing Native Command Queueing support a write type called Force Unit Access, which is essentially just like a direct write that cannot be cached. When we get more kernels with reliable sync writing that maps under the hood to FUA, and can change wal_sync_method to use them, the need to constantly call fsync for every write to the WAL will go away. Then the "blow out the RAID cache when barriers are on" behavior will only show up during checkpoint fsyncs, which will make things a lot better (albeit still not ideal).

Great information! I have added the attached documentation patch to explain the write-barrier/BBU interaction. This will appear in the 9.0 documentation.

--
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

Index: doc/src/sgml/wal.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/wal.sgml,v
retrieving revision 1.66
diff -c -c -r1.66 wal.sgml
*** doc/src/sgml/wal.sgml	13 Apr 2010 14:15:25 -	1.66
--- doc/src/sgml/wal.sgml	7 Jul 2010 13:55:58 -
***************
*** 48,68 ****
      some later time.
      Such caches can be a reliability hazard because the memory in the
      disk controller cache is volatile, and will lose its contents in a
      power failure.  Better controller cards have
!     <firstterm>battery-backed</> caches, meaning the card has a battery
      that maintains power to the cache in case of system power loss.
      After power is restored the data will be written to the disk drives.
     </para>
     <para>
      And finally, most disk drives have caches. Some are write-through
!     while some are write-back, and the
!     same concerns about data loss exist for write-back drive caches as
!     exist for disk controller caches.  Consumer-grade IDE and SATA drives are
!     particularly likely to have write-back caches that will not survive a
!     power failure, though <acronym>ATAPI-6</> introduced a drive cache
!     flush command (FLUSH CACHE EXT) that some file systems use, e.g. <acronym>ZFS</>.
!     Many solid-state drives (SSD) also have volatile write-back
!     caches, and many do not honor cache flush commands by default.
      To check write caching on <productname>Linux</> use <command>hdparm -I</>;
      it is enabled if there is a <literal>*</> next to
      <literal>Write cache</>; <command>hdparm -W</> to turn off
--- 48,74 ----
      some later time.
      Such caches can be a reliability hazard because the memory in the
      disk controller cache is volatile, and will lose its contents in a
      power failure.  Better controller cards have
!     <firstterm>battery-backed unit</> (<acronym>BBU</>) caches, meaning
!     the card has a battery that maintains power to the cache in case of
      system power loss.
      After power is restored the data will be written to the disk drives.
     </para>
     <para>
      And finally, most disk drives have caches. Some are write-through
!     while some are write-back, and the same concerns about data loss
!     exist for write-back drive caches as exist for disk controller
!     caches.  Consumer-grade IDE and SATA drives are particularly likely
!     to have write-back caches that will not survive a power failure,
!     though <acronym>ATAPI-6</> introduced a drive cache flush command
!     (<command>FLUSH CACHE EXT</>) that some file systems use, e.g.
!     <acronym>ZFS</>, <acronym>ext4</>.  (The SCSI command
!     <command>SYNCHRONIZE CACHE</> has long been available.)  Many
!     solid-state drives (SSD) also have volatile write-back caches, and
!     many do not honor cache flush commands by default.
!    </para>
!
!    <para>
      To check write caching on <productname>Linux</> use <command>hdparm -I</>;
      it is enabled if there is a <literal>*</> next to
      <literal>Write cache</>; <command>hdparm -W</> to turn off
***************
*** 83,88 ****
--- 89,113 ----
     </para>

     <para>
+     Many file systems that use write barriers (e.g. <acronym>ZFS</>,
+     <acronym>ext4</>) internally
Re: [PERFORM] [Slony1-general] WAL partition overloaded--by autovacuum?
Sorry, I forgot to mention that archive_mode is off and commented out, and archive_command is '' and commented out. Thanks for following up!

--Richard

On Jul 7, 2010, at 1:58, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:
> Do you have WAL archiving enabled? (If so, let's see your archive_command.)
>
> Cheers
> Mark
Re: [PERFORM] Highly Efficient Custom Sorting
Thanks again for all the input and suggestions. I have this sorting algorithm re-implemented in C now, and it runs in somewhere around 2 ms, though it is difficult to get a more accurate measure. There may be some additional optimizations I can come up with, but for now this will work very well compared to the alternative methods.

On Tue, Jul 6, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Eliot Gable egable+pgsql-performa...@gmail.com writes:
>> Do I need to somehow force the server to unload and then re-load this .so file each time I build a new version of it? If so, how do I do that?
>
> Start a new database session.
>
> regards, tom lane

--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero
Re: [Slony1-general] [PERFORM] WAL partition overloaded--by autovacuum?
On Jul 6, 2010, at 8:25 PM, Scott Marlowe wrote:
> Tell us what you can about your hardware setup.

Sorry, I made the bad assumption that the hardware setup would be irrelevant -- dunno why I thought that. My hardware setup is 2 FusionIO 160GB drives in a RAID-1 configuration, running on an HP DL360 G5.

I think I figured out the problem:
-- I figured that pg_xlog and data/base could both be on the FusionIO drive, since there would be no latency when there are no spindles.
-- However, I didn't take into account the fact that pg_xlog might grow in size when autovacuum does its work when vacuuming to prevent XID wraparound. I *just* discovered this when one of my other replication nodes decided to die on me and fill up its disk.
-- Unfortunately, my db is 114GB (including indexes) or 60GB (without indexes), leaving ~37GB for pg_xlog (since they are sharing a partition).

So I'm guessing what happened was that when autovacuum ran to prevent XID wraparound, it took each table and changed the XIDs, and this got recorded in the WAL, causing WAL to bloat. Is this the correct understanding?

The question for now is, the documentation says: "There will always be at least one WAL segment file, and will normally not be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 files. Each segment file is normally 16 MB (though this size can be altered when building the server). You can use this to estimate space requirements for WAL. Ordinarily, when old log segment files are no longer needed, they are recycled (renamed to become the next segments in the numbered sequence). If, due to a short-term peak of log output rate, there are more than 3 * checkpoint_segments + 1 segment files, the unneeded segment files will be deleted instead of recycled until the system gets back under this limit."

This means my pg_xlog partition should normally hold no more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 = 41 files, or 656MB. Then, if there are more than 49 files, unneeded segment files will be deleted; but in this case all segment files were needed, so they never got deleted.

Perhaps we should add in the docs that pg_xlog should be the size of the DB or larger?

--Richard
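Richard's arithmetic can be restated directly. A sketch using the values quoted in the thread (checkpoint_completion_target = 0.5, checkpoint_segments = 16, 16 MB per segment):

```sql
-- Normal steady-state ceiling on pg_xlog, per the documented formula,
-- and the threshold above which unneeded segments are deleted:
SELECT (2 + 0.5) * 16 + 1        AS normal_max_segment_files,   -- 41
       ((2 + 0.5) * 16 + 1) * 16 AS normal_max_size_mb,         -- 656
       3 * 16 + 1                AS deletion_threshold_files;   -- 49
```

The key caveat, as the thread shows, is that this formula only bounds segments that are no longer *needed*; a long burst of logged activity (such as a mass anti-wraparound freeze) keeps every segment needed and the formula stops being a ceiling.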
Re: [PERFORM] Highly Efficient Custom Sorting
Hi Eliot,

Would you mind posting your code for reference? It is nice to have working examples when trying to figure out how it all fits together.

Regards,
Ken

On Wed, Jul 07, 2010 at 03:23:12PM -0400, Eliot Gable wrote:
> Thanks again for all the input and suggestions. I have this sorting algorithm re-implemented in C now, and it runs in somewhere around 2 ms, though it is difficult to get a more accurate measure. There may be some additional optimizations I can come up with, but for now this will work very well compared to the alternative methods.
Re: [PERFORM] big data - slow select (speech search)
On Wed, Jul 7, 2010 at 9:31 AM, Michal Fapso michal.fa...@gmail.com wrote:
> thank you for your help. I tried clustering the table on hyps_wordid_index and the query execution time dropped from 4.43 to 0.19 seconds, which is not that far from Lucene's performance of 0.10 seconds.

Dang. Nice!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
[PERFORM] Logarithmic data frequency distributions and the query planner
On 8.1, I have a very interesting database where the distributions of some values in a multi-million-row table are logarithmic (i.e. the most frequent value is an order of magnitude more frequent than the next ones). If I analyze the table, the statistics become extremely skewed towards the most frequent values, and this prevents the planner from giving good results on queries that do not target these entries.

In a recent case, the planner estimated that the number of returned rows would be ~13% of the table size, and from this bad assumption generated a very slow, conservative plan that executed in days. If I set the statistics target to zero for that table, the planner uses a hardcoded ratio (seemingly 0.5%) for the number of returned rows, and this helps generate a plan that executes in 3 minutes (still sub-optimal, but not as bad).

Generating partial indexes for the less frequent cases helps, but this solution is not flexible enough for our needs as the number of complex queries grows. We are mostly left with pre-generating a lot of temporary tables whenever the planner over-estimates the number of rows generated by a subquery (query execution was trimmed from 3 minutes to 30 seconds using this technique), or using the OFFSET 0 tweak, but it would be nice if the planner could handle this on its own.

Am I missing something obvious? Setting the statistics for this table to zero seems awkward, even if it gives good results.

Jerry.
Re: [PERFORM] Logarithmic data frequency distributions and the query planner
Jerry Gamache jerry.gama...@idilia.com writes:
> On 8.1, I have a very interesting database where the distributions of some values in a multi-million-row table are logarithmic (i.e. the most frequent value is an order of magnitude more frequent than the next ones). If I analyze the table, the statistics become extremely skewed towards the most frequent values, and this prevents the planner from giving good results on queries that do not target these entries.

Highly skewed distributions are hardly unusual, and I'm not aware that the planner is totally incapable of dealing with them. You do need a large enough stats target to get down into the tail of the distribution (the default target for 8.1 is probably too small for you). It might be that there have been some other relevant improvements since 8.1, too ...

regards, tom lane
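Tom's suggestion can be checked and applied per column. A sketch (the table and column names are hypothetical placeholders, not from Jerry's schema; 1000 is the maximum statistics target on 8.1):

```sql
-- Inspect what the planner currently knows about the skewed column:
-- the most-common values, their estimated frequencies, and the
-- estimated number of distinct values.
SELECT most_common_vals, most_common_freqs, n_distinct
FROM pg_stats
WHERE tablename = 'mytable' AND attname = 'mycolumn';

-- Widen the sample so the MCV list reaches further into the tail
-- of the logarithmic distribution, then rebuild the statistics:
ALTER TABLE mytable ALTER COLUMN mycolumn SET STATISTICS 1000;
ANALYZE mytable;
```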
Re: [PERFORM] performance on new linux box
Ryan Wexler r...@iridiumsuite.com writes:
> PostgreSQL was previously running on a single-CPU Linux machine with 2 gigs of memory and a single SATA drive (v8.3). Basically a desktop with Linux on it. I experienced slow performance. So, I finally moved it to a real server: a dual-Xeon CentOS machine with 6 gigs of memory and RAID 10, Postgres 8.4. But I am now experiencing even worse performance issues.

I'm wondering if you moved to a kernel+filesystem version that actually enforces fsync, from one that didn't. If so, the apparently faster performance on the old box was being obtained at the cost of (lack of) crash safety. That probably goes double for your Windows-box comparison point.

You could try test_fsync from the Postgres sources to confirm that theory, or do some pgbench benchmarking to have more quantifiable numbers. See past discussions about write barriers in this list's archives for more detail.

regards, tom lane
Re: [PERFORM] performance on new linux box
On Wed, Jul 7, 2010 at 4:06 PM, Ryan Wexler r...@iridiumsuite.com wrote:
> PostgreSQL was previously running on a single-CPU Linux machine with 2 gigs of memory and a single SATA drive (v8.3). Basically a desktop with Linux on it. I experienced slow performance. So, I finally moved it to a real server: a dual-Xeon CentOS machine with 6 gigs of memory and RAID 10, Postgres 8.4. But I am now experiencing even worse performance issues.
>
> My system is consistently highly transactional. However, there are also regular complex queries and occasional bulk loads. On the new system the bulk loads are much slower than on the previous machine, and so are the more complex queries. The smaller transactional queries seem comparable, but I had expected an improvement. Performing a db import via psql -d databas -f dbfile illustrates this problem. It takes 5 hours to run this import. By contrast, if I perform this same exact import on my crappy Windows box with only 2 gigs of memory and default Postgres settings, it takes 1 hour. Same deal with the old Linux machine. How is this possible?
>
> Here are some of my key config settings:
> max_connections = 100
> shared_buffers = 768MB
> effective_cache_size = 2560MB
> work_mem = 16MB
> maintenance_work_mem = 128MB
> checkpoint_segments = 7
> checkpoint_timeout = 7min
> checkpoint_completion_target = 0.5
>
> I have tried varying the shared_buffers size from 128 all the way to 1500MB and got basically the same result. Is there a setting change I should be considering? Does 8.4 have performance problems or is this unique to me?
>
> thanks

I think the most likely explanation is that the crappy box lied about fsync'ing data and your server is not. Did you purchase a RAID card with a BBU? If so, can you set the write cache policy to write-back?

--
Rob Wultsch
wult...@gmail.com
Re: [PERFORM] performance on new linux box
On 07/07/2010 06:06 PM, Ryan Wexler wrote:
> PostgreSQL was previously running on a single-CPU Linux machine with 2 gigs of memory and a single SATA drive (v8.3). Basically a desktop with Linux on it. I experienced slow performance. So, I finally moved it to a real server: a dual-Xeon CentOS machine with 6 gigs of memory and RAID 10, Postgres 8.4. But I am now experiencing even worse performance issues.

Yeah, I inherited a "server" (the quotes are sarcastic air quotes) with really bad disk IO... er... really "safe" disk IO. Try the dd test. On my desktop I get 60-70 MB/s. On this "server" (I laugh) I got about 20. I had to go out of my way (way out) to enable the disk caching, and even then only got 50 MB/s.

http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm

-Andy