[PERFORM] Profiler for PostgreSQL
Is there any MS-SQL Server like 'Profiler' available for PostgreSQL? A profiler is a tool that monitors the database server and outputs a detailed trace of all the transactions/queries that are executed on a database during a specified period of time. Kindly let me know if any of you knows of such a tool for PostgreSQL. Agha Asif Raza
Re: [PERFORM] Quad Opteron stuck in the mud
Dan Harris [EMAIL PROTECTED] writes:
> I keep the entire database vacuumed regularly.

How often is regularly? We get frequent posts from people who think daily or every 4 hours is often enough. If the table is very busy you can need vacuums as often as every 15 minutes.

Also, if you've done occasional massive batch updates like you describe here you may need a VACUUM FULL or alternatively a CLUSTER command to compact the table -- vacuum identifies the free space, but if you've doubled the size of your table with a large update that's a lot more free space than you want hanging around waiting to be used.

> For example, as I'm writing this, I am running an UPDATE statement that will affect a small part of the table, and is querying on an indexed boolean field.
> update eventactivity set ftindex = false where ftindex = true;
> ( added the where clause because I don't want to alter where ftindex is null )

It's definitely worthwhile running EXPLAIN on that UPDATE to see if this even used the index. It sounds like it did a sequential scan.

Sequential scans during updates are especially painful. If there isn't free space lying around in the page where the updated record lies then another page has to be used or a new page added. If you're doing a massive update you can exhaust the free space available, making the update have to go back and forth between the page being read and the end of the table where pages are being written.

> # vmstat output ( as I am waiting for this to finish ):
> procs -----------memory---------- ---swap-- ----io---- --system-- ----cpu----
>  r  b   swpd    free   buff   cache  si  so    bi   bo   in   cs us sy id wa
>  0  1   5436 2823908  26140 9183704   0   1  2211  540  694  336  9  2 76 13

[I assume you ran vmstat 10 or some other interval and then waited for at least the second line? The first line output by vmstat is mostly meaningless.]

Um. That's a pretty meager i/o rate, just over 2MB/s. The cpu is 76% idle, which sounds fine, but that could be one processor pegged at 100% while the others are idle.
If this query is the only one running on the system then it would behave just like that. Is it possible you have some foreign keys referencing these records that you're updating? In which case every record being updated might be causing a full table scan on another table (or multiple other tables). If those tables are entirely in cache then it could cause these high cpu low i/o symptoms. Or are there any triggers on this table? -- greg ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
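The maintenance commands Greg mentions, written out for the table in this thread (a hedged sketch in 8.0-era syntax; the index name is taken from the EXPLAIN output later in the thread and may not be the index you actually want to cluster on):

```sql
-- Routine vacuum: marks dead row space reusable, does not shrink the file.
VACUUM ANALYZE eventactivity;

-- After a massive batch update, compact the table. VACUUM FULL moves rows
-- around within the existing file; CLUSTER writes out a whole new table in
-- index order. Both take an exclusive lock on the table while they run.
VACUUM FULL eventactivity;
CLUSTER eventactivity1 ON eventactivity;  -- 8.0 syntax: CLUSTER indexname ON tablename
```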
Re: [PERFORM] Profiler for PostgreSQL
Try turning on query logging and using the 'pqa' utility on pgfoundry.org. Chris

Agha Asif Raza wrote: Is there any MS-SQL Server like 'Profiler' available for PostgreSQL? A profiler is a tool that monitors the database server and outputs a detailed trace of all the transactions/queries that are executed on a database during a specified period of time. Kindly let me know if any of you knows of such a tool for PostgreSQL. Agha Asif Raza
[PERFORM] JFS fastest filesystem for PostgreSQL?
[reposted due to delivery error -jwb]

I just took delivery of a new system, and used the opportunity to benchmark postgresql 8.0 performance on various filesystems. The system in question runs Linux 2.6.12, has one CPU and 1GB of system memory, and 5 7200RPM SATA disks attached to an Areca hardware RAID controller having 128MB of cache. The caches are all write-back.

I ran pgbench with a scale factor of 1000 and a total of 100,000 transactions per run. I varied the number of clients between 10 and 100. It appears from my test JFS is much faster than both ext3 and XFS for this workload. JFS and XFS were made with the mkfs defaults. ext3 was made with -T largefile4 and -E stride=32. The deadline scheduler was used for all runs (anticipatory scheduler is much worse).

Here's the result, in transactions per second.

              ext3   jfs   xfs
 10 Clients     55    81    68
100 Clients     61   100    64

-jwb
Re: [PERFORM] Profiler for PostgreSQL
Agha Asif Raza wrote: Is there any MS-SQL Server like 'Profiler' available for PostgreSQL? A profiler is a tool that monitors the database server and outputs a detailed trace of all the transactions/queries that are executed on a database during a specified period of time. Kindly let me know if any of you knows of such a tool for PostgreSQL. Agha Asif Raza

Sure, see log_statement in postgresql.conf. There are a lot of settings in there to control what is logged.

--
Bruce Momjian                   |  http://candle.pha.pa.us
pgman@candle.pha.pa.us          |  (610) 359-1001
+ If your life is a hard drive, |  13 Roberts Road
+ Christ can be your backup.    |  Newtown Square, Pennsylvania 19073
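Bruce's suggestion, as a postgresql.conf fragment (a hedged example using 8.0-era parameter names; the values shown are one reasonable choice for tracing, not a recommendation for production):

```
# postgresql.conf -- statement tracing (reload the server after editing)
log_statement = 'all'           # none | ddl | mod | all
log_duration = true             # log the run time of each statement
log_line_prefix = '%t [%p] '    # timestamp and backend PID on each line
```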
Re: [PERFORM] performance problems ... 100 cpu utilization
David Mitchell wrote: What is the load average on this machine? Do you do many updates? If you do a lot of updates, perhaps you haven't vacuumed recently. We were seeing similar symptoms when we started load testing our stuff and it turned out we were vacuuming too infrequently.

The load average at the 100% utilization point was about 30! A vacuum analyze was done before the test was started. I believe there are many more selects than updates happening at any one time. Dennis
Re: [PERFORM] JFS fastest filesystem for PostgreSQL?
I was wondering - have you had a chance to run the same benchmarks on ReiserFS (ideally both 3 and 4, with notail)? I'd be quite interested to see how it performs in this situation since it's my fs of choice for most things.

Thanks, Dmitri

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey W. Baker
Sent: Thursday, July 14, 2005 2:34 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] JFS fastest filesystem for PostgreSQL?

[reposted due to delivery error -jwb] I just took delivery of a new system, and used the opportunity to benchmark postgresql 8.0 performance on various filesystems. The system in question runs Linux 2.6.12, has one CPU and 1GB of system memory, and 5 7200RPM SATA disks attached to an Areca hardware RAID controller having 128MB of cache. The caches are all write-back. I ran pgbench with a scale factor of 1000 and a total of 100,000 transactions per run. I varied the number of clients between 10 and 100. It appears from my test JFS is much faster than both ext3 and XFS for this workload. JFS and XFS were made with the mkfs defaults. ext3 was made with -T largefile4 and -E stride=32. The deadline scheduler was used for all runs (anticipatory scheduler is much worse). Here's the result, in transactions per second.

              ext3   jfs   xfs
 10 Clients     55    81    68
100 Clients     61   100    64

-jwb
[PERFORM] slow joining very large table to smaller ones
I'm trying to improve the speed of this query:

explain select recordtext from eventactivity
  inner join ( select incidentid from k_r where id = 94 ) a using ( incidentid )
  inner join ( select incidentid from k_b where id = 107 ) b using ( incidentid );

QUERY PLAN
------------------------------------------------------------------------
 Merge Join  (cost=2747.29..4249364.96 rows=11968693 width=35)
   Merge Cond: (("outer".incidentid)::text = "inner"."?column2?")
   ->  Merge Join  (cost=1349.56..4230052.73 rows=4413563 width=117)
         Merge Cond: (("outer".incidentid)::text = "inner"."?column2?")
         ->  Index Scan using eventactivity1 on eventactivity  (cost=0.00..4051200.28 rows=44519781 width=49)
         ->  Sort  (cost=1349.56..1350.85 rows=517 width=68)
               Sort Key: (k_b.incidentid)::text
               ->  Index Scan using k_b_idx on k_b  (cost=0.00..1326.26 rows=517 width=68)
                     Index Cond: (id = 107)
   ->  Sort  (cost=1397.73..1399.09 rows=542 width=68)
         Sort Key: (k_r.incidentid)::text
         ->  Index Scan using k_r_idx on k_r  (cost=0.00..1373.12 rows=542 width=68)
               Index Cond: (id = 94)
(13 rows)

There are many millions of rows in eventactivity. There are a few ten-thousand rows in k_r and k_b. There is an index on 'incidentid' in all three tables. There should only be less than 100 rows matched in k_r and k_b total. That part on its own is very very fast. But, it should have those 100 or so incidentids extracted in under a second and then go into eventactivity AFTER doing that. At least, that's my intention to make this fast.

Right now, it looks like pg is trying to sort the entire eventactivity table for the merge join, which is taking several minutes to do. Can I rephrase this so that it does the searching through k_r and k_b FIRST and then goes into eventactivity using the index on incidentid? It seems like that shouldn't be too hard to make fast, but my SQL query skills are only average.

Thanks -Dan
Re: [PERFORM] JFS fastest filesystem for PostgreSQL?
On 7/14/05, Jeffrey W. Baker [EMAIL PROTECTED] wrote:
> [reposted due to delivery error -jwb] I just took delivery of a new system, and used the opportunity to benchmark postgresql 8.0 performance on various filesystems. The system in question runs Linux 2.6.12, has one CPU and 1GB of system memory, and 5 7200RPM SATA disks attached to an Areca hardware RAID controller having 128MB of cache. The caches are all write-back. I ran pgbench with a scale factor of 1000 and a total of 100,000 transactions per run. I varied the number of clients between 10 and 100. It appears from my test JFS is much faster than both ext3 and XFS for this workload. JFS and XFS were made with the mkfs defaults. ext3 was made with -T largefile4 and -E stride=32. The deadline scheduler was used for all runs (anticipatory scheduler is much worse). Here's the result, in transactions per second.
>
>               ext3   jfs   xfs
>  10 Clients     55    81    68
> 100 Clients     61   100    64

If you still have a chance, could you do tests with other journaling options for ext3 (journal=writeback, journal=data)? And could you give figures about performance of other IO elevators? I mean, you wrote that anticipatory is much worse -- how much worse? :) Could you give numbers for the deadline, anticipatory and cfq elevators? :)

And, additionally, would it be possible to give numbers for bonnie++ results? To see how pgbench relates to bonnie++?

Regards, Dawid
Re: [PERFORM] JFS fastest filesystem for PostgreSQL?
Quoting Jeffrey W. Baker [EMAIL PROTECTED]:
> Here's the result, in transactions per second.
>
>               ext3   jfs   xfs
>  10 Clients     55    81    68
> 100 Clients     61   100    64

Was fsync true? And have you tried ext2? Legend has it that ext2 is the fastest thing going for synchronous writes (besides O_DIRECT or raw) because there's no journal.

-jwb
Re: [PERFORM] JFS fastest filesystem for PostgreSQL?
Did you separate the data and the transaction log? I've noticed less than optimal performance on xfs if the transaction log is on the xfs data partition, and it's silly to put the xlog on a journaled filesystem anyway. Try putting xlog on an ext2 for all the tests. Mike Stone
Re: [PERFORM] Slow Query
On Thu, 2005-07-14 at 10:06 +1000, Marc McIntyre wrote:
> I'm having a problem with a query that performs a sequential scan on a table when it should be performing an index scan. The interesting thing is, when we dumped the database on another server, it performed an index scan on that server. ...
> The EXPLAIN ANALYZE from the system performing a sequential scan:
> Sort  (cost=30079.79..30079.89 rows=42 width=113) (actual time=39889.989..39890.346 rows=260 loops=1)
> ...
> The EXPLAIN ANALYZE from the system performing an index scan:
> Sort  (cost=16873.64..16873.74 rows=40 width=113) (actual time=2169.905..2169.912 rows=13 loops=1)

It looks like the first query is returning 260 rows, but the second one 13. This may not be your problem, but are you sure you are using the same query on the same data here?

gnari
Re: [PERFORM] JFS fastest filesystem for PostgreSQL?
Dawid Kuroczko wrote:
| If you still have a chance, could you do tests with other journaling
| options for ext3 (journal=writeback, journal=data)? And could you
| give figures about performance of other IO elevators? I mean, you
| wrote that anticipatory is much worse -- how much worse? :) Could
| you give numbers for the deadline, anticipatory and cfq elevators? :)
|
| And, additionally, would it be possible to give numbers for bonnie++
| results? To see how pgbench relates to bonnie++?

Hello, list. I've been thinking on this one for a while - I'm not sure as to what ratio pgbench has with regard to stressing CPU vs. I/O. There is one thing that's definitely worth mentioning though: in the tests that I've been doing with bonnie++ and iozone at my former job, while building a distributed indexing engine, jfs was the one filesystem with the least strain on the CPU, which might be one of the deciding factors in making it look good for a particular workload. I'm afraid I don't have any concrete figures to offer as the material itself was classified.

I can tell though that we've been comparing it with both ext2 and ext3, as well as xfs, and notably, xfs was the worst CPU hog of all. The CPU load difference between jfs and xfs was about 10% in favor of jfs in all random read/write tests, and the interesting thing is, jfs managed to shuffle around quite a lot of data: the MBps/cpu% ratio in xfs was much worse. As expected, there wasn't much difference in block transfer tests, but jfs was slightly winning in the area of CPU consumption and slightly lagging in the transfer rate field.
What is a little bit concerning though, is the fact that some Linux distributors like SuSE have removed jfs support from their admin tooling "due to technical problems with jfs" (http://your-local-suse-mirror/.../suse/i386/9.3/docu/RELEASE-NOTES.en.html#14)

I'm curious as to what this means - did they have problems integrating it into their toolchain or are there actual problems going on in jfs currently?

Kind regards,
Grega Bremec
gregab at p0f dot net
Re: [PERFORM] JFS fastest filesystem for PostgreSQL?
On Thu, 2005-07-14 at 10:03 +0200, Dawid Kuroczko wrote:
> [benchmark setup quoted in full upthread]
> If you still have a chance, could you do tests with other journaling options for ext3 (journal=writeback, journal=data)? And could you give figures about performance of other IO elevators? I mean, you wrote that anticipatory is much worse -- how much worse? :) Could you give numbers for the deadline, anticipatory and cfq elevators? :)
> And, additionally, would it be possible to give numbers for bonnie++ results? To see how pgbench relates to bonnie++?

Phew, that's a lot of permutations. At 20-30 minutes per run, I'm thinking 5-8 hours or so. Still, for you dear readers, I'll somehow accomplish this tedious feat.

As for Bonnie, JFS is a good 60-80% faster than ext3. See my message to ext3-users yesterday. Using bonnie++ with a 10GB fileset, in MB/s:

          ext3   jfs   xfs
Read       112   188   141
Write       97   157   167
Rewrite     51    71    60

-jwb
Re: [PERFORM] JFS fastest filesystem for PostgreSQL?
On Thu, Jul 14, 2005 at 02:15:52PM +0200, Grega Bremec wrote: I'm curious as to what this means - did they have problems integrating it into their toolchain or are there actual problems going on in jfs currently?

I've found jfs to be the least stable linux filesystem and won't allow it anywhere near an important system. YMMV. Mike Stone
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote:
> I'm trying to improve the speed of this query:
> explain select recordtext from eventactivity
>   inner join ( select incidentid from k_r where id = 94 ) a using ( incidentid )
>   inner join ( select incidentid from k_b where id = 107 ) b using ( incidentid );

You might try giving it a little bit more freedom with:

EXPLAIN ANALYZE
SELECT recordtext
  FROM eventactivity, k_r, k_b
 WHERE eventactivity.incidentid = k_r.incidentid
   AND eventactivity.incidentid = k_b.incidentid
   AND k_r.id = 94
   AND k_b.id = 107
-- AND k_r.incidentid = k_b.incidentid
;

I'm pretty sure that would give identical results, just let the planner have a little bit more freedom about how it does it. Also the last line is commented out, because I think it is redundant.

You might also try:

EXPLAIN ANALYZE
SELECT recordtext
  FROM eventactivity
  JOIN k_r USING (incidentid)
  JOIN k_b USING (incidentid)
 WHERE k_r.id = 94
   AND k_b.id = 107;

Also, if possible give us the EXPLAIN ANALYZE so that we know if the planner is making accurate estimates. (You might send an EXPLAIN while waiting for the EXPLAIN ANALYZE to finish.) You can also try disabling merge joins, and see how that changes things.

> [query plan quoted in full upthread]
> There are many millions of rows in eventactivity. There are a few ten-thousand rows in k_r and k_b.
> There is an index on 'incidentid' in all three tables. There should only be less than 100 rows matched in k_r and k_b total. That part on its own is very very fast. But, it should have those 100 or so incidentids extracted in under a second and then go into eventactivity AFTER doing that. At least, that's my intention to make this fast.

Well, postgres is estimating around 500 rows each, is that way off? Try just doing:

EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107;
EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94;

And see if postgres estimates the number of rows properly. I assume you have recently VACUUM ANALYZEd, which means you might need to update the statistics target (ALTER TABLE k_b ALTER COLUMN incidentid SET STATISTICS 100). The default is IIRC 10; it ranges from 1-1000; higher is more accurate, but makes ANALYZE slower.

> Right now, it looks like pg is trying to sort the entire eventactivity table for the merge join which is taking several minutes to do. Can I rephrase this so that it does the searching through k_r and k_b FIRST and then go into eventactivity using the index on incidentid? It seems like that shouldn't be too hard to make fast but my SQL query skills are only average.

To me, it looks like it is doing an index scan (on k_b.id) through k_b first, sorting the results by incidentid, then merge joining that with eventactivity. I'm guessing you actually want it to merge k_b and k_r to get extra selectivity before joining against eventactivity. I think my alternate forms would let postgres realize this. But if not, you could try:

EXPLAIN ANALYZE
SELECT recordtext
  FROM eventactivity
  JOIN (SELECT incidentid
          FROM k_r JOIN k_b USING (incidentid)
         WHERE k_r.id = 94 AND k_b.id = 107) ab
 USING (incidentid);

(Note the subquery needs an alias.) I don't know how selective your keys are, but one of these queries should probably structure it better for the planner. It depends a lot on how selective your query is.
If you have 100M rows, the above query looks like it expects k_r to restrict it to 44M rows, and k_r + k_b down to 11M rows, which really should be a seq scan (matching more than roughly 10% of the rows generally means the planner prefers a seq scan). But if you are saying the selectivity is mis-estimated it could be different.

John =:-

> Thanks -Dan
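John's statistics-target suggestion, written out as a sequence (the table and column names come from the thread; 100 is just an example value in the 1-1000 range):

```sql
-- Raise the statistics target so ANALYZE samples more distinct values for
-- these columns, then re-analyze so the planner sees the new estimates.
ALTER TABLE k_b ALTER COLUMN incidentid SET STATISTICS 100;
ALTER TABLE k_r ALTER COLUMN incidentid SET STATISTICS 100;
ANALYZE k_b;
ANALYZE k_r;

-- Then re-check whether the row estimates match reality:
EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107;
EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94;
```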
Re: [PERFORM] Quad Opteron stuck in the mud
On Thu, Jul 14, 2005 at 12:28:05AM -0600, Dan Harris wrote:
> Ok, so I remounted this drive as ext2 shortly before sending my first email today. It wasn't enough time for me to notice the ABSOLUTELY HUGE difference in performance change. Ext3 must really be crappy for postgres, or at least is on this box. Now that it's ext2, this thing is flying like never before. My CPU utilization has skyrocketed, telling me that the disk IO was constraining it immensely.

Were you using the default journal settings for ext3? An interesting experiment would be to use the other journal options (particularly data=writeback). From the mount manpage:

  data=journal / data=ordered / data=writeback
      Specifies the journalling mode for file data. Metadata is always
      journaled. To use modes other than ordered on the root file system,
      pass the mode to the kernel as boot parameter, e.g.
      rootflags=data=journal.

      journal
          All data is committed into the journal prior to being written
          into the main file system.

      ordered
          This is the default mode. All data is forced directly out to
          the main file system prior to its metadata being committed to
          the journal.

      writeback
          Data ordering is not preserved - data may be written into the
          main file system after its metadata has been committed to the
          journal. This is rumoured to be the highest-throughput option.
          It guarantees internal file system integrity, however it can
          allow old data to appear in files after a crash and journal
          recovery.

--
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
"Officer Krupke, what are we to do? Gee, officer Krupke, Krup you!"
(West Side Story, "Gee, Officer Krupke")
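Alvaro's data=writeback suggestion as an /etc/fstab line (a hedged sketch: the device and mount point are invented, and writeback trades crash-safety of file contents for throughput, so weigh that for a database partition):

```
# /etc/fstab -- ext3 with metadata-only journaling for the Postgres data area
/dev/sdb1   /var/lib/pgsql   ext3   noatime,data=writeback   0 2
```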
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 14, 2005, at 9:47 AM, Alvaro Herrera wrote:
> On Thu, Jul 14, 2005 at 12:28:05AM -0600, Dan Harris wrote:
> > ... Ext3 must really be crappy for postgres, or at least is on this box.
> Were you using the default journal settings for ext3?

Yes, I was. Next time I get a chance to reboot this box, I will try writeback and compare the benchmarks to my previous config. Thanks for the tip.
Re: [PERFORM] lots of updates on small table
On Thu, Jul 14, 2005 at 03:08:30PM +1000, Alison Winters wrote:
> Hi, Our application requires a number of processes to select and update rows from a very small (10 rows) Postgres table on a regular and frequent basis. These processes often run for weeks at a time, but over the space of a few days we find that updates start getting painfully slow. We are running a full vacuum/analyze and reindex on the table every day,

Full vacuum, eh? I wonder if what you really need is very frequent non-full vacuum. Say, once in 15 minutes (exact rate depending on dead tuple rate.)

--
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
"World domination is proceeding according to plan" (Andrew Morton)
Re: [PERFORM] lots of updates on small table
On Thu, 2005-07-14 at 15:08 +1000, Alison Winters wrote:
> Hi, Our application requires a number of processes to select and update rows from a very small (10 rows) Postgres table on a regular and frequent basis. These processes often run for weeks at a time, but over the

Are these long running transactions or is the process issuing many short transactions? If your transaction lasts a week, then a daily vacuum isn't really doing anything. I presume you also run ANALYZE in some shape or form periodically?

> space of a few days we find that updates start getting painfully slow. We are running a full vacuum/analyze and reindex on the table every day,

If they're short transactions, run vacuum (not vacuum full) every 100 or so updates. This might even be once a minute. Analyze periodically as well.
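A crontab sketch of the frequent plain (non-FULL) vacuum both replies suggest; the database and table names here are invented placeholders, and the right interval depends on your dead-tuple rate:

```
# crontab -- plain VACUUM ANALYZE on the 10-row hot table every 15 minutes
*/15 * * * *  psql -d appdb -c 'VACUUM ANALYZE control_table'
```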
Re: [PERFORM] JFS fastest filesystem for PostgreSQL?
On Wed, Jul 13, 2005 at 11:33:41PM -0700, Jeffrey W. Baker wrote:
> [reposted due to delivery error -jwb]
> I just took delivery of a new system, and used the opportunity to benchmark postgresql 8.0 performance on various filesystems. [...]
> Here's the result, in transactions per second.
>
>               ext3   jfs   xfs
>  10 Clients     55    81    68
> 100 Clients     61   100    64

BTW, it'd be interesting to see how UFS on FreeBSD compared.

--
Jim C. Nasby, Database Consultant  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?
Re: [PERFORM] large table vs multiple smal tables
On Wed, Jul 13, 2005 at 12:08:54PM +0200, Nicolas Beaume wrote:
> Hello, I have a large database with 4 large tables (each containing at least 200 000 rows, perhaps even 1 or 2 million) and I ask myself if it's better to split them into small tables (e.g. tables of 2000 rows) to speed the access and the update of those tables (considering that I will have few updates but a lot of reading).

2 million rows is nothing unless you're on a 486 or something. As for your other question, remember the first rule of performance tuning: don't tune unless you actually need to.

--
Jim C. Nasby, Database Consultant  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?
Re: [PERFORM] Profiler for PostgreSQL
On Thu, 2005-07-14 at 14:29 +0800, Christopher Kings-Lynne wrote:
> Try turning on query logging and using the 'pqa' utility on pgfoundry.org.

Have you got that to work for 8.0? pqa 1.5 doesn't even work with its own test file.

Best Regards, Simon Riggs
Re: [PERFORM] Quad Opteron stuck in the mud
Dan Harris [EMAIL PROTECTED] writes:
> Well, once every day, but there aren't a ton of inserts or updates going on a daily basis. Maybe 1,000 total inserts?

It's actually deletes and updates that matter, not inserts.

> I have a feeling I'm going to need to do a cluster soon. I have done several mass deletes and reloads on it.

CLUSTER effectively does a VACUUM FULL but takes a different approach and writes out a whole new table, which if there's lots of free space is faster than moving records around to compact the table.

> I tried that, and indeed it was using an index, although after reading Simon's post, I realize that was kind of dumb to have an index on a bool. I have since removed it.

If there are very few records (like well under 10%) with that column equal to false (or very few equal to true) then it's not necessarily useless. But probably more useful is a partial index on some other column. Something like CREATE INDEX ON pk WHERE flag = false;

> No foreign keys or triggers.

Note that I'm talking about foreign keys in *other* tables that refer to columns in this table. Every update on this table would have to scan those other tables looking for records referencing the updated rows.

> Ok, so I remounted this drive as ext2 shortly before sending my first email today. It wasn't enough time for me to notice the ABSOLUTELY HUGE difference in performance change. Ext3 must really be crappy for postgres, or at least is on this box. Now that it's ext2, this thing is flying like never before. My CPU utilization has skyrocketed, telling me that the disk IO was constraining it immensely. I always knew that it might be a little faster, but the box feels like it can breathe again and things that used to be IO intensive and run for an hour or more are now running in 5 minutes. I'm a little worried about not having a journalized file system, but that performance difference will keep me from switching back ( at least to ext3! ). Maybe someday I will try XFS.

@spock(Fascinating).
I wonder if ext3 might be issuing IDE cache flushes on every fsync (to sync the journal) whereas ext2 might not be issuing any cache flushes at all. If the IDE cache is never being flushed then you'll see much better performance but run the risk of data loss in a power failure or hardware failure. (But not in the case of an OS crash, or at least no more than otherwise.) You could also try using the -O journal_dev option to put the ext3 journal on a separate device. -- greg ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] slow joining very large table to smaller ones
On Thu, Jul 14, 2005 at 04:29:58PM -0600, Dan Harris wrote: Ok, I tried this one. My ssh keeps getting cut off by a router somewhere between me and the server due to inactivity timeouts, so all I know is that both the select and explain analyze are taking over an hour to run. Try running the query as a script with nohup and redirect the output to a file. Mike Stone ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
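Mike's suggestion could look like the following; the database and file names are hypothetical, this is a sketch rather than the poster's exact command:

```shell
# Run the slow EXPLAIN ANALYZE from a file in the background, detached
# from the terminal, so an SSH inactivity timeout cannot kill it.
nohup psql -d mydb -f slow_query.sql > query_output.txt 2>&1 &

# Check on it later (or after reconnecting):
tail query_output.txt
```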
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote: On Jul 14, 2005, at 9:42 AM, John A Meinel wrote: ... Did you try doing this to see how good the planner's selectivity estimates are? Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; These should be fast queries. John =:-
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris [EMAIL PROTECTED] writes: Here's the explain select for that one, since that's the best I can get. explain select recordtext from eventactivity,k_r,k_b where eventactivity.incidentid = k_r.incidentid and eventactivity.incidentid = k_b.incidentid and k_r.id = 94 and k_b.id = 107;

                              QUERY PLAN
------------------------------------------------------------------
 Merge Join  (cost=9624.61..4679590.52 rows=151009549 width=35)
   Merge Cond: ((outer.incidentid)::text = inner.?column2?)
   ->  Merge Join  (cost=4766.92..4547684.26 rows=16072733 width=117)
         Merge Cond: ((outer.incidentid)::text = inner.?column2?)
         ->  Index Scan using eventactivity1 on eventactivity  (cost=0.00..4186753.16 rows=46029271 width=49)
         ->  Sort  (cost=4766.92..4771.47 rows=1821 width=68)
               Sort Key: (k_b.incidentid)::text
               ->  Index Scan using k_b_idx on k_b  (cost=0.00..4668.31 rows=1821 width=68)
                     Index Cond: (id = 107)
   ->  Sort  (cost=4857.69..4862.39 rows=1879 width=68)
         Sort Key: (k_r.incidentid)::text
         ->  Index Scan using k_r_idx on k_r  (cost=0.00..4755.52 rows=1879 width=68)
               Index Cond: (id = 94)
(13 rows)

There's something awfully fishy here. The 8.0 planner is definitely capable of figuring out that it ought to join the smaller relations first. As an example, using 8.0.3+ (CVS branch tip) I did

regression=# create table eventactivity(incidentid varchar, recordtext text);
CREATE TABLE
regression=# create table k_r(incidentid varchar);
CREATE TABLE
regression=# create table k_b(incidentid varchar);
CREATE TABLE
regression=# explain select recordtext from eventactivity inner join (select incidentid from k_r) a using (incidentid) inner join (select incidentid from k_b) b using (incidentid);

(Being too impatient to actually fill the eventactivity table with 36M rows of data, I just did some debugger magic to make the planner think that that was the table size...) The default plan looks like

 Merge Join  (cost=16137814.70..36563453.23 rows=136170 width=32)
   Merge Cond: ((outer.incidentid)::text = inner.?column3?)
   ->  Merge Join  (cost=170.85..290.48 rows=7565 width=64)
         Merge Cond: (outer.?column2? = inner.?column2?)
         ->  Sort  (cost=85.43..88.50 rows=1230 width=32)
               Sort Key: (k_r.incidentid)::text
               ->  Seq Scan on k_r  (cost=0.00..22.30 rows=1230 width=32)
         ->  Sort  (cost=85.43..88.50 rows=1230 width=32)
               Sort Key: (k_b.incidentid)::text
               ->  Seq Scan on k_b  (cost=0.00..22.30 rows=1230 width=32)
   ->  Sort  (cost=16137643.84..16227643.84 rows=3600 width=64)
         Sort Key: (eventactivity.incidentid)::text
         ->  Seq Scan on eventactivity  (cost=0.00..108.00 rows=3600 width=64)

and if I SET enable_mergejoin TO 0; I get

 Hash Join  (cost=612.54..83761451.54 rows=136170 width=32)
   Hash Cond: ((outer.incidentid)::text = (inner.incidentid)::text)
   ->  Seq Scan on eventactivity  (cost=0.00..108.00 rows=3600 width=64)
   ->  Hash  (cost=504.62..504.62 rows=7565 width=64)
         ->  Hash Join  (cost=25.38..504.62 rows=7565 width=64)
               Hash Cond: ((outer.incidentid)::text = (inner.incidentid)::text)
               ->  Seq Scan on k_r  (cost=0.00..22.30 rows=1230 width=32)
               ->  Hash  (cost=22.30..22.30 rows=1230 width=32)
                     ->  Seq Scan on k_b  (cost=0.00..22.30 rows=1230 width=32)

which is the plan I would judge Most Likely To Succeed based on what we know about Dan's problem. (The fact that the planner is estimating it as twice as expensive as the mergejoin comes from the fact that with no statistics about the join keys, the planner deliberately estimates hash join as expensive, because it can be pretty awful in the presence of many equal keys.) So the planner is certainly capable of finding the desired plan, even without any tweaking of the query text. This means that what we have is mainly a statistical problem. Have you ANALYZEd these tables recently? If so, may we see the pg_stats rows for incidentid in all three tables? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote: On Jul 14, 2005, at 9:42 AM, John A Meinel wrote: You might try giving it a little bit more freedom with: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity, k_r, k_b WHERE eventactivity.incidentid = k_r.incidentid AND eventactivity.incidentid = k_b.incidentid AND k_r.id = 94 AND k_b.id = 107 -- AND k_r.incidentid = k_b.incidentid ; I'm pretty sure that would give identical results, just let the planner have a little bit more freedom about how it does it. Also the last line is commented out, because I think it is redundant. Ok, I tried this one. My ssh keeps getting cut off by a router somewhere between me and the server due to inactivity timeouts, so all I know is that both the select and explain analyze are taking over an hour to run. Here's the explain select for that one, since that's the best I can get. explain select recordtext from eventactivity,k_r,k_b where eventactivity.incidentid = k_r.incidentid and eventactivity.incidentid = k_b.incidentid and k_r.id = 94 and k_b.id = 107;

                              QUERY PLAN
------------------------------------------------------------------
 Merge Join  (cost=9624.61..4679590.52 rows=151009549 width=35)
   Merge Cond: ((outer.incidentid)::text = inner.?column2?)
   ->  Merge Join  (cost=4766.92..4547684.26 rows=16072733 width=117)
         Merge Cond: ((outer.incidentid)::text = inner.?column2?)
         ->  Index Scan using eventactivity1 on eventactivity  (cost=0.00..4186753.16 rows=46029271 width=49)
         ->  Sort  (cost=4766.92..4771.47 rows=1821 width=68)
               Sort Key: (k_b.incidentid)::text
               ->  Index Scan using k_b_idx on k_b  (cost=0.00..4668.31 rows=1821 width=68)
                     Index Cond: (id = 107)
   ->  Sort  (cost=4857.69..4862.39 rows=1879 width=68)
         Sort Key: (k_r.incidentid)::text
         ->  Index Scan using k_r_idx on k_r  (cost=0.00..4755.52 rows=1879 width=68)
               Index Cond: (id = 94)
(13 rows)

If anything, the estimations have gotten worse. As now it thinks there will be 1800 rows returned each, whereas you were thinking it would be more around 100. Since you didn't say, you did VACUUM ANALYZE recently, right? ... 
You can also try disabling merge joins, and see how that changes things. Are there any negative side effects of doing this? If the planner is estimating things correctly, you want to give it the most flexibility of plans to pick from, because sometimes a merge join is faster (postgres doesn't pick things because it wants to go slower). The only reason for the disable flags is that sometimes the planner doesn't estimate correctly. Usually disabling a method is not the final solution, but a way to try out different methods and see what happens to the results. Using: SET enable_mergejoin TO off; you can disable it just for the current session (not for the entire database). Which is the recommended way if you have a query that postgres is messing up on. (Usually it is correct elsewhere.) Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; Once again, do this and post the results. We might just need to tweak your settings so that it estimates the number of rows correctly, and we don't need to do anything else. And see if postgres estimates the number of rows properly. I assume you have recently VACUUM ANALYZEd, which means you might need to update the statistics target (ALTER TABLE k_b ALTER COLUMN incidentid SET STATISTICS 100) - the default is IIRC 10, the range is 1-1000; higher is more accurate, but makes ANALYZE slower. ... EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN (SELECT incidentid FROM k_r JOIN k_b USING (incidentid) WHERE k_r.id = 94 AND k_b.id = 107) USING (incidentid); This one looks like the same plan as the others: explain select recordtext from eventactivity join ( select incidentid from k_r join k_b using (incidentid) where k_r.id = 94 and k_b.id = 107 ) a using (incidentid ); Well, the planner is powerful enough to flatten nested selects. 
To make it less intelligent you can do: SET join_collapse_limit TO 1; (1 is the minimum value), which should tell postgres to not try and get tricky with your query. Again, *usually* the planner knows better than you do, so again just do it to see what you get. The problem is that if you are only using EXPLAIN SELECT, you will probably get something which *looks* worse. Because if it looked better, the planner would have used it. That is why you really need the EXPLAIN ANALYZE, so that you can see where the planner is incorrect in its estimates.
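Putting John's session-local advice together, an experiment might look like this in psql (the query is the one from the thread; RESET restores normal planning for the session, and nothing is persisted):

```sql
-- Try alternative join methods for this session only; other
-- connections are unaffected.
SET enable_mergejoin TO off;
SET join_collapse_limit TO 1;

EXPLAIN ANALYZE
SELECT recordtext
FROM eventactivity, k_r, k_b
WHERE eventactivity.incidentid = k_r.incidentid
  AND eventactivity.incidentid = k_b.incidentid
  AND k_r.id = 94
  AND k_b.id = 107;

-- Put the planner back to its defaults when done experimenting:
RESET enable_mergejoin;
RESET join_collapse_limit;
```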
Re: [PERFORM] slow joining very large table to smaller ones
John A Meinel [EMAIL PROTECTED] writes: What I don't understand is that the planner is actually estimating that joining against the new table is going to *increase* the number of returned rows. It evidently thinks that incidentid in the k_r table is pretty nonunique. We really need to look at the statistics data to see what's going on. regards, tom lane
Re: [PERFORM] slow joining very large table to smaller ones
Tom Lane wrote: John A Meinel [EMAIL PROTECTED] writes: What I don't understand is that the planner is actually estimating that joining against the new table is going to *increase* the number of returned rows. It evidently thinks that incidentid in the k_r table is pretty nonunique. We really need to look at the statistics data to see what's going on. regards, tom lane Okay, sure. What about doing this, then: EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN (SELECT DISTINCT incidentid FROM k_r JOIN k_b USING (incidentid) WHERE k_r.id = ?? AND k_b.id = ??) USING (incidentid) ; Since I assume that eventactivity is the only table with recordtext, and you don't get any columns from k_r and k_b, it would be pointless to get duplicate incidentids. I may be misunderstanding what the query is trying to do, but depending on what is in k_r and k_b, is it possible to use a UNIQUE INDEX rather than just an index on incidentid? There is also the possibility of EXPLAIN ANALYZE SELECT recordtext FROM eventactivity JOIN (SELECT incidentid FROM k_r WHERE k_r.id = ?? UNION SELECT incidentid FROM k_b WHERE k_b.id = ??) USING (incidentid) ; But both of these would mean that you don't actually want columns from k_r or k_b, just a unique list of incident ids. But first, I agree, we should make sure the pg_stats values are reasonable. John =:-
Re: [PERFORM] lots of updates on small table
Hi, Our application requires a number of processes to select and update rows from a very small (10 rows) Postgres table on a regular and frequent basis. These processes often run for weeks at a time, but over the space of a few days we find that updates start getting painfully slow. We are running a full vacuum/analyze and reindex on the table every day, Full vacuum, eh? I wonder if what you really need is very frequent non-full vacuum. Say, once in 15 minutes (exact rate depending on dead tuple rate.) Is there a difference between vacuum and vacuum full? Currently we have a cron job going every hour that does:

VACUUM FULL VERBOSE ANALYZE plc_fldio
REINDEX TABLE plc_fldio

The most recent output was this:

INFO: --Relation public.plc_fldio--
INFO: Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256. CPU 0.04s/0.14u sec elapsed 0.18 sec.
INFO: Index plcpage_idx: Pages 315; Tuples 108137: Deleted 4176. CPU 0.03s/0.04u sec elapsed 0.14 sec.
INFO: Rel plc_fldio: Pages: 1221 -- 1221; Tuple(s) moved: 0. CPU 0.03s/0.04u sec elapsed 0.36 sec.
INFO: Analyzing public.plc_fldio
VACUUM
REINDEX

We'll up it to every 15 minutes, but I don't know if that'll help, because even with the current vacuuming the updates are still getting slower and slower over the course of several days. What really puzzles me is why restarting the processes fixes it. Does PostgreSQL keep some kind of backlog of transactions all for one database connection? Isn't it normal to have processes that keep a single database connection open for days at a time? Regarding the question another poster asked: all the transactions are very short. The table is essentially a database replacement for a shared memory segment - it contains a few rows of byte values that are constantly updated byte-at-a-time to communicate data between different industrial control processes. 
Thanks for the thoughts everyone, Alison
Re: [PERFORM] lots of updates on small table
[EMAIL PROTECTED] (Alison Winters) writes: Our application requires a number of processes to select and update rows from a very small (10 rows) Postgres table on a regular and frequent basis. These processes often run for weeks at a time, but over the space of a few days we find that updates start getting painfully slow. No wonder, considering that your less-than-10-rows table contains something upwards of 100,000 tuples: INFO: --Relation public.plc_fldio-- INFO: Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256. CPU 0.04s/0.14u sec elapsed 0.18 sec. What you need to do is find out why VACUUM is unable to reclaim all those dead row versions. The reason is likely that some process is sitting on an open transaction for days at a time. Isn't it normal to have processes that keep a single database connection open for days at a time? Database connection, sure. Single transaction, no. Regarding the question another poster asked: all the transactions are very short. Somewhere you have one that isn't. Try watching the backends with ps, or look at the pg_stat_activity view if your version of PG has it, to see which sessions are staying idle in transaction indefinitely. regards, tom lane
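The pg_stat_activity check Tom suggests might look like this on a PostgreSQL of that era; note that the column names below (procpid, current_query) are from the 8.x catalogs and were later renamed (pid, state, query), so treat this as a sketch to adapt to your version:

```sql
-- Sessions sitting idle inside an open transaction are the ones that
-- prevent VACUUM from reclaiming dead row versions.
SELECT procpid, usename, query_start, current_query
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction';
```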
Re: [PERFORM] lots of updates on small table
Alison Winters wrote: Hi, Our application requires a number of processes to select and update rows from a very small (10 rows) Postgres table on a regular and frequent basis. These processes often run for weeks at a time, but over the space of a few days we find that updates start getting painfully slow. We are running a full vacuum/analyze and reindex on the table every day, Full vacuum, eh? I wonder if what you really need is very frequent non-full vacuum. Say, once in 15 minutes (exact rate depending on dead tuple rate.) Is there a difference between vacuum and vacuum full? Currently we have a cron job going every hour that does: VACUUM FULL VERBOSE ANALYZE plc_fldio REINDEX TABLE plc_fldio VACUUM FULL exclusively locks the table (so that nothing else can happen) and then compacts it as much as it can. You almost definitely want to only VACUUM every 15min, maybe VACUUM FULL once a day. VACUUM FULL is more for when you haven't been VACUUMing often enough, or have major changes to your table. Basically VACUUM marks rows as empty and available for reuse; VACUUM FULL removes empty space (but requires a full lock, because it is moving rows around). If anything, I would estimate that VACUUM FULL would be hurting your performance. But it may happen fast enough not to matter. The most recent output was this: INFO: --Relation public.plc_fldio-- INFO: Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256. CPU 0.04s/0.14u sec elapsed 0.18 sec. INFO: Index plcpage_idx: Pages 315; Tuples 108137: Deleted 4176. CPU 0.03s/0.04u sec elapsed 0.14 sec. INFO: Rel plc_fldio: Pages: 1221 -- 1221; Tuple(s) moved: 0. CPU 0.03s/0.04u sec elapsed 0.36 sec. 
INFO: Analyzing public.plc_fldio VACUUM REINDEX We'll up it to every 15 minutes, but i don't know if that'll help because even with the current vacuuming the updates are still getting slower and slower over the course of several days. What really puzzles me is why restarting the processes fixes it. Does PostgreSQL keep some kind of backlog of transactions all for one database connection? Isn't it normal to have processes that keep a single database connection open for days at a time? I believe that certain locks are grabbed per session. Or at least there is some command that you can run, which you don't want to run in a maintained connection. (It might be VACUUM FULL, I don't remember which one it is). But the fact that your application works at all seems to be that it isn't acquiring any locks. I know VACUUM cannot clean up any rows that are visible in one of the transactions, I don't know if this includes active connections or not. Regarding the question another poster asked: all the transactions are very short. The table is essentially a database replacement for a shared memory segment - it contains a few rows of byte values that are constantly updated byte-at-a-time to communicate data between different industrial control processes. Thanks for the thoughts everyone, Alison Is it possible to have some sort of timer that would recognize it has been connected for too long, drop the database connection, and reconnect? I don't know that it would solve anything, but it would be something you could try. John =:-
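The vacuuming schedule discussed above (frequent plain VACUUM, occasional VACUUM FULL) could be expressed as cron entries along these lines; the database name is hypothetical and the exact frequencies are assumptions to be tuned against the observed dead-tuple rate:

```shell
# crontab sketch: plain VACUUM ANALYZE every 15 minutes (no exclusive
# lock, just marks dead rows reusable), and a single VACUUM FULL plus
# REINDEX off-peak once a day (takes an exclusive lock, compacts pages).
*/15 * * * * psql -d mydb -c 'VACUUM ANALYZE plc_fldio;'
30 3 * * *   psql -d mydb -c 'VACUUM FULL ANALYZE plc_fldio; REINDEX TABLE plc_fldio;'
```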
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 5:12 PM, John A Meinel wrote: Dan Harris wrote: Well, postgres is estimating around 500 rows each, is that way off? Try just doing: EXPLAIN ANALYZE SELECT incidentid FROM k_b WHERE id = 107; EXPLAIN ANALYZE SELECT incidentid FROM k_r WHERE id = 94; Once again, do this and post the results. We might just need to tweak your settings so that it estimates the number of rows correctly, and we don't need to do anything else. Ok, sorry I missed these the first time through:

explain analyze select incidentid from k_b where id = 107;
                              QUERY PLAN
------------------------------------------------------------------
 Index Scan using k_b_idx on k_b  (cost=0.00..1926.03 rows=675 width=14) (actual time=0.042..298.394 rows=2493 loops=1)
   Index Cond: (id = 107)
 Total runtime: 299.103 ms

select count(*) from k_b;
 count
--------
 698350

( sorry! I think I said this one only had tens of thousands in it )

explain analyze select incidentid from k_r where id = 94;
                              QUERY PLAN
------------------------------------------------------------------
 Index Scan using k_r_idx on k_r  (cost=0.00..2137.61 rows=757 width=14) (actual time=0.092..212.187 rows=10893 loops=1)
   Index Cond: (id = 94)
 Total runtime: 216.498 ms
(3 rows)

select count(*) from k_r;
 count
--------
 671670

That one is quite a bit slower, yet it's the same table structure and the same index as k_b, and it has fewer records. I did run VACUUM ANALYZE immediately before running these queries. It seems a lot better with join_collapse_limit set. Well, the planner is powerful enough to flatten nested selects. To make it less intelligent you can do: SET join_collapse_limit TO 1; which should tell postgres to not try and get tricky with your query. Again, *usually* the planner knows better than you do. So again just do it to see what you get. 
Ok, when join_collapse_limit = 1 I get this now: explain analyze select recordtext from eventactivity join ( select incidentid from k_r join k_b using (incidentid) where k_r.id = 94 and k_b.id = 107 ) a using (incidentid );

                              QUERY PLAN
---------------------------------------------------------------
 Nested Loop  (cost=0.00..156509.08 rows=2948 width=35) (actual time=1.555..340.625 rows=24825 loops=1)
   ->  Nested Loop  (cost=0.00..5361.89 rows=6 width=28) (actual time=1.234..142.078 rows=366 loops=1)
         ->  Index Scan using k_b_idx on k_b  (cost=0.00..1943.09 rows=681 width=14) (actual time=0.423..56.974 rows=2521 loops=1)
               Index Cond: (id = 107)
         ->  Index Scan using k_r_idx on k_r  (cost=0.00..5.01 rows=1 width=14) (actual time=0.031..0.031 rows=0 loops=2521)
               Index Cond: ((k_r.id = 94) AND ((k_r.incidentid)::text = (outer.incidentid)::text))
   ->  Index Scan using eventactivity1 on eventactivity  (cost=0.00..25079.55 rows=8932 width=49) (actual time=0.107..0.481 rows=68 loops=366)
         Index Cond: ((eventactivity.incidentid)::text = (outer.incidentid)::text)
 Total runtime: 347.975 ms

MUCH better! Maybe you can help me understand what I did and if I need to make something permanent to get this behavior from now on? If you have analyzed recently can you do: SELECT relname, reltuples FROM pg_class WHERE relname='eventactivity'; It is a cheaper form than SELECT count(*) FROM eventactivity to get an approximate estimate of the number of rows. But if it isn't too expensive, please also give the value from SELECT count(*) FROM eventactivity. Again, that helps us know if your tables are up-to-date. Sure:

select relname, reltuples from pg_class where relname='eventactivity';
    relname    |  reltuples
---------------+-------------
 eventactivity | 3.16882e+07

select count(*) from eventactivity;
  count
----------
 31871142

I don't know how selective your keys are, but one of these queries should probably structure it better for the planner. It depends a lot on how selective your query is. eventactivity currently has around 36 million rows in it. 
There should only be maybe 200-300 incidentids at most that will be matched with the combination of k_b and k_r. That's why I was thinking I could somehow get a list of just the incidentids that matched the id = 94 and id = 107 in k_b and k_r first. Then, I would only need to grab a few hundred out of 36 million rows from eventactivity. Well, you can also try: SELECT count(*) FROM k_b JOIN k_r USING (incidentid) WHERE k_b.id=?? AND k_r.id=?? ; That will tell you how many rows they have in common. select count(*)
Re: [PERFORM] lots of updates on small table
On Fri, Jul 15, 2005 at 09:42:12AM +1000, Alison Winters wrote: Our application requires a number of processes to select and update rows from a very small (10 rows) Postgres table on a regular and frequent basis. These processes often run for weeks at a time, but over the space of a few days we find that updates start getting painfully slow. We are running a full vacuum/analyze and reindex on the table every day, Full vacuum, eh? I wonder if what you really need is very frequent non-full vacuum. Say, once in 15 minutes (exact rate depending on dead tuple rate.) Is there a difference between vacuum and vacuum full? Yes. Vacuum full is more aggressive in compacting the table. Though it really works the same in the presence of long-running transactions: tuples just can't be removed. The most recent output was this: INFO: --Relation public.plc_fldio-- INFO: Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256. CPU 0.04s/0.14u sec elapsed 0.18 sec. INFO: Index plcpage_idx: Pages 315; Tuples 108137: Deleted 4176. CPU 0.03s/0.04u sec elapsed 0.14 sec. INFO: Rel plc_fldio: Pages: 1221 -- 1221; Tuple(s) moved: 0. CPU 0.03s/0.04u sec elapsed 0.36 sec. INFO: Analyzing public.plc_fldio Hmm, so it seems your hourly vacuum is enough. I think the bloat theory can be trashed. Unless I'm reading this output wrong; I don't remember the details of this vacuum output. We'll up it to every 15 minutes, but i don't know if that'll help because even with the current vacuuming the updates are still getting slower and slower over the course of several days. What really puzzles me is why restarting the processes fixes it. I wonder if the problem may be plan caching. I didn't pay full attention to the description of your problem, so I don't remember if it could be an issue, but it's something to consider. 
Does PostgreSQL keep some kind of backlog of transactions all for one database connection? No. There could be a problem if you had very long transactions, but apparently this isn't your problem. Isn't it normal to have processes that keep a single database connection open for days at a time? I guess it depends on exactly what you do with it. I know of at least one case where an app keeps a connection open for months, without a problem. (It's been running for four or five years, and monthly uptime for that particular daemon is not unheard of.) -- Alvaro Herrera (alvherre[a]alvh.no-ip.org) Everybody understands Mickey Mouse. Few understand Hermann Hesse. Hardly anybody understands Einstein. And nobody understands Emperor Norton.
Re: [PERFORM] lots of updates on small table
On Thu, Jul 14, 2005 at 08:28:24PM -0400, Alvaro Herrera wrote: On Fri, Jul 15, 2005 at 09:42:12AM +1000, Alison Winters: INFO: Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256. Hmm, so it seems your hourly vacuum is enough. I think the bloat theory can be trashed. Unless I'm reading this output wrong; I don't remember the details of this vacuum output. Ok, so I was _very_ wrong :-) Sorry. -- Alvaro Herrera (alvherre[a]alvh.no-ip.org) Essentially, you're proposing Kevlar shoes as a solution for the problem that you want to walk around carrying a loaded gun aimed at your foot. (Tom Lane)
Re: [PERFORM] lots of updates on small table
Hi all, No wonder, considering that your less-than-10-rows table contains something upwards of 100,000 tuples: INFO: --Relation public.plc_fldio-- INFO: Pages 1221: Changed 3, reaped 256, Empty 0, New 0; Tup 108137: Vac 4176, Keep/VTL 108133/108133, UnUsed 19, MinLen 84, MaxLen 84; Re-using: Free/Avail. Space 445176/371836; EndEmpty/Avail. Pages 0/256. CPU 0.04s/0.14u sec elapsed 0.18 sec. What you need to do is find out why VACUUM is unable to reclaim all those dead row versions. The reason is likely that some process is sitting on an open transaction for days at a time. Cheers mate, that was one of our theories, but we weren't sure if it'd be worth rebuilding everything to check. We've been compiling without the -t (autocommit) flag to ecpg, and I believe what's happening is that sometimes a transaction is begun, then the processes cycle around doing hardware i/o and never commit, or only commit way too late. What we're going to try now is to remove all the begins and commits from the code and compile with -t to make sure that any updates happen immediately. Hopefully that'll avoid any hanging transactions. We'll also set up a 10-minutely vacuum (not full) as per some other suggestions here. I'll let you know how it goes - we'll probably slot everything in on Monday so we have a week to follow it. Thanks everyone Alison ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] slow joining very large table to smaller ones
On Jul 14, 2005, at 7:15 PM, John A Meinel wrote: Is the distribution of your rows uneven? Meaning do you have more rows with a later id than an earlier one? There are definitely some id's that will have many times more than the others. If I group and count them, the top 10 are fairly dominant in the table. Hmm.. How to do it permanently? Well you could always issue SET join_collapse_limit TO 1; before your select, but obviously that isn't what you prefer. :) I think there are things you can do to make merge join more expensive than a nested loop, but I'm not sure what they are. Maybe someone else has some ideas to encourage this behavior for future work? Setting it on a per-connection basis is doable, but would add some burden to us in code. What I really don't understand is that the estimates dropped as well. The actual number of estimated rows drops to 3k instead of 1M. The real question is why does the planner think it will be so expensive?

select count(*) from k_b join k_r using (incidentid) where k_b.id=107 and k_r.id=94;
 count
-------
   373

Well, this says that they are indeed much more selective. Each one has 1k rows, but together you end up with only 400. Is this a bad thing? Is this not selective enough to make it much faster? Overall, I'm much happier now after seeing the new plan come about; if I can find a way to make that join_collapse behavior permanent, I can certainly live with these numbers. Thanks again for your continued efforts. -Dan ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
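One way to make a setting like this stick without per-connection code changes (my suggestion, not something proposed in the thread) is to attach it to the database or the role, so every new connection picks it up automatically:

```sql
-- Hypothetical database/role names; the setting is applied at
-- connection time for every new session.
ALTER DATABASE mydb SET join_collapse_limit = 1;
ALTER USER appuser SET join_collapse_limit = 1;
```

The session can still override it with a plain SET, so this only changes the default, not a hard limit.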
Re: [PERFORM] slow joining very large table to smaller ones
Dan Harris wrote: On Jul 14, 2005, at 7:15 PM, John A Meinel wrote: Is the distribution of your rows uneven? Meaning do you have more rows with a later id than an earlier one? There are definitely some id's that will have many times more than the others. If I group and count them, the top 10 are fairly dominant in the table. That usually skews the estimates. Since the estimate is more of an average (unless the statistics target is higher). Hmm.. How to do it permanently? Well you could always issue SET join_collapse_limit TO 1; before your select, but obviously that isn't what you prefer. :) I think there are things you can do to make merge join more expensive than a nested loop, but I'm not sure what they are. Maybe someone else has some ideas to encourage this behavior for future work? Setting it on a per-connection basis is doable, but would add some burden to us in code. My biggest question is why the planner thinks the Nested Loop would be so expensive. Have you tuned any of the parameters? It seems like something is out of whack. (cpu_tuple_cost, random_page_cost, etc...) What I really don't understand is that the estimates dropped as well. The actual number of estimated rows drops to 3k instead of 1M. The real question is why does the planner think it will be so expensive? select count(*) from k_b join k_r using (incidentid) where k_b.id=107 and k_r.id=94; count --- 373 Well, this says that they are indeed much more selective. Each one has 1k rows, but together you end up with only 400. Is this a bad thing? Is this not selective enough to make it much faster? Yes, being more selective is what makes it faster. But the planner doesn't seem to notice it properly. Overall, I'm much happier now after seeing the new plan come about; if I can find a way to make that join_collapse behavior permanent, I can certainly live with these numbers. I'm sure there are pieces to tune, but I've reached my limits of parameters to tweak :) Thanks again for your continued efforts. 
-Dan

John =:-
[PERFORM] Indexing Function called on VACUUM and sorting ?
The question appeared because of strange issues with functional indexes. It seems they are recalculated even where it is obviously not needed.

\d+ test:

 Column | Type    |
--------+---------+
 i      | integer |
 t      | text    |
 x      | text    |
Indexes:
    i_i btree (i)
    x_i btree (xpath_string(x, 'data'::text))
    x_ii btree (xpath_string(x, 'movie/characters/character'::text))
    x_iii btree (xpath_string(x, 'movie/rating'::text))

1) When I run VACUUM FULL ANALYZE VERBOSE or VACUUM ANALYZE, after text like

    INFO: analyzing public.test
    INFO: test: scanned 733 of 733 pages, containing 1 live rows and 0 dead rows; 3000 rows in sample, 1 estimated total rows

a lot of xpath_string calls occur. Does VACUUM rebuild indexes? Why recalculate all of that? It makes VACUUMing very slow. A simple VACUUM call does not lead to such function calls.

2) When I do

    select * from test order by xpath_string(x, 'movie/rating'::text) limit 1000 offset 10;

the planner uses index x_iii (as it should, OK here): Limit -> Index Scan. But many calls to xpath_string occur at execution time. Why? The index is already calculated and everything is immutable.

Please answer if you have any ideas. Functional indexes seemed so great at first, but now I keep uncovering weird issues I can't understand.

---(end of broadcast)---
TIP 6: explain analyze is your friend
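A likely explanation for (1): ANALYZE builds statistics not only for plain columns but also for expression indexes, and to do that it must evaluate the indexed expression on every row in its sample, hence the flood of xpath_string calls. A minimal sketch to observe this, using a hypothetical logging wrapper in place of xpath_string:

```sql
-- Stand-in function that logs each call, so the evaluations
-- performed during ANALYZE become visible as NOTICE lines.
CREATE FUNCTION noisy_upper(s text) RETURNS text AS $$
BEGIN
    RAISE NOTICE 'noisy_upper(%)', s;
    RETURN upper(s);
END;
$$ LANGUAGE plpgsql IMMUTABLE;

CREATE TABLE demo (s text);
INSERT INTO demo VALUES ('a'), ('b'), ('c');
CREATE INDEX demo_expr ON demo (noisy_upper(s));

-- Emits one NOTICE per sampled row: ANALYZE evaluates the
-- expression to gather statistics for the expression index.
ANALYZE demo;
```

This is expected behavior rather than an index rebuild: the statistics on the expression are what let the planner estimate selectivity for queries that use it.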
Re: [PERFORM] large table vs multiple small tables
Nicolas,

These sizes would not be considered large. I would leave them as single tables.

Ken

On Wed, Jul 13, 2005 at 12:08:54PM +0200, Nicolas Beaume wrote:

> Hello, I have a large database with 4 large tables (each containing at least 200,000 rows, perhaps even 1 or 2 million), and I ask myself if it's better to split them into small tables (e.g. tables of 2000 rows) to speed up access and updates (considering that I will have few updates but a lot of reads). Do you think it would be efficient?
>
> Nicolas, wondering if he hasn't been too greedy
>
> --
> "Be what you would seem to be" (Lewis Carroll)
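As a rule of thumb, row counts in the hundreds of thousands are routine for PostgreSQL; what matters more is on-disk size and indexing. A quick way to check actual size (these functions are built in from PostgreSQL 8.1 on, with equivalents in contrib/dbsize for earlier releases; mytable is a placeholder):

```sql
-- Table size alone, and table plus all of its indexes.
SELECT pg_size_pretty(pg_relation_size('mytable'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('mytable')) AS total_size;
```

Splitting one logical table into many 2000-row tables generally hurts rather than helps: every query then needs UNIONs or application-side dispatch, and the planner loses the ability to use a single index efficiently.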
[PERFORM] PostgreSQL vs. Firebird
Before I ask, I don't want to start a war. Can someone here give me an honest opinion of how PostgreSQL (PG) is better than Firebird on Windows? I've just recently started reading the Firebird newsgroup, and a poster over there has brought up some serious issues with Firebird, but they don't seem to take the issues seriously. I first wanted to go with Firebird for two reasons: it's very easy to configure and very easy to install. I assumed that the database worked OK, but I'm not so sure now. So I've decided to give PG a try; I've downloaded it, but haven't installed it yet. Any provable information you can provide as to why/how PG is better/faster/easier/more reliable than Firebird would be greatly appreciated. Thanks
[PERFORM] JFS fastest filesystem for PostgreSQL?
I just took delivery of a new system and used the opportunity to benchmark PostgreSQL 8.0 performance on various filesystems. The system in question runs Linux 2.6.12, has one CPU and 1 GB of system memory, and five 7200 RPM SATA disks attached to an Areca hardware RAID controller with 128 MB of cache. The caches are all write-back. I ran pgbench with a scale factor of 1000 and a total of 100,000 transactions per run, varying the number of clients between 10 and 100. It appears from my tests that JFS is much faster than both ext3 and XFS for this workload. JFS and XFS were made with the mkfs defaults; ext3 was made with -T largefile4 and -E stride=32. The deadline I/O scheduler was used for all runs (the anticipatory scheduler is much worse). Here are the results, in transactions per second:

                ext3   jfs   xfs
    10 clients    55    81    68
   100 clients    61   100    64

-jwb
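The exact pgbench invocation isn't shown in the post; a plausible reconstruction of the runs described (scale factor 1000, 100,000 transactions per run, 10 to 100 clients; the database name bench is a placeholder) would be:

```shell
# One-time initialization at scale factor 1000 (roughly 15 GB of data).
pgbench -i -s 1000 bench

# 10 clients x 10,000 transactions each = 100,000 total.
pgbench -c 10 -t 10000 bench

# 100 clients x 1,000 transactions each = 100,000 total.
pgbench -c 100 -t 1000 bench
```

At this scale factor the data set far exceeds the 1 GB of RAM, so the results are dominated by disk I/O, which is exactly what a filesystem comparison wants.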
[PERFORM] Functional index is 5 times slower than the basic one
VACUUM FULL ANALYZE is performed right before the tests. UPDATE test SET t = xpath_string(x, 'movie/rating'::text); is also performed, to make the two selects equivalent. xpath_string is IMMUTABLE.

Table "public.test"
 Column | Type             | Modifiers | Description
--------+------------------+-----------+-------------
 i      | integer          |           |
 t      | text             |           |
 x      | text             |           |
 d      | double precision |           |
Indexes:
    floatind btree (d)
    i_i btree (i) CLUSTER
    t_ind btree (t)
    t_x_ind btree (t, xpath_string(x, 'data'::text))
    x_i btree (xpath_string(x, 'data'::text))
    x_ii btree (xpath_string(x, 'movie/characters/character'::text))
    x_iii btree (xpath_string(x, 'movie/rating'::text))
Has OIDs: no

explain analyze select count(*) from (
  select * from test
   order by xpath_string(x, 'movie/rating'::text)
   limit 1000 offset 10
) a;

QUERY PLAN
Aggregate (cost=342.37..342.37 rows=1 width=0) (actual time=403.580..403.584 rows=1 loops=1)
  -> Subquery Scan a (cost=3.27..339.87 rows=1000 width=0) (actual time=4.252..398.261 rows=1000 loops=1)
     -> Limit (cost=3.27..329.87 rows=1000 width=969) (actual time=4.242..389.557 rows=1000 loops=1)
        -> Index Scan using x_iii on test (cost=0.00..3266.00 rows=1 width=969) (actual time=0.488..381.049 rows=1010 loops=1)
Total runtime: 403.695 ms

explain analyze select count(*) from (
  select * from test
   order by t
   limit 1000 offset 10
) a;

QUERY PLAN
Aggregate (cost=339.84..339.84 rows=1 width=0) (actual time=26.662..26.666 rows=1 loops=1)
  -> Subquery Scan a (cost=3.24..337.34 rows=1000 width=0) (actual time=0.228..22.416 rows=1000 loops=1)
     -> Limit (cost=3.24..327.34 rows=1000 width=969) (actual time=0.217..14.244 rows=1000 loops=1)
        -> Index Scan using t_ind on test (cost=0.00..3241.00 rows=1 width=969) (actual time=0.099..6.371 rows=1010 loops=1)
Total runtime: 26.749 ms
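The numbers above show why the post precomputes the expression into the plain column t. If that workaround is kept, a trigger can keep t in sync automatically instead of re-running a bulk UPDATE; a sketch assuming the same xpath_string function and test table as the post (the function and trigger names are illustrative):

```sql
-- Maintain t = xpath_string(x, 'movie/rating') on every write,
-- so queries can ORDER BY the plain (and cheaper) column t.
CREATE FUNCTION test_sync_t() RETURNS trigger AS $$
BEGIN
    NEW.t := xpath_string(NEW.x, 'movie/rating'::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER test_sync_t_trg
    BEFORE INSERT OR UPDATE ON test
    FOR EACH ROW EXECUTE PROCEDURE test_sync_t();
```

The trade-off is extra write cost and storage for t, in exchange for ORDER BY and comparison queries that never have to evaluate the expensive expression at execution time.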