Re: [PERFORM] General performance questions about postgres on Apple
On Fri, 20 Feb 2004, Sean Shanny wrote: max_connections = 100 # - Memory - shared_buffers = 16000 # min 16, at least max_connections*2, 8KB each sort_mem = 256000 # min 64, size in KB You might wanna drop sort_mem somewhat and just set it during your imports to something big like 512000 or larger. That way with 100 users during the day you won't have to worry about swap storms, and when you run your updates, you get all that sort_mem. Actual row count in the temp table: select count(*) from referral_temp ; 502347 Actual row count in d_referral table: select count(*) from d_referral ; 27908024 Note: that an analyze had not been performed on the referral_temp table prior to the explain analyze run. explain analyze SELECT t1.id, t2.md5, t2.url from referral_temp t2 LEFT OUTER JOIN d_referral t1 ON t2.md5 = t1.referral_md5 Nested Loop Left Join (cost=0.00..3046.00 rows=1001 width=68) (actual time=136.513..6440616.541 rows=502347 loops=1) - Seq Scan on referral_temp t2 (cost=0.00..20.00 rows=1000 width=64) (actual time=21.730..10552.421 rows=502347 loops=1) - Index Scan using d_referral_referral_md5_key on d_referral t1 (cost=0.00..3.01 rows=1 width=40) (actual time=12.768..14.022 rows=1 loops=502347) Index Cond: (outer.md5 = t1.referral_md5) Thanks. --sean Total runtime: 6441969.698 ms (5 rows) Here is an explain analyze after the analyze was done. Unfortunately I think a lot of the data was still in cache when I did this again :-( explain analyze SELECT t1.id, t2.md5, t2.url from referral_temp t2 LEFT OUTER JOIN d_referral t1 ON t2.md5 = t1.referral_md5; Nested Loop Left Join (cost=0.00..1468759.69 rows=480082 width=149) (actual time=69.576..3226854.850 rows=502347 loops=1) - Seq Scan on referral_temp t2 (cost=0.00..16034.81 rows=480081 width=145) (actual time=11.206..4003.521 rows=502347 loops=1) - Index Scan using d_referral_referral_md5_key on d_referral t1 (cost=0.00..3.01 rows=1 width=40) (actual time=6.396..6.402 rows=1 loops=502347) Index Cond: (outer.md5 = t1.referral_md5) Total runtime: 3227830.752 ms Hmmm. It looks like postgresql is still picking a nested loop when it should be sorting something faster. Try doing a set enable_nestloop = off and see what you get. If that makes it faster, you may want to adjust the costs of the cpu_* stuff higher to see if that can force it to do the right thing. Looking at the amount of time taken by the nested loop, it looks like the problem to me. And why are you doing a left join of ONE row from one table against the whole temp table? Do you really need to do that? since there's only one row in the source table, and I'd guess is only matches one or a few rows from the temp table, this means you're gonna have that one row and a bunch of null filled rows to go with it. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
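For illustration, the per-session override and the planner test suggested above might look like the following (the 512000 figure is just the example value from the post; adjust to taste):
    -- in the nightly import session only, so daytime connections keep a small sort_mem:
    SET sort_mem = 512000;          -- measured in KB, so roughly 500 MB per sort
    -- to see whether the nested loop is really the problem:
    SET enable_nestloop = off;
    EXPLAIN ANALYZE
    SELECT t1.id, t2.md5, t2.url
      FROM referral_temp t2
      LEFT OUTER JOIN d_referral t1 ON t2.md5 = t1.referral_md5;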
Re: [PERFORM] Disappointing performance in db migrated from MS SQL
On Fri, 13 Feb 2004 [EMAIL PROTECTED] wrote: Josh, the disks in the new system should be substantially faster than the old. Both are Ultra160 SCSI RAID 5 arrays, but the new system has 15k RPM disks, as opposed to the 10k RPM disks in the old system. Spindle speed does not correlate with 'throughput' in any easy way. What controllers are you using for these disks? This is doubly so with a good RAID card with battery backed cache. I'd bet that 10k rpm drives on a cached array card will beat an otherwise equal setup with 15k rpm disks and no cache. I know that losing the cache slows my system down to a crawl (i.e. set it to write-thru instead of write-back), comparatively speaking. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Tables on multiple disk drives
On Tue, 17 Feb 2004, Craig Thomas wrote: On Tue, 17 Feb 2004, Konstantin Tokar wrote: Hi! Does PostgreSQL allow to create tables and indices of a single database on multiple disk drives with the purpose of increasing performance as Oracle database does? If a symbolic reference is the only method then the next question is: how can it be determined what file is referred to what table and index? Your life will be simpler, and your setup will be faster without having to muck about with it, if you just buy a good RAID controller with battery backed cache. LSI/Megaraid and Adaptec both make serviceable controllers for reasonable prices, and as you add drives, the speed just goes up, no muddling around with sym links. This works to a limited extent. For very large databases, maximum throughput of I/O is the paramount factor for database performance. With raid controllers, your LUN is still limited to a small number of disks. PostgreSQL can only write on a file system, but Oracle, SAP DB, DB2, etc can write directly to disk (raw I/O). With large databases it is advantageous to spread a table across 100's of disks, if the table is quite large. I don't know of any manufacturer that creates a 100 disk raid array yet. You can run up to four LSI / Megaraids in one box, each with 3 UW SCSI interfaces, and they act as one unit. That's 3*4*15 = 180 disks max. With FC AL connections and four cards, it would be possible to approach 1000 drives. Of course, I'm not sure how fast any RAID card setup is gonna be with that many drives, but ya never know. My guess is that before you go there you buy a big external RAID box built for speed. We have a couple of 200+ drive external RAID5 storage boxes at work that are quite impressive. Some of the problem can be addressed by using a volume manager (such as LVM in Linux, or Veritas on Unix-like systems). This allows one to create a volume using partitions from many disks. One can then create a file system and mount it on the volume. Pretty much RAID arrays in software, which means no battery backed cache, which means it'll be fast at reading, but probably pretty slow at writes, especially if there's a lot of parallel access waiting to write to the database. However, to get the best performance, Raw I/O capability is the best way to go. Unsupported statement made as fact. I'm not saying it can't or isn't true, but my experience has been that large RAID5 arrays are a great compromise between maximum performance and reliability, giving a good measure of each. It doesn't take 100 drives to do well, even a dozen to two dozen will get you in the same basic range as splitting out files by hand with sym links without all the headache of chasing down the files, shutting down the database, linking it over etc... ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [ADMIN] [PERFORM] Quad processor options
On Wed, 12 May 2004, Grega Bremec wrote: ...and on Tue, May 11, 2004 at 03:02:24PM -0600, scott.marlowe used the keyboard: If you get the LSI megaraid, make sure you're running the latest megaraid 2 driver, not the older, slower 1.18 series. If you are running linux, look for the dkms packaged version. dkms (Dynamic Kernel Module System) automagically compiles and installs source rpms for drivers when you install them, and configures the machine to use them to boot up. Most drivers seem to be slowly headed that way in the linux universe, and I really like the simplicity and power of dkms. Hi, Given the fact that LSI MegaRAID seems to be a popular solution around here, and many of you folx use Linux as well, I thought sharing this piece of info might be of use. Running the v2 megaraid driver on a 2.4 kernel is actually not a good idea _at_ _all_, as it will silently corrupt your data in the event of a disk failure. Sorry to have to say so, but we tested it (on kernels up to 2.4.25, not sure about 2.4.26 yet) and it turns out it doesn't do hotswap the way it should. Somehow the replaced disk drives are not _really_ added to the array, which continues to work in degraded mode for a while and (even worse than that) then starts to think the replaced disk is in order without actually having resynced it, thus beginning to issue writes to non-existent areas of it. The 2.6 megaraid driver indeed seems to be a merged version of the above driver and the old one, giving both improved performance and correct functionality in the event of a hotswap taking place. This doesn't make any sense to me, since the hot swapping is handled by the card autonomously. I also tested it with a hot spare and pulled one drive and it worked fine during our acceptance testing. However, I've got a hot spare machine I can test on, so I'll try it again and see if I can make it fail. When testing it, was the problem present in certain RAID configurations or only one type or what? I'm curious to try and reproduce this problem, since I've never heard of it before. Also, what firmware version were those megaraid cards? Ours is fairly new, as we got it at the beginning of this year, and I'm wondering if it is a firmware issue. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
On Tue, 11 May 2004, Paul Tuckfield wrote: If you are having a write storm or bursty writes that's burying performance, a scsi raid controller with writeback cache will greatly improve the situation, but I do believe they run around $1-2k. If it's a write-specific problem, the cache matters more than the striping, except to say that write-specific perf problems should avoid raid5. Actually, a single channel MegaRAID 320-1 (single channel ultra 320) is only $421 at http://www.siliconmechanics.com/c248/u320-scsi.php It works pretty well for me, having 6 months of a production server on one with zero hiccups and very good performance. They have a dual channel intel card for only $503, but I'm not at all familiar with that card. The top of the line megaraid is the 320-4, which is only $1240, which ain't bad for a four channel RAID controller. Battery backed cache is an addon, but I think it's only about $80 or so. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
On Tue, 11 May 2004, Paul Tuckfield wrote: Love that froogle. It looks like a nice card. One thing I didn't get straight is if the cache is writethru or write back. If the original posters problem is truly a burst write problem (and not linux caching or virtual memory overcommitment) then writeback is key. the MegaRaid can be configured either way. it defaults to writeback if the battery backed cache is present, I believe. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Quad processor options
On Tue, 11 May 2004, Allan Wind wrote: On 2004-05-11T15:29:46-0600, scott.marlowe wrote: The other nice thing about the LSI cards is that you can install >1 and they act like one big RAID array. i.e. install two cards with a 20 drive RAID0 then make a RAID1 across them, and if one or the other card itself fails, you've still got 100% of your data sitting there. Nice to know you can survive the complete failure of one half of your chain. ... unless that dying controller corrupted your file system. Depending on your tolerance for risk, you may not want to operate for long with a file system in an unknown state. It would have to be the primary controller for that to happen. The way the LSI's work is that you disable the BIOS on the 2nd to 4th cards, and the first card, with the active BIOS, acts as the primary controller. In this case, that means the main card is doing the RAID1 work, then handing off the data to the subordinate cards. The subordinate cards do all their own RAID0 work. mobo ---controller 1--array1 of disks in RAID0 .|--controller 2--array2 of disks in RAID0 and whichever controller fails just kind of disappears. Note that if it is the master controller, then you'll have to shut down and enable the BIOS on one of the secondary (now primary) controllers. So while it's possible for the master card failing to corrupt the RAID1 set, it's still a more reliable system than with just one card. But nothing is 100% reliable, sadly. Btw, the Intel and LSI Logic RAID controller cards have suspiciously similar specifications, so I would not be surprised if one is an OEM of the other. Hmmm. I'll take a closer look. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Quad processor options
On Tue, 11 May 2004, Bjoern Metzdorf wrote: scott.marlowe wrote: Well, from what I've read elsewhere on the internet, it would seem the Opterons scale better to 4 CPUs than the basic Xeons do. Of course, the exception to this is SGI's altix, which uses their own chipset and runs the itanium with very good memory bandwidth. This is basically what I read too. But I cannot spent money on a quad opteron just for testing purposes :) Wouldn't it be nice to just have a lab full of these things? If your I/O is saturated, then the answer may well be a better RAID array, with many more drives plugged into it. Do you have any spare drives you can toss on the machine to see if that helps? Sometimes going from 4 drives in a RAID 1+0 to 6 or 8 or more can give a big boost in performance. Next drives I'll buy will certainly be 15k scsi drives. Better to buy more 10k drives than fewer 15k drives. Other than slightly faster select times, the 15ks aren't really any faster. In short, don't expect 4 CPUs to solve the problem if the problem isn't really the CPUs being maxed out. Also, what type of load are you running? Mostly read, mostly written, few connections handling lots of data, lots of connections each handling a little data, lots of transactions, etc... In peak times we can get up to 700-800 connections at the same time. There are quite some updates involved, without having exact numbers I'll think that we have about 70% selects and 30% updates/inserts. Wow, a lot of writes then. If you are doing lots of writing, make SURE you have a controller that supports battery backed cache and is configured to write-back, not write-through. Could you recommend a certain controller type? The only battery backed one that I found on the net is the newest model from icp-vortex.com. Sure, adaptec makes one, so does lsi megaraid. Dell resells both of these, the PERC3DI and the PERC3DC are adaptec, then lsi in that order, I believe. We run the lsi megaraid with 64 megs battery backed cache. Intel also makes one, but I've heard nothing about it. If you get the LSI megaraid, make sure you're running the latest megaraid 2 driver, not the older, slower 1.18 series. If you are running linux, look for the dkms packaged version. dkms, (Dynamic Kernel Module System) automagically compiles and installs source rpms for drivers when you install them, and configures the machine to use them to boot up. Most drivers seem to be slowly headed that way in the linux universe, and I really like the simplicity and power of dkms. I haven't directly tested anything but the adaptec and the lsi megaraid. Here at work we've had massive issues trying to get the adaptec cards configured and installed on, while the megaraid was a snap. Installed RH, installed the dkms rpm, installed the dkms enabled megaraid driver and rebooted. Literally, that's all it took. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Quad processor options
On Tue, 11 May 2004, Bjoern Metzdorf wrote: scott.marlowe wrote: Sure, adaptec makes one, so does lsi megaraid. Dell resells both of these, the PERC3DI and the PERC3DC are adaptec, then lsi in that order, I believe. We run the lsi megaraid with 64 megs battery backed cache. The LSI sounds good. Intel also makes one, but I've heard nothing about it. It could well be the ICP Vortex one, ICP was bought by Intel some time ago.. Also, there are bigger, faster external RAID boxes as well, that make the internal cards seem puny. They're nice because all you need in your main box is a good U320 controller to plug into the external RAID array. That URL I mentioned earlier that had prices has some of the external boxes listed. No price, not for sale on the web, get out the checkbook and write a blank check is my guess. I.e. they're not cheap. The other nice thing about the LSI cards is that you can install >1 and they act like one big RAID array. i.e. install two cards with a 20 drive RAID0 then make a RAID1 across them, and if one or the other card itself fails, you've still got 100% of your data sitting there. Nice to know you can survive the complete failure of one half of your chain. I haven't directly tested anything but the adaptec and the lsi megaraid. Here at work we've had massive issues trying to get the adaptec cards configured and installed, while the megaraid was a snap. Installed RH, installed the dkms rpm, installed the dkms enabled megaraid driver and rebooted. Literally, that's all it took. I didn't hear anything about dkms for debian, so I will be hand-patching as usual :) Yeah, it seems to be an RPM kinda thing. But, I'm thinking the 2.0 drivers got included in the latest 2.6 kernels, so no biggie. I was looking around in google, and it definitely appears the 2.x and 1.x megaraid drivers were merged into a unified driver in the 2.6 kernel. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Why queries takes too much time to execute?
On Mon, 10 May 2004, Anderson Boechat Lopes wrote: Hum... now I think I'm beginning to understand. The vacuum analyse is recommended to be performed at least every day, after adding or deleting a large number of records, and not vacuum full analyse. I've performed the vacuum full analyse every day and after some time I've noticed the database was corrupted. I couldn't select anything any more. Hold it right there, full stop. If you've got corruption you've either found a rare corner case in postgresql (unlikely, corruption is not usually a big problem for postgresql) OR you have bad hardware. Test your RAM, CPUs, and hard drives before going any further. Data corruption, 99% of the time, is not the fault of postgresql but the fault of the hardware. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
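As a sketch of the routine being recommended here, plain VACUUM ANALYZE is the everyday maintenance command, with VACUUM FULL reserved for exceptional cleanups:
    -- run nightly (e.g. from cron via psql), or after large batches of inserts/deletes:
    VACUUM ANALYZE;
    -- only after an unusually large purge, to hand disk space back to the OS:
    -- VACUUM FULL ANALYZE some_big_table;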
Re: [PERFORM] Help how to tune-up my Database
Sorry about that, I meant kbytes, not megs. My point being it's NOT measured in 8k blocks, like a lot of other settings. sorry for the mixup. On Fri, 7 May 2004, Bricklen wrote: scott.marlowe wrote: sort_mem might do with a small bump, especially if you're only handling a few connections at a time. Be careful, it's per sort, and measured in megs, so it's easy for folks to set it too high and make their machine start flushing too much kernel cache, which will slow down the other backends that have to go to disk for data. snip http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html (under Memory), it says that sort_mem is set in KB. Is this document wrong (or outdated)? ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
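In other words, since the setting is per sort and in kilobytes, a value like the following allows about 8 MB for each sort the session runs:
    SET sort_mem = 8192;   -- 8192 KB = 8 MB per sort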
Re: [PERFORM] [OFF-TOPIC] - Known maximum size of the PostgreSQL
On Sat, 5 Jun 2004, Carlos Eduardo Smanioto wrote: Hello all, What's the case of bigger database PostgreSQL (so greate and amount of registers) that they know??? http://www.postgresql.org/docs/faqs/FAQ.html#4.5 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] cache table
On Mon, 3 May 2004, Joseph Shraibman wrote: I have a big table with some int fields. I frequently need to do queries like: SELECT if2, count(*) FROM table WHERE if1 = 20 GROUP BY if2; The problem is that this is slow and frequently requires a seqscan. I'd like to cache the results in a second table and update the counts with triggers, but this would a) require another UPDATE for each INSERT/UPDATE which would slow down adding and updating of data and b) produce a large amount of dead rows for vacuum to clear out. It would also be nice if this small table could be locked into the pg cache somehow. It doesn't need to store the data on disk because the counts can be generated from scratch? I think you might be interested in materialized views. You could create this as a materialized view which should be very fast to just select * from. While materialized views aren't a standard part of PostgreSQL just yet, there is a working implementation available from Jonathan Gardner at: http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html It's all implemented with plpgsql and is quite interesting to read through. It has a nice tutorial methodology to it. ---(end of broadcast)--- TIP 8: explain analyze is your friend
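Short of installing the materialized view scripts, a hand-rolled version of the trigger-maintained counter table could look roughly like this (table and column names are made up for illustration, the UPDATE case is omitted for brevity, and it assumes the plpgsql language is installed in the database):
    CREATE TABLE if_counts (if1 int, if2 int, n bigint, PRIMARY KEY (if1, if2));
    INSERT INTO if_counts SELECT if1, if2, count(*) FROM bigtable GROUP BY if1, if2;

    CREATE FUNCTION if_counts_trig() RETURNS trigger AS '
    BEGIN
      IF TG_OP = ''INSERT'' THEN
        UPDATE if_counts SET n = n + 1 WHERE if1 = NEW.if1 AND if2 = NEW.if2;
        IF NOT FOUND THEN
          INSERT INTO if_counts VALUES (NEW.if1, NEW.if2, 1);
        END IF;
      ELSIF TG_OP = ''DELETE'' THEN
        UPDATE if_counts SET n = n - 1 WHERE if1 = OLD.if1 AND if2 = OLD.if2;
      END IF;
      RETURN NULL;
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER bigtable_count_trig AFTER INSERT OR DELETE ON bigtable
      FOR EACH ROW EXECUTE PROCEDURE if_counts_trig();

    -- the original query then becomes a cheap lookup:
    SELECT if2, n FROM if_counts WHERE if1 = 20;
As the poster anticipated, this does add an UPDATE per modification and generates dead rows in the small table, so it needs frequent vacuuming.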
Re: [PERFORM] index usage
On Mon, 26 Apr 2004, Stephan Szabo wrote: On Fri, 23 Apr 2004 [EMAIL PROTECTED] wrote: I have a query which I think should be using an index all of the time but postgres only uses the index part of the time. The index (ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed first followed by the selected column (support_person_id). Wouldn't the most efficient plan be to scan the index each time because the only columns needed are in the index? Below is the table, 2 queries showing the Not necessarily. The rows in the actual file still need to be checked to see if they're visible to the select, and if it's expected that the entire file (or a reasonable % of the pages anyway) will need to be loaded, using the index isn't necessarily a win. While those of us familiar with PostgreSQL are well aware that indexes can't be used directly to garner information, only as a lookup to a tuple in the table, it seems this misconception is quite common among those coming to PostgreSQL from other databases. Is there any information that directly reflects this issue in the docs? There are tons of hints that it works this way in how they're written, but nothing that just comes out and says that with pgsql's mvcc implementation, an index scan still has to hit the pages that contain the tuples, so often in pgsql a seq scan is a win where in other databases an index scan would have been a win? If not, where would I add it if I were going to write something up for the docs? Just wondering... ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] [ADMIN] 7.4.2 out of memory
On Wed, 28 Apr 2004, Jie Liang wrote: All, After I upgraded postgres from 7.3.4 to 7.4.2, one of my programs got the following error: ERROR: out of memory DETAIL: Fail on request of size 92. Any idea?? Does memory management differ much between 7.3.4 and 7.4.2??? This program uses a chunk of shared memory and a lot of temp tables. More than likely this is a hash aggregate problem (or can they spill to disk in 7.4.2 yet? I don't think they can, but maybe we should ask Tom). Try setting this before running the query and see what happens: set enable_hashagg = false; ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] OT: Help with performance problems
On Mon, 26 Apr 2004, Rob Fielding wrote: scott.marlowe wrote: On Fri, 23 Apr 2004, Chris Hoover wrote: DB's on Powervaults 220S using raid 5 (over 6 disks) What controller is this, the adaptec? We've found it to be slower than the LSI megaraid based controller, but YMMV. Wow, really? You got any more details of the chipset, mobo and kernel driver? We're running on a Dell 2650, the controller is the U320 LSI megaraid 2 channel (they only make the one that I know of right now). Don't know my mobo chipset offhand, but might be able to find out what one dell includes on the 2650. The kernel driver is the latest megaraid2 driver as of about Feb this year. I've been taken to my wits' end wrestling with an LSI MegaRAID 320-1 controller on a supermicro board all weekend. I just couldn't get anything more than 10MB/sec out of it with megaraid driver v1 OR v2 in Linux 2.4.26, nor the version in 2.6.6-rc2. After 2 days of humming the Adaptec mantra I gave in and switched the array straight onto the onboard Adaptec 160 controller (same cable and everything). Software RAID 5 gets me over 40MB/sec for a nominal cpu hit - more than 4 times what I could get out of the MegaRAID controller :( Even the 2nd SCSI-2 channel gets 40MB/sec max (pg_xlog :) And HOW LONG does it take to detect drives during POST... never mind ... I really just wanna rant :) There should be a free counseling service for enraged sysops. I wonder if your controller is broken or something? Or maybe it's on a PCI slot that has to share IRQs or something. I've had great luck with SuperMicro mobos in the past (we're talking dual PPro 200 mobos, so seriously, IN THE PAST here... ) Hell, my Dual PPro 200 with an old MegaRAID 428 got 18 Megs a second xfer rate no problem. Have you tried that lsi card in another machine / mobo combo? Can you disable the onboard adaptec? We have on our Dell 2650s, the only active controllers are the onboard IDE and the add in LSI-320-2 controller. We're running ours with 128 Meg cache (I think it could be 64) set to write back. I think our throughput on a RAID-1 pair was somewhere around 40+ megs a second reads with bonnie++. With RAID-5 it was not really much faster at reads (something like 60 megs a second) but was much more scalable under heavy parallel read/write access for PostgreSQL. Have you updated the BIOS on the mobo to see if that helps? I'm just throwing darts at the wall here. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Looking for ideas on how to speed up warehouse loading
On Thu, 22 Apr 2004, Sean Shanny wrote: I should have included this as well: fsync | on shared_buffers | 4000 sort_mem | 64000 For purposes of loading only, you can try turning off fsync, assuming this is a virgin load and you can just re-initdb should bad things happen (OS, postgresql crash, power plug pulled, etc...) Also increasing sort_mem and shared_buffers might help. Especially sort_mem. But turn it back down to something reasonable after the import. And turn fsync back on after the import too. Note you have to restart postgresql to make fsync = off take effect. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
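A sketch of what that load-time configuration might look like (the values are illustrative, not from the thread, and fsync changes need a postmaster restart):
    # postgresql.conf, during the bulk load only:
    fsync = false                 # acceptable only if you can re-initdb after a crash
    sort_mem = 262144             # KB; generous per-sort memory for big sorts and index builds
    shared_buffers = 8192         # 64 MB; modest bump, restart required
    # after the import: set fsync = true, put sort_mem back down, and restart again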
Re: [PERFORM] Help with performance problems
On Fri, 23 Apr 2004, Chris Hoover wrote: DB's on Powervaults 220S using raid 5 (over 6 disks) What controller is this, the adaptec? We've found it to be slower than the LSI megaraid based controller, but YMMV. Running RH ES 2.1 Are you running the latest kernel for ES 2.1? Early 2.4 kernels are pretty pokey and have some odd behaviour under load that later 2.4 kernels seemed to fix. Here is the postgresql.conf from the server with the 11GB db: max_connections = 64 shared_buffers = 32768 # 256MB=32768(buffs)*8192(bytes/buff) max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes max_fsm_pages = 10000 # min 1000, fsm is free space map, ~6 bytes IF you're doing lots of updates and such, you might want these higher. Have you vacuumed full the databases since taking over? sort_mem = 4096 # 256MB=4096(bytes/proc)*64(procs or conns) Sorry, that's wrong. sort_mem is measured in kbytes. i.e. 8192 means 8 megs sort_mem. Try setting it a bit higher (you've got LOTS of ram in these boxes) to something like 16 or 32 meg. checkpoint_segments = 16 # in logfile segments, min 1, 16MB each checkpoint_timeout = 30 # range 30-3600, in seconds effective_cache_size = 131072 # typically 8KB each This still looks low. On one machine you're showing kernel cache of about .7 gig, on the other it's 6 gig. 6 gigs of kernel cache would be a setting of 800000. It's more of a nudge factor than an exact science, so don't worry too much. If you've got fast I/O look at lowering random page cost to something between 1 and 2. We use 1.3 to 1.4 on most of our machines with fast drives under them. I'd use vmstat to see if you're I/O bound. Also, look for index bloat. Before 7.4 it was a serious problem. With 7.4 regular vacuuming should reclaim most lost space, but there are corner cases where you still might need to re-index. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
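To make the arithmetic above concrete, a sketch using this box's numbers (both values are only examples of the ranges suggested in the reply):
    # effective_cache_size is counted in 8 KB blocks:
    #   6 GB of kernel cache  ->  6 * 1024 * 1024 KB / 8 KB  =  786432  (~800000 blocks)
    effective_cache_size = 786432
    # sort_mem is in KB, so a 16 MB per-sort allowance is:
    sort_mem = 16384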
Re: [PERFORM] Help with performance problems
On Fri, 23 Apr 2004, Chris Hoover wrote: On Friday 23 April 2004 13:21, scott.marlowe wrote: On Fri, 23 Apr 2004, Chris Hoover wrote: DB's on Powervaults 220S using raid 5 (over 6 disks) What controller is this, the adaptec? We've found it to be slower than the LSI megaraid based controller, but YMMV. We are using the perc3/di controller. Believe it is using the megaraid driver. No, that's the adaptec, the PERC3/DC is the lsi megaraid. See if there are newer drivers for the RAID card. In terms of performance, the adaptec and lsi drivers have improved considerably in later versions. In terms of stability they've largely gotten better with a few in between releases on the megaraid getting poor grades. The latest / greatest from Dell is pretty up to date. Running RH ES 2.1 Are you running the latest kernel for ES 2.1? Early 2.4 kernels are pretty pokey and have some odd behaviour under load that later 2.4 kernels seemed to fix. I'm not sure we are at the latest and greatest for 2.1, but I am trying to get there. Management won't let me do the upgrade w/o first testing/proving it will not cause any more issues. Due to all of the current issues, and the criticality of these systems to our bottom line, they are being very careful with any change that may impact our users further. Understood. It's why my production box is still running a 2.4 kernel on rh 7.2 with pg 7.2. They just work, but for us stability AND performance are both good with our load. You can install a new kernel and set up the machine to still boot off of the old one, and test on the weekend to see how it behaves under simulated load. Mining the logs for slow queries is a good way to build one. While we don't upgrade our production server's applications to the latest and greatest all the time (i.e. php or postgresql or openldap) we always run the latest security patches, and I think the latest kernels had security fixes for ES 2.1, so NOT upgrading is dangerous. Late model linux kernels (the 2.0.x and 2.2.x where x > 20) tend to be VERY stable and very conservatively backported and upgraded, so running a new one isn't usually a big risk. Here is the postgresql.conf from the server with the 11GB db: max_connections = 64 shared_buffers = 32768 # 256MB=32768(buffs)*8192(bytes/buff) max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes max_fsm_pages = 10000 # min 1000, fsm is free space map, ~6 bytes IF you're doing lots of updates and such, you might want these higher. Have you vacuumed full the databases since taking over? sort_mem = 4096 # 256MB=4096(bytes/proc)*64(procs or conns) Sorry, that's wrong. sort_mem is measured in kbytes. i.e. 8192 means 8 megs sort_mem. Try setting it a bit higher (you've got LOTS of ram in these boxes) to something like 16 or 32 meg. checkpoint_segments = 16 # in logfile segments, min 1, 16MB each checkpoint_timeout = 30 # range 30-3600, in seconds effective_cache_size = 131072 # typically 8KB each This still looks low. On one machine you're showing kernel cache of about .7 gig, on the other it's 6 gig. 6 gigs of kernel cache would be a setting of 800000. It's more of a nudge factor than an exact science, so don't worry too much. I believe changing this requires a restart of the cluster (correct?). If so, I'll try bumping up the effective_cache_size over the weekend. Also, with all of the memory available to these machines, should I be running with larger shared_buffers? It seems like 256M is a bit small. No, you probably shouldn't. PostgreSQL doesn't cache in the classical sense. 
If all backends close, the stuff they had in their buffers disappears in a flash. So, it's generally considered better to let the kernel do the bulk of the caching, and have the buffer area be large enough to hold a large portion, if not all, of your working set of data. But between the cache management, which is dirt simple and works but seems to have performance issues with large numbers of buffers, and the fact that all the memory in it disappears when the last backend using it exits, there isn't much to gain from a very large shared_buffers setting. For instance, in doing the following seq scan select: explain analyze select * from test; where test is a ~10 megabyte table, the first time I ran it it took 5 seconds to run. The second time it took 2.5, the third 1.9, and it levelled out around there. Starting up another backend and running the same query got a 1.9 second response also. Shutting down both connections, and running the query again, with only the kernel for caching, I got 1.9. That's on a 2.4.2[2-4] kernel. If you've got fast I/O look at lowering random page cost to something between 1 and 2. We use 1.3 to 1.4 on most of our machines with fast drives under them. I'd use vmstat to see if you're I/O bound. If we end up being I/O bound, should
Re: [PERFORM] Setting Shared Buffers , Effective Cache, Sort Mem
On Thu, 22 Apr 2004, Pallav Kalva wrote: Hi, We are in the process of building a new machine for our production database. Below you will see some of the hardware specs for the machine. I need some help with setting these parameters (shared buffers, effective cache, sort mem) in the pg_conf file. Also can anyone explain the difference between shared buffers and effective cache, how these are allocated in the main memory (the docs are not clear on this). Here are the Hardware details: Operating System: Red Hat 9 Database Ver: Postgres 7.4 CPU'S : 4 RAM : 4 gig Datafile layout : RAID 1+0 Transaction log : on different RAID1 Array RAID Stripe Size: 8k Read this first: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html Basically shared buffers are the play area for the database backends to toss data in the air and munge it together. The effective cache size reflects the approximate amount of space your operating system is using to buffer Postgresql data. On a dedicated database machine this is about the same as the size of the kernel buffer shown in top. On a mixed machine, you'll have to see how much of what data is getting buffered to get a guesstimate of how much kernel cache is being used for pgsql and how much for other processes. Then divide that number in bytes by 8192, the default block size. On a machine with 1.2 gigs of kernel cache, that'd be about 150,000 blocks. Buffer sizes from 1000 to 10000 blocks are common. Buffer sizes from 10000 to 50000 blocks can sometimes increase performance, but those sizes only really make sense for machines with lots of ram, and very large datasets being operated on. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
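The worked example from the reply above, spelled out with the numbers quoted there:
    # kernel cache of ~1.2 GB divided by the 8192-byte block size:
    #   1.2 * 1024 * 1024 * 1024 / 8192 = 157286, i.e. roughly the 150,000 blocks mentioned
    effective_cache_size = 150000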
Re: [PERFORM] good pc but bad performance,why?
On Wed, 7 Apr 2004, Andrew McMillan wrote: On Wed, 2004-04-07 at 20:56, huang yaqin wrote: Hello, Richard Huxton, You said turning fsync off may cause losing data, that's terrible. I use SCSI disk, and file system is ext3. I tune postgresql.conf and can't get any improvement. So what can I do? Does SCSI disk and IDE disk have difference? Yes, turning off fsync means that the database is not guaranteeing consistency of writes to disk any longer. On the other hand your IDE system probably never was, because IDE drives just typically turn on write caching in hardware without telling anyone. SCSI typically doesn't turn on write caching in the physical drive by default, as Tom Lane pointed out earlier. Good SCSI has a battery backed up cache, and then it is OK to turn on write caching, because the controller has enough battery to complete all writes in the event of a power failure. Actually, almost all SCSI drives turn on write caching by default, they just don't lie about fsync, so you still have a one update per revolution limit, but other things can be happening while that write is being commited due to the multi-threaded nature of both the SCSI interface and the kernel drivers associated with them It would appear the linux kernel hackers are trying to implement the multi-threaded features of the latest ATA spec, so that, at some future date, you could have IDE drives that cache AND tell the truth of their sync AND can do more than one thing at a time. One thing I recommend is to use ext2 (or almost anything but ext3). There is no real need (or benefit) from having the database on a journalled filesystem - the journalling is only trying to give similar sorts of guarantees to what the fsync in PostgreSQL is doing. Is this true? I was under the impression that without at least meta-data journaling postgresql could still be corrupted by power failure. The suggestion someone else made regarding use of software raid is probably also a good one if you are trying to use the on-board RAID at the moment. Some onboard RAID controllers are fairly good (dell's 2600 series have an adaptec on board that can have battery backed cache that is ok, the lsi megaraid based one is faster under linux though.) But some of them are pretty poor performers. Finally, I would say that because you are seeing poor performance on one box and great performance on another, you should look at the hardware, or at the hardware drivers, for the problem - not so much at PostgreSQL. More than likely, the biggest issue is that the SCSI drives are performing proper fsync, while the IDE drives are lying. Definitely a time to look at a good caching RAID controller. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Postgresql educational sources
On Wed, 7 Apr 2004, Nid wrote: Rather than ask some general, unanswerable question on how to tune my database...I thought I'd ask where I might find an introduction to...or manual/tutorial for the depths of managing a postgres db. Books? Websites? Assume a basic to intermediate knowledge of DBs in general with a desire to learn about postgres from the ground up. If it makes a difference I'm using a postgres db in a Red Hat Linux OS environment. Thanks! The online (administration) docs are quite good, and for tuning, look at the excellent tuning document on varlena: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] atrocious update performance
On Mon, 5 Apr 2004, Kevin Barnard wrote: On 5 Apr 2004 at 12:05, Rosser Schwarz wrote: Just this morning, however, I created a copy of the target table (all 4.7M rows), with absolutely no foreign keys referring to it, and ran the update against the copy. That update took 2300 seconds. The join columns were indexed in both cases. Have you added indexes for the custid column for tables account.acct, account.orgacct and note? I haven't followed the entire thread, but if you have cascading FKs on those tables without an index on the column, that could cause your delay. Also make sure the fk/pk types match, or the index likely won't get used anyway. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
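As a hypothetical illustration of the two suggestions (the table and column names below are only guesses at the poster's schema, taken from the message):
    -- every row changed in the referenced table makes the FK machinery look up matching
    -- custid rows in each referencing table; without an index that lookup is a seq scan:
    CREATE INDEX acct_custid_idx    ON account.acct (custid);
    CREATE INDEX orgacct_custid_idx ON account.orgacct (custid);
    CREATE INDEX note_custid_idx    ON note (custid);
    -- and check that custid has the same type on both ends (e.g. both int4),
    -- otherwise the index may not be used at all on 7.4.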
Re: [PERFORM] postgres eating CPU on HP9000
On Fri, 26 Mar 2004, Fabio Esposito wrote: On Fri, 26 Mar 2004, scott.marlowe wrote: It maintains 48 hours of data, so it's not a large database; roughly 600mbs. We do this by running a housekeeping program in a cron job. It deletes all data older than 48 hours, then vacuum analyzes. It will also perform a reindex if the option is set before it vacuums. This almost sounds like a problem (fixed in 7.4 I believe) where some system catalog indexes would get huge over time, and couldn't be vacuumed or reindexed while the database was up in multi-user mode. I'll defer to Tom or Bruce or somebody to say if my guess is even close... We haven't tried 7.4, I will experiment with it next week, I hope it will be that simple. In the meantime, a simple dump/reload into a test box running your current version may provide some insight. If it fixes the problem, then you likely do have some kind of issue with index / table growth that isn't being addressed by vacuuming. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] odd planner choice
On Thu, 25 Mar 2004, Ara Anjargolian wrote: I've run into this odd planner choice which I don't quite understand. I have two tables, articles and users; articles.article_id and users.user_id are primary keys. Inside articles there are two optional fields author_id1, author_id2 which both reference users.user_id. And now the plans: (by the way this is pg 7.4 and I set enable_seqscan to off). jargol=# explain select user_id, first_names, last_name from articles, users where article_id = 5027 and (articles.author_id1 = users.user_id); QUERY PLAN -- Nested Loop (cost=0.00..4.04 rows=1 width=26) - Index Scan using articles_pk on articles (cost=0.00..2.01 rows=1 width=4) Index Cond: (article_id = 5027) - Index Scan using users_pk on users (cost=0.00..2.01 rows=1 width=26) Index Cond: (outer.author_id1 = users.user_id) (5 rows) jargol=# explain select user_id, first_names, last_name from articles, users where article_id = 5027 and (articles.author_id1 = users.user_id or articles.author_id2 = users.user_id); QUERY PLAN --- Nested Loop (cost=100000000.00..100000003.11 rows=2 width=26) Join Filter: ((outer.author_id1 = inner.user_id) OR (outer.author_id2 = inner.user_id)) - Index Scan using articles_pk on articles (cost=0.00..2.01 rows=1 width=8) Index Cond: (article_id = 5027) - Seq Scan on users (cost=100000000.00..100000001.04 rows=4 width=26) (5 rows) Why does it think it MUST do a seq-scan in the second case? users.user_id is a primary key, so shouldn't it behave exactly as in the first case? Any enlightenment on this problem will be much appreciated. Are articles.author_id1 and users.user_id the same type? Have you tried casting one to the other's type if they're different? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
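For instance, if the author columns turned out to be a different integer width than users.user_id, the cast suggestion might be tried like this (the int8/int4 mismatch is only an assumed example; whether it helps depends on why the planner fell back to the seq scan):
    EXPLAIN SELECT user_id, first_names, last_name
      FROM articles, users
     WHERE article_id = 5027
       AND (articles.author_id1::int4 = users.user_id
            OR articles.author_id2::int4 = users.user_id);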
Re: [PERFORM] postgres eating CPU on HP9000
On Tue, 23 Mar 2004, Fabio Esposito wrote: Hello fellow PostgreSQL users. We've been working on this interesting issue for some time now, and we're hoping that someone can help. We've recently integrated postgres into an existing mature app. It's a time-sensitive 24x7 system. It runs on HP9000, a K370 Dual Processor system. Postgres is version 7.3.2. It's spawned as a child from a parent supervisory process, and they communicate to each other via shared memory. We perform 9-12K selects per hour, 6-8K inserts per hour (a few updates here as well), 1-1.5K deletes per hour. It maintains 48 hours of data, so it's not a large database; roughly 600mbs. We do this by running a housekeeping program in a cron job. It deletes all data older than 48 hours, then vacuum analyzes. It will also perform a reindex if the option is set before it vacuums. Postgres initially worked wonderfully, fast and solid. It performed complex joins in 0.01 secs, and was able to keep up with our message queue. It stayed this way for almost a year during our development. Recently it started eating up the cpu, and cannot keep up with the system like it used to. The interesting thing here is that it still runs great on an older system with less ram, one slower cpu, and an older disk. We tried the following with no success: running VACUUM FULL, dropping all tables and starting anew, reinstalling postgres, tweaking kernel parameters (various combos), tweaking postgres parameters (various combos), and a number of other ideas. This almost sounds like a problem (fixed in 7.4 I believe) where some system catalog indexes would get huge over time, and couldn't be vacuumed or reindexed while the database was up in multi-user mode. I'll defer to Tom or Bruce or somebody to say if my guess is even close... ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] slow vacuum performance
On Wed, 24 Mar 2004, pginfo wrote: Hi, I am running pg 7.4.1 on a linux box. I have a middle-sized DB with many updates and after it I try to run vacuum full analyze. Is there a reason to not use just regular vacuum / analyze (i.e. NOT full)? It takes about 2 h. Full vacuums, by their nature, tend to be a bit slow. It's better to let the database achieve a kind of steady state with regards to number of dead tuples, and use regular vacuums to reclaim said space rather than a full vacuum. How can I improve the vacuum full analyze time? My configuration: shared_buffers = 15000 # min 16, at least max_connections*2, 8KB each sort_mem = 10000 # min 64, size in KB You might want to look at dropping sort_mem. It would appear you've been going through the postgresql.conf file and bumping up numbers to see what works and what doesn't. While most of the settings aren't too dangerous to crank up a little high, sort_mem is quite dangerous to crank up high, should you have a lot of people connected who are all sorting. Note that sort_mem is a limit PER SORT, not per backend, or per database, or per user, or even per table, but per sort. IF a query needs to run three or four sorts, it can use 3 or 4x sort_mem. If a hundred users do this at once, they can then use 300 or 400x sort_mem. You can see where I'm heading. Note that for individual sorts in batch files, like import processes, you can bump up sort_mem with the set command, so you don't have to have a large setting in postgresql.conf to use a lot of sort mem when you need to, you can just grab it during that one session. vacuum_mem = 32000 # min 1024, size in KB If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg range and see what happens. For a good tuning guide, go here: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
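A sketch of the per-session approach described above (the values are only examples in the ranges suggested):
    -- in the batch/import session only, rather than globally in postgresql.conf:
    SET sort_mem = 131072;      -- KB, ~128 MB for this session's sorts
    -- for the maintenance window, give vacuum more working memory:
    SET vacuum_mem = 262144;    -- KB, ~256 MB, in the 200-500 meg range mentioned
    VACUUM ANALYZE;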
Re: [PERFORM] slow vacuum performance
On Wed, 24 Mar 2004, pginfo wrote: Hi, scott.marlowe wrote: On Wed, 24 Mar 2004, pginfo wrote: Hi, I am running pg 7.4.1 on a linux box. I have a middle-sized DB with many updates and after it I try to run vacuum full analyze. Is there a reason to not use just regular vacuum / analyze (i.e. NOT full)? Yes, in case I make massive updates (only in my case of course), for example 2 M rows, I do not expect to have 2M new rows in the next 180 days. That is the reason for running vacuum full. My idea was to free unneeded space and so to have a faster system. It is possible that I am wrong. It's all about percentages. If you've got an average of 5% dead tuples with regular vacuuming, then full vacuums won't gain you much, if anything. If you've got 20 dead tuples for each live one, then a full vacuum is pretty much a necessity. The generally accepted best performance comes with 5 to 50% or so dead tuples. Keep in mind, having a few dead tuples is actually a good thing, as your database won't grow and then shrink the file all the time, but keep it in a steady state size-wise. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
On Tue, 23 Mar 2004, Josh Berkus wrote: Matt, Stalin, As for the compute intensive side (complex joins, sorts, etc), the Dell will most likely beat the Sun by some distance, although what the Sun lacks in CPU power it may make up a bit in memory bandwidth/ latency. Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do better than you think, comparatively. On all the Dell servers I've used so far, I've not seen performance that comes even close to the hardware specs. We use a 2600 at work (dual 2.8GHz) with the LSI/Megaraid based battery backed caching controller, and it flies. Truly flies. It's not Dell that's so slow, it's the default adaptec RAID controller or IDE drives that are slow. Ours has a 533 MHz memory bus, by the way. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] [ADMIN] syslog slowing the database?
On Thu, 11 Mar 2004, Christopher Kings-Lynne wrote: You could also consider not using syslog at all: let the postmaster output to its stderr, and pipe that into a log-rotation program. I believe some people use Apache's log rotator for this with good results. Not an option I'm afraid. PostgreSQL just jams and stops logging after the first rotation... I've read in the docs that syslog logging is the only production solution... Can you use the apache log rotator? It's known to work in my environment (redhat 7.2, postgresql 7.2 and 7.4) with this command to start it in my rc.local file: su - postgres -c 'pg_ctl start | rotatelogs $PGDATA/pglog 86400 2>&1' ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] [ADMIN] syslog slowing the database?
On Mon, 15 Mar 2004, Bruce Momjian wrote: scott.marlowe wrote: On Thu, 11 Mar 2004, Christopher Kings-Lynne wrote: You could also consider not using syslog at all: let the postmaster output to its stderr, and pipe that into a log-rotation program. I believe some people use Apache's log rotator for this with good results. Not an option I'm afraid. PostgreSQL just jams and stops logging after the first rotation... I've read in the docs that syslog logging is the only production solution... Can you use the apache log rotator? It's known to work in my environment (redhat 7.2, postgresql 7.2 and 7.4) with this command to start it in my rc.local file: su - postgres -c 'pg_ctl start | rotatelogs $PGDATA/pglog 86400 2>&1' Sure, our documentation specifically mentions using rotatelogs. hehe. What I meant was can Christopher use it, or does he have a limitation in his environment where he can't get ahold of the apache log rotator... :-) ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] optimizing large query with IN (...)
On Wed, 10 Mar 2004, Marcus Andree S. Magalhaes wrote: Guys, I got a Java program to tune. It connects to a 7.4.1 postgresql server running Linux using JDBC. The program needs to update a counter on a somewhat large number of rows, about 1200 on a ~130k rows table. The query is something like the following: UPDATE table SET table.par = table.par + 1 WHERE table.key IN ('value1', 'value2', ... , 'value1200' ) This query runs on a transaction (by issuing a call to setAutoCommit(false)) and a commit() right after the query is sent to the backend. The process of committing and updating the values is painfully slow (no surprises here). Any ideas? The problem, as I understand it, is that 7.4 introduced massive improvements in handling moderately large in() clauses, as long as they can fit in sort_mem, and are provided by a subselect. So, creating a temp table with all the values in it and using in() on the temp table may be a win: begin; create temp table t_ids(id int); insert into t_ids(id) values (123); - repeat a few hundred times select * from maintable where id in (select id from t_ids); ... ---(end of broadcast)--- TIP 8: explain analyze is your friend
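Applied to the poster's UPDATE, the temp-table pattern might look something like this (mytable stands in for the real table name, and the keys could also be loaded with COPY or a batched insert from JDBC):
    BEGIN;
    CREATE TEMP TABLE t_keys (key text);
    INSERT INTO t_keys VALUES ('value1');
    INSERT INTO t_keys VALUES ('value2');
    -- ... one row per key, about 1200 in this case ...
    UPDATE mytable SET par = par + 1 WHERE key IN (SELECT key FROM t_keys);
    COMMIT;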
Re: [PERFORM] Scaling further up
On Thu, 4 Mar 2004, Paul Thomas wrote: On 03/03/2004 18:23 scott.marlowe wrote: [snip] There are three factors that affect how fast you can get to the next sector: seek time, settle time, and rotational latency. Most drives only list the first, and don't bother to mention the other two. Ah yes, one of my (very) few still functioning brain cells was nagging about another bit of time in the equation :) On many modern drives, the seek times are around 5 to 10 milliseconds. [snip] Going back to the OP's posting about random_page_cost, imagine I have 2 servers identical in every way except the disk drive. Server A has a 10K rpm drive and server B has a 15K rpm drive. Seek/settle times aren't spectacularly different between the 2 drives. I'm wondering if drive B might actually merit a _higher_ random_page_cost than drive A as, once it gets settled on a disk track, it can suck the data off a lot faster. Opinions/experiences anyone? It might well be that you have higher settle times that offset the small gain in rotational latency. I haven't looked into it, so I don't know one way or the other, but it seems a reasonable assumption. However, a common misconception is that the higher angular velocity of the 15krpm drives would allow you to read data faster. In fact, the limit of how fast you can read is set by the head. There's a maximum frequency that it can read, and the areal density / rpm have to be such that you don't exceed that frequency. Often, the speed at which you read off the platters is exactly the same between a 10k and 15k of the same family. The required lower areal density is the reason 15krpm drives show up in the lower capacities first. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Scaling further up
On Tue, 2 Mar 2004, Anjan Dave wrote: By lots I mean dozen(s) in a raid 10 array with a good controller. I believe, for RAID-10, I will need an even number of drives. Correct. Currently, the size of the database is about 13GB, and is not expected to grow exponentially with thousands of concurrent users, so total space is not of paramount importance compared to performance. Does this sound like a reasonable setup? 10x36GB FC drives on RAID-10, 4x36GB FC drives for the logs on RAID-10 (not sure if this is the correct ratio), 1 hotspare, Total=15 drives per enclosure. Putting the logs on RAID-10 is likely to be slower than, or no faster than, putting them on RAID-1, since the RAID-10 will have to write to 4 drives, while the RAID-1 will only have to write to two drives. Now, if you were reading the logs a lot, it might help to have the RAID-10. Tentatively, I am looking at an entry-level EMC CX300 product with 2GB RAID cache, etc. Pick up a spare, I'll get you my home address, etc... :-) Seriously, that's huge. At that point you may well find that putting EVERYTHING on a big old RAID-5 performs best, since you've got lots of caching / write buffering going on. Question - Are 73GB drives supposed to give better performance because of a higher number of platters? Generally, larger hard drives perform better than smaller hard drives because they a: have more heads and / or b: have a higher areal density. It's a common misconception that faster RPM drives are a lot faster, when, in fact, their only speed advantage is slightly faster seeks. The areal density of faster spinning hard drives tends to be somewhat less than the slower spinning drives, since the maximum frequency the heads can work at on both drives, assuming the same technology, is the same. I.e. the speed at which you can read data off of the platter doesn't usually go up with a higher RPM drive, only the speed with which you can get to the first sector. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Scaling further up
On Tue, 2 Mar 2004, Anjan Dave wrote: That was part of my original question - whether it makes sense to go for a mid-range SunFire machine (64bit HW, 64bit OS), which is scalable to high amounts of memory, and shouldn't have any issues addressing it all. I've had that kind of setup once temporarily on a V480 (quad UltraSparc, 16GB RAM) machine, and it did well in production use. Without having the time/resources to do extensive testing, I am not sure if Postgres/Solaris9 is really suggested by the community for high-performance, as opposed to a XEON/Linux setup. Storage being a separate discussion. Some folks on the list have experience with Postgresql on Solaris, and they generally say they use Solaris not for performance reasons, but for reliability reasons. I.e. the bigger Sun hardware is fault tolerant. For speed, the X86 32 and 64 bit architectures seem to be noticeably faster than Sparc. However, running Linux or BSD on Sparc makes them pretty fast too, but you lose the fault tolerant support for things like hot swappable CPUs or memory. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] FreeBSD config
On Thu, 26 Feb 2004, Dror Matalon wrote: Hi, We have postgres running on freebsd 4.9 with 2 Gigs of memory. As per repeated advice on the mailing lists we configured effective_cache_size = 25520 which you get by doing `sysctl -n vfs.hibufspace` / 8192 Which results in using 200Megs for disk caching. Is there a reason not to increase the hibufspace beyond the 200 megs and provide a bigger cache to postgres? I looked both on the postgres and freebsd mailing lists and couldn't find a good answer to this. Actually, I think you're confusing effective_cache_size with shared_buffers. effective_cache_size changes no cache settings for postgresql, it simply acts as a hint to the planner about how much of the dataset your OS / Kernel / Disk cache can hold. Making it bigger only tells the query planner it's more likely the data it's looking for will be in cache. shared_buffers, OTOH, sets the amount of cache that postgresql uses. It's generally considered that 256 Megs or 1/4 of memory, whichever is LESS, is a good setting for production database servers. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
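To make the distinction concrete, here is a rough sketch (the numbers are just the figures from this thread, not recommendations; effective_cache_size is measured in 8k pages):

SET effective_cache_size = 25520;   -- planner hint only: assumes ~200 megs of OS disk cache
SHOW shared_buffers;                -- this one actually allocates memory; it is set in
                                    -- postgresql.conf and only changes after a restart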
Re: [PERFORM] A cache for the results of queries ?
On Thu, 26 Feb 2004, David Pradier wrote: Hi everybody, I'd like to know if there exists a system for caching the results of queries. I believe there are some external libs that provide this at the application level. PHP's adodb is purported to do so. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] FreeBSD config
On Thu, 26 Feb 2004, Dror Matalon wrote: On Thu, Feb 26, 2004 at 11:55:31AM -0700, scott.marlowe wrote: On Thu, 26 Feb 2004, Dror Matalon wrote: Hi, We have postgres running on freebsd 4.9 with 2 Gigs of memory. As per repeated advice on the mailing lists we configured effective_cache_size = 25520 which you get by doing `sysctl -n vfs.hibufspace` / 8192 Which results in using 200Megs for disk caching. Is there a reason not to increase the hibufspace beyond the 200 megs and provide a bigger cache to postgres? I looked both on the postgres and freebsd mailing lists and couldn't find a good answer to this. Actually, I think you're confusing effective_cache_size with shared_buffers. No, I'm not. OK, sorry, I wasn't sure which you meant. effective_cache_size changes no cache settings for postgresql, it simply acts as a hint to the planner about how much of the dataset your OS / Kernel / Disk cache can hold. I understand that. The question is why have the OS, in this case FreeBSD, use only 200 Megs for disk cache and not more. Why not double the vfs.hibufspace to 418119680 and double the effective_cache_size to 51040? Doesn't the kernel just use the spare memory to buffer anyway? I'd say if you've got 2 gigs of memory and nothing else on the box, give a big chunk (1 gig or so) to the kernel to manage. Unless large kernel caches cause some issues in FreeBSD. Making it bigger only tells the query planner it's more likely the data it's looking for will be in cache. shared_buffers, OTOH, sets the amount of cache that postgresql uses. It's generally considered that 256 Megs or 1/4 of memory, whichever is LESS, is a good setting for production database servers. Actually last I looked, I thought that the recommended max shared buffers was 10,000 (80MB), even on machines with large amounts of memory. It really depends on what you're doing. For loads involving very large data sets, up to 256 Megs has resulted in improvements, but anything after that has only had advantages in very limited types of applications. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] General performance questions about postgres on Apple
On Sun, 22 Feb 2004, Sean Shanny wrote: Tom, We have the following setting for random page cost: random_page_cost = 1 # units are one sequential page fetch cost Any suggestions on what to bump it up to? We are waiting to hear back from Apple on the speed issues, so far we are not impressed with the hardware in helping in the IO department. Our DB is about 263GB with indexes now so there is no way it is going to fit into memory. :-( I have taken the step of breaking out the data into month based groups just to keep the table sizes down. Our current months table has around 72 million rows in it as of today. The joys of building a data warehouse and trying to make it as fast as possible. You may be able to achieve similar benefits with a clustered index. see cluster:
\h cluster
Command:     CLUSTER
Description: cluster a table according to an index
Syntax:
CLUSTER indexname ON tablename
CLUSTER tablename
CLUSTER
I've found this can greatly increase speed, but on 263 gigs of data, I'd run it when you had a couple days free. You might wanna test it on a smaller test set you can afford to chew up some I/O and CPU time on over a weekend. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
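For reference, a run against one of those monthly tables would look something like this (the table and index names are invented for the example; CLUSTER rewrites the whole table, so plan for the disk space and downtime):

CLUSTER hits_2004_02_stamp_idx ON hits_2004_02;
ANALYZE hits_2004_02;   -- re-analyze afterwards so the planner sees the reordered table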
Re: [PERFORM] Slow response of PostgreSQL
Easy two step procedure for speeding this up: 1: Upgrade to 7.4.1 2: Read this: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Tables on multiple disk drives
On Tue, 17 Feb 2004, Konstantin Tokar wrote: Hi! Does PostgreSQL allow you to create tables and indices of a single database on multiple disk drives for the purpose of increasing performance, as Oracle does? If a symbolic reference is the only method, then the next question is: how can it be determined which file corresponds to which table and index? Your life will be simpler, and your setup will be faster without having to muck about with it, if you just buy a good RAID controller with battery backed cache. LSI/Megaraid and Adaptec both make serviceable controllers for reasonable prices, and as you add drives, the speed just goes up, no muddling around with sym links. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Tables on multiple disk drives
On Tue, 17 Feb 2004, Craig Thomas wrote: On Tue, 17 Feb 2004, Craig Thomas wrote: On Tue, 17 Feb 2004, Konstantin Tokar wrote: Hi! Does PostgreSQL allow you to create tables and indices of a single database on multiple disk drives for the purpose of increasing performance, as Oracle does? If a symbolic reference is the only method, then the next question is: how can it be determined which file corresponds to which table and index? Your life will be simpler, and your setup will be faster without having to muck about with it, if you just buy a good RAID controller with battery backed cache. LSI/Megaraid and Adaptec both make serviceable controllers for reasonable prices, and as you add drives, the speed just goes up, no muddling around with sym links. This works to a limited extent. For very large databases, maximum throughput of I/O is the paramount factor for database performance. With raid controllers, your LUN is still limited to a small number of disks. PostgreSQL can only write on a file system, but Oracle, SAP DB, DB2, etc can write directly to disk (raw I/O). With large databases it is advantageous to spread a table across 100's of disks, if the table is quite large. I don't know of any manufacturer that creates a 100 disk raid array yet. You can run up to four LSI / Megaraids in one box, each with 3 UW SCSI interfaces, and they act as one unit. That's 3*4*15 = 180 disks max. With FC AL connections and four cards, it would be possible to approach 1000 drives. Of course, I'm not sure how fast any RAID card setup is gonna be with that many drives, but ya never know. My guess is that before you go there you buy a big external RAID box built for speed. We have a couple of 200+ drive external RAID5 storage boxes at work that are quite impressive. That's a good point. But it seems that the databases leading the TPC numbers are the Oracles of the world. I know that a former company I worked for published TPC numbers using Oracle with Raw I/O to get the performance up. But keep in mind that in the TPC benchmarks, doing things that require lots of dba work don't tend to make the cost in the test go up (you can hide a lot of admin work in those things) while in real life, they do drive up the real cost of maintenance. I'd imagine that with Postgresql coming along nicely, it may well be that in a year or two, in the real world, you can just take the money you'd have spent on Oracle licenses and Oracle DBAs and just throw more drives at a problem to solve it. And still spend less money than you would on Oracle. :-) ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Slow response of PostgreSQL
On Wed, 18 Feb 2004, Christopher Kings-Lynne wrote: 1- How can I lock a single record so that other users can only read it? You cannot do that in PostgreSQL. How about SELECT ... FOR UPDATE? No, because users cannot read the locked row in that case. I just tested it (within transactions) and it appeared that I could still view the rows selected for update. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
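For anyone wanting to reproduce that test, it looks roughly like this (the table and column names are made up for the example):

BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;   -- takes a row-level lock
-- in another session, a plain SELECT on that row still returns it;
-- only another FOR UPDATE, UPDATE, or DELETE on the same row blocks until COMMIT
COMMIT;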
Re: [PERFORM] slow database
If my boss came to me and asked me to make my database server run as slowly as possible, I might come up with the exact same postgresql.conf file as what you posted. Just installing the default postgresql.conf that came with postgresql should make this machine run faster. Read this: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Compile Vs RPMs
On Tue, 3 Feb 2004, Christopher Browne wrote: [EMAIL PROTECTED] (Anjan Dave) writes: I would like to know whether there are any significant performance advantages of compiling (say, 7.4) on your platform (being RH7.3, 8, and 9.0, and Fedora especially) versus getting the relevant binaries (rpm) from the postgresql site? Hardware is Intel XEON (various speeds, up to 2.8GHz, single/dual/quad configuration). Some Linux distribution makers make grand claims of such advantages, but it is not evident that this is much better than superstition. You are certainly NOT going to see GCC generating MMX code automagically that would lead to PostgreSQL becoming 8 times faster. Indeed, in database work, it is quite likely that you will find things to be largely I/O bound, with CPU usage being a very much secondary factor. I did some relative benchmarking between compiling PostgreSQL on GCC versus IBM's PPC compilers a while back; did not see differences that could be _clearly_ discerned as separate from observational noise. You should expect to find that adding RAM, or adding a better disk controller, would provide discernible differences in performance. It is much less clear that custom compiling will have any substantial effect on I/O-bound processing. I would add that the primary reason for compiling versus using RPMs is to take advantage of some compile time option having to do with block size, or using a patch to try and test a system that has found a new corner case where postgresql is having issues performing well, like the vacuum page delay patch for fixing the issue with disk bandwidth saturation. If you've got a machine grinding to its knees under certain loads, and have a test box to test it on, and the test box shows better performance, it might be better to patch the live server in the off hours if it will keep the thing up and running during the day. In that way, performance differences are very real, but only because you are doing something you can't do with factory rpms. Of course, building custom rpms isn't that hard to do, so if you had a lot of boxes that needed a patched flavor of postgresql, you could still run from rpms and have the custom patch. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Overlaping indexes
On Mon, 2 Feb 2004, Tomasz Myrta wrote: Dnia 2004-02-02 15:46, użytkownik Rigmor Ukuhe napisał: Hi, I have many indexes somehow overlapping like: ... btree (STATUS, VISIBLE, NP_ID); ... btree (STATUS, VISIBLE); Is the performance gained by the more exact index worth the overhead of managing the extra indexes? The second (2 columns) index is useless - its function is well done by the first one (3 columns). Not entirely, since it only has to sort two columns, it will be smaller, and will therefore be somewhat faster. On the other hand, I've seen a lot of folks create multi column indexes who didn't really understand how they work in Postgresql. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Overlaping indexes
On Mon, 2 Feb 2004, Tomasz Myrta wrote: Dnia 2004-02-02 19:30, użytkownik scott.marlowe napisał: Not entirely, since it only has to sort two columns, it will be smaller, and will therefore be somewhat faster. Can you say something more about it? Will it be enough faster to keep them both? Did anyone make such tests? That really depends on the distribution of the third column. If there's only a couple of values in the third column, no big deal. If each entry is unique, and it's a large table, very big deal. It is only useful to have a three column index if you actually use it. If you have an index on (a,b,c) and select order by b, the index won't get used unless the a part is in the where clause. The other issue is updates. IT WILL cost more to update two indexes rather than one. Generally, you can drop / re-add the index and use explain analyze on one of your own queries to see if that helps. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
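A quick sketch of the leading-column rule, using the columns from the indexes quoted above (the table name is invented):

CREATE INDEX some_table_svn_idx ON some_table (status, visible, np_id);
EXPLAIN SELECT * FROM some_table WHERE status = 1;    -- can use the index: leading column constrained
EXPLAIN SELECT * FROM some_table WHERE np_id = 42;    -- generally cannot: status is not in the where clause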
inserting large number of rows was: Re: [PERFORM] Increasing number of PG connections.
On Mon, 2 Feb 2004, Qing Zhao wrote: I am new here. I have a question related to this in some way. Our web site needs to upload a large volume of data into Postgres at a time. The performance deteriorates as the number of rows becomes larger. When it reaches 2500 rows, it never comes back to the GUI. Since the tests were run through the GUI, my suspicion is that it might be caused by the way the application server is talking to the Postgres server, the connections, etc. What might be the factors involved here? Does anyone know? Actually, I'm gonna go out on a limb here and assume two things: 1. you've got lotsa fk/pk relationships set up. 2. you're analyzing the table empty before loading it up. What happens in this instance is that the analyze on an empty, or nearly so, table means that during the inserts, postgresql thinks you have only a few rows. At first, this is fine, as pgsql will seq scan the tables to make sure there is a proper key in both. As the number of rows increases, the planner needs to switch to index scans but doesn't, because it doesn't know that the number of rows is increasing. Fix: insert a few hundred rows, run analyze, check to see if the explain for inserts is showing index scans or not. If not, load a few more hundred rows, analyze, rinse, repeat. Also, look for fk/pk mismatches. I.e. an int4 field pointing to an int8 field. That's a performance killer, so if the pk/fk types don't match, see if you can change your field types to match and try again. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
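A sketch of that load-a-little / analyze / check loop (all names here are hypothetical):

-- after loading the first few hundred rows of the batch:
ANALYZE orders;
-- then see whether the fk lookup the inserts trigger would now use an index scan:
EXPLAIN SELECT 1 FROM customers WHERE customer_id = 12345;
-- and check that fk/pk column types match (e.g. both int4) before loading the rest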
Re: [PERFORM] Increasing number of PG connections.
On Mon, 2 Feb 2004, Kevin Barnard wrote: On 2 Feb 2004 at 13:58, scott.marlowe wrote: what do you mean at 2 GB? Is that how much is in kernel cache plus buffer, plus used, plus etc??? Could you give us the top of top output to make sure? If most of that is kernel cache, then that's fine. 2GB was total system memory. We upgraded to 4GB prior to increasing the number of connections. Oh, ok. I thought you meant the system was using 2 gigs of RAM for postgresql. Here's the top of top:
16:14:17 up 2 days, 16:15, 1 user, load average: 7.60, 6.56, 4.61
730 processes: 721 sleeping, 9 running, 0 zombie, 0 stopped
CPU states:        user   nice   system   irq   softirq   iowait   idle
        total      0.0%   0.0%   0.0%     0.0%  0.0%      0.0%     0.0%
        cpu00      0.0%   0.0%   0.0%     0.0%  0.0%      0.0%     0.0%
        cpu01      0.0%   0.0%   0.0%     0.0%  0.0%      0.0%     0.0%
        cpu02      0.0%   0.0%   0.0%     0.0%  0.0%      0.0%     0.0%
        cpu03      0.0%   0.0%   0.0%     0.0%  0.0%      0.0%     0.0%
Mem:  3747644k av, 3298344k used, 449300k free, 0k shrd, 147880k buff
      2158532k active, 760040k inactive
Swap: 1048088k av, 0k used, 1048088k free, 2262156k cached
When you have a high load but low CPU usage, you are usually I/O bound. The DB is pretty close to max connections at this point in time. I don't know why CPU shows 0% in every bucket. It looks like I can increase the number of connections a little from here. This is a fairly standard Fedora install. It's using version 2.4.22 of the Kernel. Postgres is a compiled version of 7.4.1. On this machine you could probably handle even more. What I want is to get your page return times down enough so you don't need to increase the number of connections. I.e. if you've got 2 second response times and you drop those to 0.2 seconds, then you won't need as many processes to handle the load (theoretically... :-) Experience has been that individual postgresql backends only weigh in at a megabyte at most, and they share buffers, so 700 connections can be anywhere from 300meg to 1 gig. The rest would be buffer memory. It's not a good idea to give up too much to shared buffers, as the database isn't as good at caching as the kernel. OK, I take this as I should keep shared buffers around 2x connections then, correct? Not really. What happens is that if the shared buffers are so large that they are as large as, or god forbid, larger than the kernel cache, then the kernel cache becomes less effective. The general rule of thumb is 25% of memory, or 256 Megs, whichever is less. The real test is that you want enough shared_buffers so that all the result sets currently being smooshed up against each other in joins, sorts, etc... can fit in postgresql's shared buffers, or at least so the buffers can hold a fair chunk of it. So, the number of buffers can be anywhere from a few thousand up to several tens of thousands, sometimes even higher. But for most tuning you won't be needing to be above 32768, which is 256 Megs of ram. What do you have in postgresql.conf? sort_mem, shared_buffers, etc??? Here is what I have that is not set from the defaults.
max_connections = 700
shared_buffers = 1500
sort_mem = 512
random_page_cost = 2
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
sort_mem can be a real killer if it lets the processes chew up too much memory. Once sort_mem gets high enough to make the machine start swapping it is doing more harm than good being that high, and should usually be lowered a fair bit. I dropped it down to 512 as you can see. Should I be running with all of the stats on? I am no longer using pg_autovacuum. I seem to be getting better results with an hourly vacuum analyse. 
Seeing as how top shows 2262156k kernel cache, you can afford to give up a fair bit more than 512k per sort. I generally run 8192 (8 meg) but I don't handle 700 simos. Try running it a little higher, 2048, 4096, etc... and see if that helps. Note you can change sort_mem and just do a pg_ctl reload to make the change, without interrupting service, unlike shared_buffers, which requires a restart. How many disks in your RAID5? The more the better. Is it hardware with battery backed cache? If you write much to it, it will help to have battery backed cache on board. If it's a megaraid / LSI board, get the megaraid2 driver, it's supposedly much faster. 4 disk IBM ServeRAID 5i with battery backed cache. Do you have the cache set to write back or write through? Write through can be a performance killer. But I don't think your RAID is the problem, it looks to me like postgresql is doing a lot of I/O. When you run top, do the postgresql processes show a lot of D status? That's usually waiting on I/O.
Re: [PERFORM] 100 simultaneous connections, critical limit?
On Wed, 14 Jan 2004, Jón Ragnarsson wrote: I am writing a website that will probably have some traffic. Right now I wrap every .php page in pg_connect() and pg_close(). Then I read somewhere that Postgres only supports 100 simultaneous connections (default). Is that a limitation? Should I use some other method when writing code for a high-traffic website? A few tips from an old PHP/Apache/Postgresql developer.
1: Avoid pg_pconnect unless you are certain you have load tested the system and it will behave properly. pg_pconnect often creates as many issues as it solves.
2: While php has pretty mediocre run time performance, its startup / shutdown / cleanup is quite fast, and it caches previously executed pages. Thus, if your pages are relatively small, code-wise, then the amount of time it will take to execute them, versus the amount of time the user will spend reading the output, will be quite small. So, you can likely handle many hundreds of users before hitting any limit on the database end.
3: Apache can only run so many children too. The default for the 1.3 branch is 150. If you decrease that to 50 or so, you are quite unlikely to ever run out of connections to the database.
4: Postgresql can handle thousands of connections if the server and postgresql itself are properly configured, so don't worry so much about that. You can always increase the max should you need to later.
5: Database connection time in a php script is generally a non-issue. pg_connect on a fast machine, hitting a local pgsql database, generally runs in about 1/10,000th of a second. Persistent connects get this down to about 1/1,000,000th of a second. Either way, a typical script takes milliseconds to run, i.e. 1/100th of a second or longer, so the actual difference between a pg_pconnect and a pg_connect just isn't worth worrying about in 99% of all circumstances.
6: Profile your user's actions and the time it takes the server versus how long it takes them to make the next click. Even the fastest user is usually much slower than your server, so it takes a whole bunch of them to start bogging the system down.
7: Profile your machine under parallel load. Note that machine simos (i.e. the kind you get from the ab utility) generally represent about 10 to 20 real people. I.e. if your machine runs well with 20 machine simos, you can bet on it handling 100 or more real people with ease.
---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] 100 simultaneous connections, critical limit?
On Wed, 14 Jan 2004, Adam Alkins wrote: scott.marlowe wrote: A few tips from an old PHP/Apache/Postgresql developer. 1: Avoid pg_pconnect unless you are certain you have load tested the system and it will behave properly. pg_pconnect often creates as many issues as it solves. I share the above view. I've had little success with persistent connections. The cost of pg_connect is minimal, pg_pconnect is not a viable solution IMHO. Connections are rarely actually reused. I've found that for best performance with pg_pconnect, you need to restrict the apache server to a small number of backends, say 40 or 50, extend keep alive to 60 or so seconds, and use the same exact connection string all over the place. Also, set max.persistent.connections or whatever it is in php.ini to 1 or 2. Note that max.persistent.connections is PER BACKEND, not total, in php.ini, so 1 or 2 should be enough for most types of apps. 3 tops. Then, set up postgresql for 200 connections, so you'll never run out. Tis better to waste a little shared memory and be safe than it is to get the dreaded out of connections error from postgresql. If you do all of the above, pg_pconnect can work pretty well, on things like dedicated app servers where only one thing is being done and it's being done a lot. On general purpose servers with 60 databases and 120 applications, it adds little, although extending the keep alive timeout helps. But if you just start using pg_pconnect without reconfiguring and then testing, it's quite likely your site will topple over under load with out of connection errors. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] COUNT Pagination
On Wed, 14 Jan 2004, Rajesh Kumar Mallah wrote: scott.marlowe wrote: On Tue, 13 Jan 2004, David Shadovitz wrote: We avert the subsequent execution of count(*) by passing the value of count(*) as a query parameter through the link in page numbers. Mallah, and others who mentioned caching the record count: Yes, I will certainly do this. I can detect whether the query's filter has been changed, or whether the user is merely paging through the results or sorting* the results. I'd love to completely eliminate the cost of the COUNT(*) query, but I guess that I cannot have everything. * My HTML table column headers are hyperlinks which re-execute the query, sorting the results by the selected column. The first click does an ASC sort; a second click does a DESC sort. another useful trick is to have your script save out the count(*) result in a single row table with a timestamp, and every time you grab it, check to see if x number of minutes have passed, and if so, update that row with a count(*). Greetings! The count(*) can get evaluated with any arbitrary combination in the where clause; how do you plan to store that information? In a typical application pagination could be required in n number of contexts. I would be interested to know more about this trick and its applicability in such situations. Offtopic: Does PostgreSQL optimise repeated execution of similar queries, i.e. queries on the same table or set of tables (in a join) with the same where clause and only differing in LIMIT and OFFSET? Yes, and no. Yes, a previously run query should be faster, if it fits in kernel cache. No, Postgresql doesn't cache any previous results or plans (unless you use prepare / execute, then it only caches the plan, not the query results). Plus, the design of Postgresql is such that it would have to do a LOT of cache checking to see if there were any updates to the underlying data between selects. Since such queries are unlikely to be repeated inside a transaction, the only place where you wouldn't have to check for new tuples, it's not really worth trying to implement. Keep in mind most databases can use an index for max() because each aggregate is programmed by hand to do one thing. In Postgresql, you can create your own aggregate, and since there's no simple way to make aggregates use indexes in the general sense, it's not likely to get optimized. I.e. any optimization for JUST max()/min() is unlikely unless it can be used for the other aggregates. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
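To see the plan-only caching mentioned above, prepare / execute looks like this (table, column, and values are invented for the example):

PREPARE page_q (text) AS
    SELECT id, name FROM items WHERE name > $1 ORDER BY name LIMIT 25;
EXECUTE page_q ('apple');   -- planned once, at PREPARE time
EXECUTE page_q ('mango');   -- reuses the plan, but the result rows are not cached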
Re: [PERFORM] COUNT Pagination
On Tue, 13 Jan 2004, David Shadovitz wrote: We avert the subsequent execution of count(*) by passing the value of count(*) as a query parameter through the link in page numbers. Mallah, and others who mentioned caching the record count: Yes, I will certainly do this. I can detect whether the query's filter has been changed, or whether the user is merely paging through the results or sorting* the results. I'd love to completely eliminate the cost of the COUNT(*) query, but I guess that I cannot have everything. * My HTML table column headers are hyperlinks which re-execute the query, sorting the results by the selected column. The first click does an ASC sort; a second click does a DESC sort. another useful trick is to have your script save out the count(*) result in a single row table with a timestamp, and every time you grab it, check to see if x number of minutes have passed, and if so, update that row with a count(*). You can even have a cron job do it so your own scripts don't incur the cost of the count(*) and delay output to the user. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
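For what it's worth, the single-row-with-a-timestamp trick can be as simple as this (all names are invented):

CREATE TABLE cached_counts (
    tablename text PRIMARY KEY,
    rowcount  bigint NOT NULL,
    updated   timestamptz NOT NULL DEFAULT now()
);
-- run by cron (or by the script whenever the timestamp is too old):
UPDATE cached_counts
   SET rowcount = (SELECT count(*) FROM big_table), updated = now()
 WHERE tablename = 'big_table';
-- the paging script then reads this instead of re-running count(*):
SELECT rowcount, updated FROM cached_counts WHERE tablename = 'big_table';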
Re: [PERFORM] Index creation
On Wed, 7 Jan 2004, Eric Jain wrote: Any tips for speeding up index creation? I need to bulk load a large table with 100M rows and several indexes, some of which span two columns. By dropping all indexes prior to issuing the 'copy from' command, the operation completes 10x as fast (1.5h vs 15h). Unfortunately, recreating a single index takes nearly as long as loading all of the data into the table; this more or less eliminates the time gained by dropping the index in the first place. Also, there doesn't seem to be a simple way to disable/recreate all indexes for a specific table short of explicitly dropping and later recreating each index? Note that you can issue the following command to see all the index definitions for a table: select * from pg_indexes where tablename='sometable'; And store those elsewhere to be reused when you need to recreate the index. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
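In practice that looks something like the following (index, table, and column names are hypothetical):

-- capture the definitions before dropping anything:
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'big_table';
DROP INDEX big_table_col_a_idx;
-- ... COPY the 100M rows in ...
-- then re-run the saved definitions, e.g.:
CREATE INDEX big_table_col_a_idx ON big_table (col_a);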
Re: [PERFORM] general peformance question
On Thu, 18 Dec 2003, Conny Thimren wrote: Hi, This is a kind of newbie-question. I've been using Postgres for a long time in a low transaction environment - and it is great. Now I've got an inquiry for using Postgresql in a heavy-load on-line system. This system must handle something like 20 questions per sec with a response time at 1/10 sec. Each question will result in approx 5-6 reads and a couple of updates. Anybody have a feeling if this is realistic on an Intel-based Linux server with Postgresql? Of course I know that this is too little info for an exact answer but - as I said - maybe someone can give a hint if it's possible. Maybe someone with heavy-load can give an example of what is possible... That really depends on how heavy each query is, so it's hard to say from what little you've given us. If you are doing simple banking style transactions, then you can easily handle this load; if you are talking a simple shopping cart, ditto; if, however, you are talking about queries that join 4 or 5 tables with millions of rows against each other, you're gonna have to test it yourself. With the autovacuum daemon running, I ran a test overnight of pgbench (more for general purpose burn in than anything else) pgbench -i -s 100 pgbench -c 50 -t 200000 that's 10 million transactions, and it took just over twelve hours to complete at 220+ transactions per second. So, for financials, you're likely to find it easy to meet your target. But as the tables get bigger / more complex / more interconnected you'll see a drop in performance. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] [ADMIN] ODBC Driver generates a too big windows swap file and
On Fri, 12 Dec 2003, Rhaoni Chiu Pereira wrote: Hi, is there a switch in your pgsql/odbc connector to enable cursors? If so, try turning that on. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] tuning questions
On Thu, 4 Dec 2003, Jack Coates wrote: Another problem is that performance of the 6 million row job is decent if I stop the job and run a vacuumdb --analyze before letting it continue; is this something that 7.4 will help with? vacuumdb --analyze doesn't seem to have much effect on the 18 million row job. Just to add to what the others have said here, you probably want to run the pg_autovacuum daemon in the background. It comes with 7.4 but will work fine with 7.3. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] tuning questions
On Thu, 4 Dec 2003, Jack Coates wrote: On Thu, 2003-12-04 at 12:27, Richard Huxton wrote: On Thursday 04 December 2003 19:50, Jack Coates wrote: I'm trying to set Postgres's shared memory usage in a fashion that allows it to return requested results quickly. Unfortunately, none of these changes allow PG to use more than a little under 300M RAM. vacuumdb --analyze is now taking an inordinate amount of time as well (40 minutes and counting), so that change needs to be rolled back. You don't want PG to use all your RAM, it's designed to let the underlying OS do a lot of caching for it. Probably worth having a look at vmstat/iostat and see if it's saturating on I/O. latest changes:
shared_buffers = 35642
max_fsm_relations = 1000
max_fsm_pages = 1
wal_buffers = 64
sort_mem = 32768
vacuum_mem = 32768
effective_cache_size = 1
/proc/sys/kernel/shmmax = 5
IO is active, but hardly saturated. CPU load is hefty though, load average is at 4 now. Postgresql is busily managing a far too large shared buffer. Let the kernel do that. Postgresql's shared buffers should be big enough to hold as much of the current working set as they can, up to about 25% or so of the server's memory, or 512Meg, whichever comes first. Unless a single query will actually use all of the buffer at once, you're not likely to see an improvement. Also, your effective cache size is really small. On a typical Postgresql server with 2 gigs of ram, you'll have about 1 to 1.5 gigs as kernel cache and buffer, and if it's dedicated to postgresql, then the effective cache setting for 1 gig would be 131072 (assuming 8k pages). If you're updating a lot of tuples without vacuums, you'll likely want to up your fsm settings. Note you can change things like sort_mem, effective_cache_size and random_page_cost on the fly (but not buffers, which are allocated at startup, nor the fsm settings, which are as well). So, if you're gonna have one huge honkin query that needs to sort a hundred megs at a time, but you'd rather not up your sort memory that high (sort mem is PER SORT, not per backend or per database, so it can get out of hand quickly) then you can just set sort_mem=128000; before throwing out the big queries that need all that sort space. ---(end of broadcast)--- TIP 8: explain analyze is your friend
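Spelled out, that last suggestion looks like this (128000 is just the example figure from above, not a recommendation):

SET sort_mem = 128000;   -- ~128 megs per sort, for this session only
-- run the big sort-heavy query here, then drop back to the postgresql.conf value:
RESET sort_mem;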
Re: [PERFORM] Value of Quad vs. Dual Processor machine
As another post pointed out, you need to set cflags to get optimization under Solaris on that flavor of Postgresql. Also, Postgresql tends to get its best performance from the free unixes, Linux and BSD. Those are available for Sun Sparcs, but postgresql in 64 bit mode on those boxes is still a bit cutting edge. It might be worth a try to set up the Sun to dual boot to either BSD or Linux and test Postgresql under that environment to see how it works and compare it to Sun after you've set the cflags and recompiled. On Tue, 11 Nov 2003, Chris Field wrote: we are looking at Xeon. We are currently running it on a quad Sun V880 compiled to be 64bit and have been getting dreadful performance. I don't think we really have much to gain from going 64bit. - Original Message - From: Ron Johnson [EMAIL PROTECTED] To: PgSQL Performance ML [EMAIL PROTECTED] Sent: Tuesday, November 11, 2003 8:24 PM Subject: Re: [PERFORM] Value of Quad vs. Dual Processor machine On Tue, 2003-11-11 at 17:32, Chris Field wrote: We are getting ready to spec out a new machine and are wondering about the wisdom of buying a quad versus a dual processor machine. Seeing as how postgres is not a threaded application, and this server will only be used for log/transaction analysis (it will only ever have a few large queries running). Is there any performance to be gained, and if so is it worth the large cost? Any thoughts/experience are much appreciated... Xeon or Opteron? The faster Opterons *really* blaze, especially in 64-bit mode. As others have said, though, RAM and I/O are most important. -- - Ron Johnson, Jr. [EMAIL PROTECTED] Jefferson, LA USA As I like to joke, I may have invented it, but Microsoft made it popular David Bradley, regarding Ctrl-Alt-Del ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] *very* slow query to summarize data for a month ...
On 11 Nov 2003, Greg Stark wrote: Marc G. Fournier [EMAIL PROTECTED] writes: On Tue, 11 Nov 2003, Greg Stark wrote: Actually you might be able to get the same effect using function indexes like: create index i on traffic_log (month_trunc(runtime), company_id) had actually thought of that one ... is it something that is only available in v7.4? Hum, I thought you could do simple functional indexes like that in 7.3, but perhaps only single-column indexes. In any case, given your situation I would seriously consider putting a month integer column on your table anyways. Then your index would be a simple (month, company_id) index. In 7.3 and before, you had to use only column names as inputs, so you could cheat:
alter table test add alp int;
alter table test add omg int;
update test set alp=0;
update test set omg=13;
and then create a functional index:
create index test_xy on test (substr(info,alp,omg));
select * from test where substr(info,alp,omg)='abcd';
---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Value of Quad vs. Dual Processor machine
On Tue, 11 Nov 2003, Rod Taylor wrote: On Tue, 2003-11-11 at 18:32, Chris Field wrote: We are getting ready to spec out a new machine and are wondering about the wisdom of buying a quad versus a dual processor machine. Seeing as how postgres is not a threaded application, and this server will only be used for log/transaction analysis (it will only ever have a few large queries running). Is there any performance to be gained, and if so is it worth the large cost? Any thoughts/experience are much appreciated... Since you're asking the question, I'll assume you don't have CPU intensive queries or monstrous loads. I'd probably invest in a Quad system with 2 chips in it (2 empty sockets) and put the difference in funds into a few extra GB of Ram or improved IO. In 6 months or a year, if you start doing longer or more complex queries, toss in the other 2 chips. So long as you don't hit a memory limit, it'll be fine. Note that you want to carefully look at the difference in cost of the motherboard versus the CPUs. It's often the motherboard that raises the cost, not the CPUs so much. Although with Xeons, the CPUs are not cheap. The second issue is that Intel (and AMD probably) only guarantee proper performance from chips in the same batch, so you may wind up replacing the two working CPUs with two new ones to go with the other two you'll be buying, to make sure that they work together. My guess is that more CPUs aren't gonna help this problem a lot, so look more at fast RAM and lots of it, as well as a fast I/O subsystem. 2 CPUs should be plenty. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] vacuum locking
On Wed, 29 Oct 2003, Rob Nagler wrote: Greg Stark writes: SELECT a, (SELECT name FROM t2 WHERE t2.f2 = t1.f2) FROM t1 GROUP BY f2 This doesn't solve the problem. It's the GROUP BY that is doing the wrong thing. It's grouping, then aggregating. But at least in the form above it will consider using an index on f2, and it will consider using indexes on t1 and t2 to do the join. There are 20 rows in t2, so an index actually slows down the join. I had to drop the index on t1.f2, because it was trying to use it instead of simply sorting 20 rows. t2 was 'vacuum full'ed and analyzed, right? Just guessing. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] vacuum locking
On Thu, 30 Oct 2003, Rob Nagler wrote: scott.marlowe writes: t2 was 'vacuum full'ed and analyzed, right? Just guessing. Fresh import. I've been told this includes an ANALYZE. You should probably run analyze by hand just to be sure. If the planner is using an index scan on a table with 20 rows, then it's likely it has the default statistics for the table, not real ones. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL 7.4beta5 vs MySQL 4.0.16 with
On Thu, 30 Oct 2003 [EMAIL PROTECTED] wrote: So it's not just PostgreSQL that is suffering from the bad SQL but MySQL also. But the question is why does PostgreSQL suffer so badly? I think not all developers write very nice SQLs. It's really sad to see that a fine piece of work (RT) is performing sub-optimally because of malformed SQLs. [ specially on database of my choice ;-) ] Post EXPLAIN ANALYZES of the queries you're running, then maybe you'll be able to get some useful help from this list. Until then, it's very hard to speculate as to why PostgreSQL is slower. -sc Here it is: in case they are illegible please let me know and I will attach them as .txt files. Slower One: explain analyze SELECT DISTINCT main.* FROM Groups main , Principals Principals_1, ACL ACL_2 WHERE ((ACL_2.RightName = 'OwnTicket')OR(ACL_2.RightName = 'SuperUser')) AND ( ( ACL_2.PrincipalId = Principals_1.id AND ACL_2.PrincipalType = 'Group' AND ( main.Domain = 'SystemInternal' OR main.Domain = 'UserDefined' OR main.Domain = 'ACLEquivalence') AND main.id = Principals_1.id) OR ( ( (main.Domain = 'RT::Queue-Role' AND main.Instance = 25) OR ( main.Domain = 'RT::Ticket-Role' AND main.Instance = 6973) ) AND main.Type = ACL_2.PrincipalType AND main.id = Principals_1.id) ) AND (ACL_2.ObjectType = 'RT::System' OR (ACL_2.ObjectType = 'RT::Queue' AND ACL_2.ObjectId = 25) ) ORDER BY main.Name ASC ; Note here: Merge Join (cost=1788.68..4735.71 rows=1 width=85) (actual time=597.540..1340.526 rows=20153 loops=1) Merge Cond: (outer.id = inner.id) This estimate is WAY off. Are both of those fields indexed and analyzed? Have you tried upping the statistics target on those two fields? I assume they are compatible types. You might try 'set enable_mergejoin = false' and see what you get. If that makes it faster, you may want to adjust the costs of the cpu_* stuff higher to see if that can force it to do the right thing. Just a guess. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
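A sketch of those two suggestions (the statistics target of 100 is just an example value; the column names come from the join condition above):

ALTER TABLE groups ALTER COLUMN id SET STATISTICS 100;
ALTER TABLE principals ALTER COLUMN id SET STATISTICS 100;
ANALYZE groups;
ANALYZE principals;
SET enable_mergejoin = false;   -- then re-run the explain analyze and compare the plans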
Re: [PERFORM] My own performance/tuning qa
On Fri, 24 Oct 2003, Vivek Khera wrote: AL == Allen Landsidel [EMAIL PROTECTED] writes: AL Well I had the vacuums running every 15 minutes for a while.. via a AL simple cron script I wrote just to make sure no more than one vacuum AL ran at once, and to 'nice' the job.. but performance on the db does nice-ing the client does nothing for the backend server that does the actual work. You need to track down the PID of the backend server running the vacuum and renice *it* to get any effect. Note that Tom has mentioned problems with possible deadlocks when nicing individual backends before, so proceed with caution here. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Tuning for mid-size server
On Tue, 21 Oct 2003, Josh Berkus wrote: Anjan, Pretty soon, a PowerEdge 6650 with 4 x 2Ghz XEONs, and 8GB Memory, with internal drives on RAID5 will be delivered. Postgres will be from RH8.0. How many drives? RAID5 sucks for heavy read-write databases, unless you have 5+ drives. Or a large battery-backed cache. You don't need a large cache, so much as a cache. The size isn't usually an issue now that 64 to 256 meg caches are the nominal cache sizes. Back when it was a choice of 4 or 8 megs it made a much bigger difference than 64 versus 256 meg makes today. Also, if it's a read only environment, RAID5 with n drives equals the performance of RAID0 with n-1 drives. Also, last I checked, you can't address 8GB of RAM without a 64-bit processor. Since when are the Xeons 64-bit? Josh, you gotta get out more. IA32 has supported more than 4 gigs of ram for a long time now, and so has the linux kernel. It uses a paging method to do it. Individual processes are still limited to ~3 gig on Linux on 32 bit hardware though, so the extra mem will almost certainly spend its time as kernel cache. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Tuning for mid-size server
On Tue, 21 Oct 2003, Josh Berkus wrote: Scott, Also, if it's a read only environment, RAID5 with n drives equals the performance of RAID0 with n-1 drives. True. Josh, you gotta get out more. IA32 has supported more than 4 gigs of ram for a long time now, and so has the linux kernel. It uses a paging method to do it. Individual processes are still limited to ~3 gig on Linux on 32 bit hardware though, so the extra mem will almost certainly spend its time as kernel cache. Not that you'd want a single process to grow that large anyway. True :-) Especially a pgsql backend. So what is the ceiling on 32-bit processors for RAM? Most of the 64-bit vendors are pushing Athlon64 and G5 as breaking the 4GB barrier, and even I can do the math on 2^32. All these 64-bit vendors, then, are talking about the limit on ram *per application* and not per machine? I think it's 64 gigs in the current implementation, but that could just be a chip set thing, i.e. the theoretical limit is probably 2^63 or 2^64, but the realistic limitation is that the current mobo chipsets are gonna have a much lower limit, and I seem to recall that being 64 gig last I looked. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Tuning for mid-size server
On Tue, 21 Oct 2003, Josh Berkus wrote: Anjan, From what I know, there is a cache-row-set functionality that doesn't exist with the newer postgres... What? PostgreSQL has always used the kernel cache for queries. Concurrent users will start from 1 to a high of 5000 or more, and could ramp up rapidly. So far, with increased users, we have gone up to starting the JVM (resin startup) with 1024megs min and max (recommended by Sun) - on the app side. Well, just keep in mind when tuning that your calculations should be based on *available* RAM, meaning RAM not used by Apache or the JVM. With that many concurrent requests, you'll want to be *very* conservative with sort_mem; I might stick to the default of 1024 if I were you, or even lower it to 512k. Exactly. Remember, Anjan, that if you have a single sort that can't fit in RAM, it will use the hard drive for temp space, effectively swapping on its own. If the concurrent sorts run the server out of memory, the server will start swapping processes, quite possibly the sorts, in a sort of hideous round robin death spiral that will bring your machine to its knees at the worst possible time, midday under load. sort_mem is one of the small foot guns in the postgresql.conf file that people tend to pick up and go huh, what's this do? right before cranking it up. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] RAID controllers etc... was: PostgreSQL data on a NAS device ?
On 21 Oct 2003, Will LaShell wrote: On Tue, 2003-10-21 at 08:40, scott.marlowe wrote: SNIP So that brings up my question, which is better, the Perc4 or Perc3 controllers, and what's the difference between them? I find Dell's tendency to hide other people's hardware behind their own model numbers mildly bothersome, as it makes it hard to comparison shop. Perc4 has an LSI 1030 chip http://docs.us.dell.com/docs/storage/perc4di/en/ug/features.htm Perc3, depending on the model, can be a couple of things but I think they are all U160 controllers and not U320. Thanks. I googled around and found this page: http://www.domsch.com/linux/ Which says what each model is. It looks like the RAID controller they wanna charge me for is about $500 or so, so I'm guessing it's the medium range Elite 1600 type controller, i.e. U160, which is plenty for the machine / drive number we'll be using. Has anyone played around with the latest ones to get a feel for them? I want a battery backed controller that runs well under linux and also BSD that isn't gonna break the bank. I'd heard bad stories about the performance of the Adaptec RAID controllers, but it seems the newer ones aren't bad from what I've found googling. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Tuning for mid-size server
On Tue, 21 Oct 2003, Andrew Sullivan wrote: On Tue, Oct 21, 2003 at 11:51:02AM -0700, Josh Berkus wrote: Of course, if you have anecdotal evidence to the contrary, then the only way to work this would be to have OSDL help us sort it out. Nope. I too have such anecdotal evidence that 25% is way too high. It also seems to depend pretty heavily on what you're trying to optimise for and what platform you have. But I'm glad to hear (again) that people seem to think the 25% is too high for most cases. I don't feel so much like I'm tilting against windmills. I think where it makes sense is when you have something like a report server where the result sets may be huge, but the parallel load is low, i.e. 5 or 10 users tossing around 100 Meg or more at a time. If you've got 5,000 users running queries that are indexed and won't be using that much memory each, then there's usually no advantage to going over a certain number of buffers, and that certain number may be as low as 1000 for some applications. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL data on a NAS device ?
On Mon, 20 Oct 2003, Alexander Priem wrote: Hi all, Does anyone have any experience with putting PostgreSQL data on a NAS device? I am asking this because a NAS device is much cheaper to set up than a couple of SCSI disks. I would like to use a relatively cheap NAS device which uses four IDE drives (7.200 rpm), like the Dell PowerVault 725N. The disks themselves would be much slower than SCSI disks, I know, but a NAS device can be equipped with 3 Gb of memory, so this would make a very large disk cache, right? If this NAS would be dedicated only to PostgreSQL, would this be slower/faster than a SCSI RAID-10 setup of 6 disks? It would be much cheaper... Any advice on this would be appreciated :) How important is this data? With a local SCSI RAID controller and SCSI drives, you can pull the power cord out the back of the machine during 1000 transactions, and your database will come back up in a coherent state. If you need that kind of reliability, then you'll likely want to use local SCSI drives. Note that you should test your setup to be sure, i.e. pull the network cord and see how the machine recovers (if the machine recovers). Running storage on a NAS is a bit of a tightrope act with your data, as is using IDE drives with write cache enabled. But depending on your application, using NAS may be a good solution. So, what's this database gonna be used for? ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] further testing on IDE drives
On Tue, 14 Oct 2003, Tom Lane wrote: scott.marlowe [EMAIL PROTECTED] writes: open_sync was WAY faster at this than the other two methods. Do you not have open_datasync? That's the preferred method if available. Nope, when I try to start postgresql with it set to that, I get this error message: FATAL: invalid value for wal_sync_method: open_datasync This is on RedHat 9, but I have the same problem on a RH 7.2 box as well. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] [PERFORM] sql performance and cache
On Tue, 14 Oct 2003, Wei Weng wrote: On Sat, 11 Oct 2003, Christopher Kings-Lynne wrote: I have two very similar queries which I need to execute. They both have exactly the same from / where conditions. When I execute the first, it takes about 16 seconds. The second is executed almost immediately after, it takes 13 seconds. In short, I'd like to know why the query result isn't being cached and any ideas on how to improve the execution. snip OK - so I could execute the query once, and get the maximum size of the array and the result set in one. I know what I am doing is less than optimal but I had expected the query results to be cached. So the second execution would be very quick. So why aren't they? I have increased my cache size - shared_buffers is 2000 and I have doubled the default max_fsm... settings (although I am not sure what they do). sort_mem is 8192. PostgreSQL does not have, and has never had a query cache - so nothing you do is going to make that second query faster. Perhaps you are confusing it with the MySQL query cache? Chris Is there a plan to develop one (a query cache)? Not really, Postgresql's design makes it a bit of a non-winner. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] further testing on IDE drives
On Thu, 9 Oct 2003, Bruce Momjian wrote: scott.marlowe wrote: I was testing to get some idea of how to speed up pgbench with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 /dev/hdx). The only parameter that seemed to make a noticeable difference was setting wal_sync_method = open_sync. With it set to either fsync, or fdatasync, the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps. With open_sync it jumped to the range of 45 to 52 tps. With write cache on I was getting 280 to 320 tps. So, now instead of being 20 to 30 times slower, I'm only about 5 times slower, much better. Now I'm off to start a pgbench -c 10 -t 1 and pull the power cord and see if the data gets corrupted with write caching turned on, i.e. do my hard drives have the ability to write at least some of their cache during spin down. Is this a reason we should switch to open_sync as a default, if it is available, rather than fsync? I think we are doing a single write before fsync a lot more often than we are doing multiple writes before fsync. Sounds reasonable to me. Are there many / any scenarios where a plain fsync would be faster than open_sync? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] further testing on IDE drives
On Fri, 10 Oct 2003, Josh Berkus wrote: Bruce, Yes. If you were doing multiple WAL writes before transaction fsync, you would be fsyncing every write, rather than doing two writes and fsync'ing them both. I wonder if larger transactions would find open_sync slower? Want me to test? I've got an ide-based test machine here, and the TPCC databases. Just make sure the drive's write cache is disabled. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] further testing on IDE drives
On Fri, 10 Oct 2003, Josh Berkus wrote: Bruce, Yes. If you were doing multiple WAL writes before transaction fsync, you would be fsyncing every write, rather than doing two writes and fsync'ing them both. I wonder if larger transactions would find open_sync slower? Want me to test? I've got an ide-based test machine here, and the TPCC databases. OK, I decided to do a quick dirty test of things that are big transactions in each mode my kernel supports. I did this: createdb dbname time pg_dump -O -h otherserver dbname|psql dbname then I would drop the db, edit postgresql.conf, and restart the server. open_sync was WAY faster at this than the other two methods.
open_sync:
1st run:  real 11m27.107s   user 0m26.570s   sys 0m1.150s
2nd run:  real 6m5.712s     user 0m26.700s   sys 0m1.700s
fsync:
1st run:  real 15m8.127s    user 0m26.710s   sys 0m0.990s
2nd run:  real 15m8.396s    user 0m26.990s   sys 0m1.870s
fdatasync:
1st run:  real 15m47.878s   user 0m26.570s   sys 0m1.480s
2nd run:  real 15m9.402s    user 0m27.000s   sys 0m1.660s
I did the first runs in order, then started over, i.e. opensync run1, fsync run1, fdatasync run1, opensync run2, etc... The machine I was restoring to was under no other load. The machine I was reading from had little or no load, but is a production server, so it's possible the load there could have had a small effect, but probably not this big of a one. The machine this is on is set up so that the data partition is on a drive with write cache enabled, but the pg_xlog and pg_clog directories are on a drive with write cache disabled. Same drive models as listed before in my previous test, Seagate generic 80gig IDE drives, model ST380023A. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Linux filesystem shootout
On Thu, 9 Oct 2003, Shridhar Daithankar wrote: Kaarel wrote: http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html Shridhar I feel incompetent when it comes to file systems. Yet everybody would like to have the best file system if given the choice...so do I :) Here I am looking at those tables seeing JFS having more green cells than others. The more green the better right? So based on these tests JFS ought to be the one?

Yes and no. Yes for the results. No for the tests that weren't run. Database load is quite different. It's a mixture of read and write load, with dynamics varying from one extreme to the other and everything in between. All it says is that if you want to choose a good file system for postgresql, look at JFS first..:-) Besides, all the tests were done on files bigger than 1GB. If single file size is restricted to 1GB, it might produce a different result set. And postgresql does not exceed the 1GB limit per file. So still, quite a few unknowns there..

Absolutely. For instance, one file system may be faster on a RAID card with battery backed cache, while another may be faster on an IDE drive with write cache disabled, while another may be faster on software RAID1, while another might be faster on software RAID5. If you haven't tested different file systems on your setup, you don't really know which will be faster until you do.

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] PostgreSQL vs MySQL
On Thu, 9 Oct 2003, Jeff wrote: On Thu, 9 Oct 2003, David Griffiths wrote: 1) the MySQL docs are better (sorry - I found them easier to read, and more comprehensive; I had an easier time finding the answers I needed) Huh. I had the opposite experience. Each to his own. I think everybody agrees PG needs a better tuning doc (or pointers to it, or something). I think the issue is that Postgresql documentation is oriented towards DBA types, who already understand databases in general, so they can find what they want, while MySQL docs are oriented towards dbms newbies, who don't know much, if anything, about databases. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Shopping for hardware
On Mon, 6 Oct 2003, Jason Hihn wrote: Ok, I asked this on [novice], but I was told it'd be better to post it here... I've got some money to spend on new servers. The biggest concern is the PostgreSQL database server that will be the backbone of the company. (*Everyone* uses the database server in some form or another) I'm looking for hot-swappable RAID 1 on a Linux platform at the least. Are there any vendors to avoid or prefer? What works best? Am I better off going with a DIY or getting something pre-packaged?

Depends on your hardware expertise. You can do quite well either way. I prefer adding my own components to a pre-built vanilla server.

In terms of numbers, we expect to have an average of 100 active connections (most of which are idle 9/10ths of the time), with about 85% reading traffic. I hope to have one server host about 1000-2000 active databases, with the largest being about 60 meg (no blobs). Inactive databases will only be for reading (archival) purposes, and will seldom be accessed. (I could probably move them off to another server with a r/o disk...) That's not a really big load, but I'm guessing the peaks will be big enough to notice. Does any of this represent a problem for Postgres? The datasets are typically not that large, only a few queries on a few databases ever return over 1000 rows.

Nah, this is pretty normal stuff for Postgresql or any other database in its approximate class (Sybase, Oracle, Informix, DB2, MSSQL2k).

The configuration that is going on in my head is: RAID 1, 200gig disks; 1 server, 4g ram; Linux 2.4 or 2.6 (depends on when we deploy and 2.6's track record at that time)

That's a good starting point. I'd avoid 2.6 until it's had time for the bugs to drop out. The latest 2.4 kernels are pretty stable. List of things to include if you need more performance, in order of priority:
1: proper tuning of the postgresql.conf file (see http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html)
2: a hardware RAID card with battery backed cache, the bigger the cache the better
3: more drives for RAID 1+0
4: faster CPUs
Since you've already got 4 gigs of RAM slated, you're set there on linux, where having more won't likely help a lot unless you go to a 64 bit platform.

I want something that can do hot-swaps and auto-mirroring after swap. Unfortunately, this is a new area for me. (I normally stick to S/W for non-high end systems)

The LSI/Megaraid cards can handle hot swaps quite well; make sure you get the right kind of hot swap shoes so they isolate the drive from the bus when you turn it off and they don't lock up your SCSI bus.

---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] TPC-R benchmarks
Have you tried increasing the statistics target for those columns that are getting bad estimates yet and then turning back on enable_nestloop and rerunning analyze and seeing how the query does? The idea being to try and get a good enough estimate of your statistics so the planner stops using nestloops on its own rather than forcing it to with enable_nestloop = false. On Thu, 2 Oct 2003, Oleg Lebedev wrote: As Scott recommended, I did the following: # set enable_nestloop = false; # vacuum full analyze; After this I re-ran the query and its execution time went down from 2 hours to 2 minutes. I attached the new query plan to this posting. Is there any way to optimize it even further? What should I do to make this query run fast without hurting the performance of the other queries? Thanks. Oleg -Original Message- From: scott.marlowe [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 01, 2003 4:00 PM To: Oleg Lebedev Cc: Josh Berkus; [EMAIL PROTECTED] Subject: Re: [PERFORM] TPC-R benchmarks For troubleshooting, can you try it with set enable_nestloop = false and rerun the query and see how long it takes? It looks like the estimates of rows returned is WAY off (estimate is too low compared to what really comes back.) Also, you might try to alter the table.column to have a higher target on the rows p_partkey and ps_partkey and any others where the estimate is so far off of the reality. On Wed, 1 Oct 2003, Oleg Lebedev wrote: All right, my query just finished running with EXPLAIN ANALYZE. I show the plan below and also attached it as a file. Any ideas? - Sort (cost=54597.49..54597.50 rows=1 width=121) (actual time=6674562.03..6674562.15 rows=175 loops=1) Sort Key: nation.n_name, date_part('year'::text, orders.o_orderdate) - Aggregate (cost=54597.45..54597.48 rows=1 width=121) (actual time=6668919.41..6674522.48 rows=175 loops=1) - Group (cost=54597.45..54597.47 rows=3 width=121) (actual time=6668872.68..6672136.96 rows=348760 loops=1) - Sort (cost=54597.45..54597.46 rows=3 width=121) (actual time=6668872.65..6669499.95 rows=348760 loops=1) Sort Key: nation.n_name, date_part('year'::text, orders.o_orderdate) - Hash Join (cost=54596.00..54597.42 rows=3 width=121) (actual time=6632768.89..6650192.67 rows=348760 loops=1) Hash Cond: (outer.n_nationkey = inner.s_nationkey) - Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=6.75..7.13 rows=25 loops=1) - Hash (cost=54596.00..54596.00 rows=3 width=88) (actual time=6632671.96..6632671.96 rows=0 loops=1) - Nested Loop (cost=0.00..54596.00 rows=3 width=88) (actual time=482.41..6630601.46 rows=348760 loops=1) Join Filter: (inner.s_suppkey = outer.l_suppkey) - Nested Loop (cost=0.00..54586.18 rows=3 width=80) (actual time=383.87..6594984.40 rows=348760 loops=1) - Nested Loop (cost=0.00..54575.47 rows=4 width=68) (actual time=199.95..3580882.07 rows=348760 loops=1) Join Filter: (outer.p_partkey = inner.ps_partkey) - Nested Loop (cost=0.00..22753.33 rows=9343 width=49) (actual time=146.85..3541433.10 rows=348760 loops=1) - Seq Scan on part (cost=0.00..7868.00 rows=320 width=4) (actual time=33.64..15651.90 rows=11637 loops=1) Filter: (p_name ~~ '%green%'::text) - Index Scan using i_l_partkey on lineitem (cost=0.00..46.15 rows=29 width=45) (actual time=10.71..302.67 rows=30 loops=11637) Index Cond: (outer.p_partkey = lineitem.l_partkey) - Index Scan using pk_partsupp on partsupp (cost=0.00..3.39 rows=1 width=19) (actual time=0.09..0.09 rows=1 loops=348760) Index Cond: ((partsupp.ps_partkey = outer.l_partkey) AND (partsupp.ps_suppkey = 
outer.l_suppkey)) - Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=12) (actual time=8.62..8.62 rows=1 loops=348760) Index Cond: (orders.o_orderkey = outer.l_orderkey) - Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.08
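A sketch of the sequence being suggested in this thread. The table and column names (part.p_partkey, partsupp.ps_partkey) are the ones called out above; the target value of 100 is only an example, not a recommendation:

  -- raise the statistics target on the columns whose row estimates are badly off
  ALTER TABLE part ALTER COLUMN p_partkey SET STATISTICS 100;
  ALTER TABLE partsupp ALTER COLUMN ps_partkey SET STATISTICS 100;

  -- rebuild the statistics with the new targets
  ANALYZE part;
  ANALYZE partsupp;

  -- put the planner back to normal and see whether it now avoids the nested loop on its own
  SET enable_nestloop = true;
  EXPLAIN ANALYZE SELECT ... ;  -- the TPC-R query from earlier in the thread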
[PERFORM] further testing on IDE drives
I was testing to get some idea of how to speed up pgbench with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 /dev/hdx). The only parameter that seemed to make a noticeable difference was setting wal_sync_method = open_sync. With it set to either fsync or fdatasync, the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps. With open_sync it jumped to the range of 45 to 52 tps. With the write cache on I was getting 280 to 320 tps. So, now instead of being 20 to 30 times slower, I'm only about 5 times slower, much better. Now I'm off to start a pgbench -c 10 -t 1 and pull the power cord and see if the data gets corrupted with write caching turned on, i.e. do my hard drives have the ability to write at least some of their cache during spin down. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] TPC-R benchmarks
On Thu, 2 Oct 2003, Oleg Lebedev wrote: I was trying to get the pg_stats information to Josh and decided to recreate the indexes on all my tables. After that I ran vacuum full analyze, re-enabled nestloop and ran explain analyze on the query. It ran in about 2 minutes. I attached the new query plan. I am not sure what did the trick, but 2 minutes is much better than 2 hours. But then again, I can't take long lunches anymore :) Is there any way to make this query run even faster without increasing the memory dedicated to postgres? Thanks.

As long as the estimated row counts and real ones match up, and postgresql seems to be picking the right plan, there's probably not a lot to be done. You might want to look at increasing sort_mem a bit, but don't go crazy, as being too high can result in swap storms under load, which are a very bad thing.

I'd check for index growth. You may have been reloading your data over and over and had an index growth problem. Next time, instead of recreating the indexes completely, you might wanna try reindex indexname. Also, 7.4 mostly fixes the index growth issue, especially as it applies to truncating/reloading a table over and over, so moving to 7.4 beta3/4 and testing might be a good idea (if you aren't there already).

What you want to avoid is having postgresql switch back to that nestloop join on you in the middle of the day, and to prevent that you might need to have higher statistics targets so the planner gets the right number all the time.

---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
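A sketch of the index-growth check and the reindex suggestion above. The index and table names are taken from the plans quoted earlier in this thread and are only examples:

  -- current index size in 8k pages (relpages is refreshed by VACUUM/ANALYZE)
  SELECT relname, relpages FROM pg_class WHERE relname = 'i_l_partkey';

  -- rebuild just that index in place, instead of dropping and recreating it
  REINDEX INDEX i_l_partkey;
  -- or rebuild every index on the table in one go
  REINDEX TABLE lineitem;

  -- refresh the stats and compare; a big drop in relpages means the index had bloated
  ANALYZE lineitem;
  SELECT relname, relpages FROM pg_class WHERE relname = 'i_l_partkey';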
Re: [PERFORM] Optimizing >= and <= for numbers and dates
On Wed, 1 Oct 2003, Dimitri Nagiev wrote: here goes the EXPLAIN ANALYZE output:

template1=# VACUUM analyze mytable;
VACUUM
template1=# explain analyze select * from mytable where mydate>='2003-09-01';
QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on mytable (cost=0.00..2209.11 rows=22274 width=562) (actual time=0.06..267.30 rows=22677 loops=1)
   Filter: (mydate >= '2003-09-01'::date)
 Total runtime: 307.71 msec
(3 rows)

How many rows are there in this table? If the number is only two or three times as many as the number of rows returned (22677) then a seq scan is preferable. The way to tune your random_page_cost is to keep making your range more selective until you get an index scan. Then, see what the difference is in speed between the two queries that sit on either side of that number, i.e. if a query that returns 1000 rows switches to index scan, and takes 100 msec, while one that returns 1050 uses seq scan and takes 200 msec, then you might want to lower your random page cost. Ideally, what should happen is that as the query returns more and more rows, the switch to seq scan should happen so that it's taking about the same amount of time as the index scan, maybe just a little more.

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
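A sketch of that tuning loop. mytable and mydate come from the post above; the cutoff dates and the value 2 are made-up illustrations, not recommendations:

  -- narrow the range until the planner flips from a seq scan to an index scan
  EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-25';
  EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-01';

  -- if the index-scan side of that crossover is much faster than the seq-scan side,
  -- try a lower random_page_cost for the session and re-check where the flip happens
  SET random_page_cost = 2;
  EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-25';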
Re: [PERFORM] Optimizing >= and <= for numbers and dates
Oh, to followup on my previously sent post, make sure you've got effective_cache_size set right BEFORE you go trying to set random_page_cost, and you might wanna run a select * from table to load the table into kernel buffer cache before testing, then also test it with the cache cleared out (select * from a_different_really_huge_table will usually do that.) ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
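In practice that means bracketing each timing run with something like the following; mytable and a_different_really_huge_table are the same placeholder names used above:

  -- check this first; it influences how willing the planner is to pick index scans
  SHOW effective_cache_size;

  -- warm test: read the table once so it sits in the kernel cache, then time the query
  SELECT count(*) FROM mytable;
  EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-01';

  -- cold test: scan something much bigger to push mytable back out of cache, then re-time
  SELECT count(*) FROM a_different_really_huge_table;
  EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-01';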
Re: [PERFORM] TPC-R benchmarks
For troubleshooting, can you try it with set enable_nestloop = false and rerun the query and see how long it takes? It looks like the estimates of rows returned is WAY off (estimate is too low compared to what really comes back.) Also, you might try to alter the table.column to have a higher target on the rows p_partkey and ps_partkey and any others where the estimate is so far off of the reality. On Wed, 1 Oct 2003, Oleg Lebedev wrote: All right, my query just finished running with EXPLAIN ANALYZE. I show the plan below and also attached it as a file. Any ideas? - Sort (cost=54597.49..54597.50 rows=1 width=121) (actual time=6674562.03..6674562.15 rows=175 loops=1) Sort Key: nation.n_name, date_part('year'::text, orders.o_orderdate) - Aggregate (cost=54597.45..54597.48 rows=1 width=121) (actual time=6668919.41..6674522.48 rows=175 loops=1) - Group (cost=54597.45..54597.47 rows=3 width=121) (actual time=6668872.68..6672136.96 rows=348760 loops=1) - Sort (cost=54597.45..54597.46 rows=3 width=121) (actual time=6668872.65..6669499.95 rows=348760 loops=1) Sort Key: nation.n_name, date_part('year'::text, orders.o_orderdate) - Hash Join (cost=54596.00..54597.42 rows=3 width=121) (actual time=6632768.89..6650192.67 rows=348760 loops=1) Hash Cond: (outer.n_nationkey = inner.s_nationkey) - Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=6.75..7.13 rows=25 loops=1) - Hash (cost=54596.00..54596.00 rows=3 width=88) (actual time=6632671.96..6632671.96 rows=0 loops=1) - Nested Loop (cost=0.00..54596.00 rows=3 width=88) (actual time=482.41..6630601.46 rows=348760 loops=1) Join Filter: (inner.s_suppkey = outer.l_suppkey) - Nested Loop (cost=0.00..54586.18 rows=3 width=80) (actual time=383.87..6594984.40 rows=348760 loops=1) - Nested Loop (cost=0.00..54575.47 rows=4 width=68) (actual time=199.95..3580882.07 rows=348760 loops=1) Join Filter: (outer.p_partkey = inner.ps_partkey) - Nested Loop (cost=0.00..22753.33 rows=9343 width=49) (actual time=146.85..3541433.10 rows=348760 loops=1) - Seq Scan on part (cost=0.00..7868.00 rows=320 width=4) (actual time=33.64..15651.90 rows=11637 loops=1) Filter: (p_name ~~ '%green%'::text) - Index Scan using i_l_partkey on lineitem (cost=0.00..46.15 rows=29 width=45) (actual time=10.71..302.67 rows=30 loops=11637) Index Cond: (outer.p_partkey = lineitem.l_partkey) - Index Scan using pk_partsupp on partsupp (cost=0.00..3.39 rows=1 width=19) (actual time=0.09..0.09 rows=1 loops=348760) Index Cond: ((partsupp.ps_partkey = outer.l_partkey) AND (partsupp.ps_suppkey = outer.l_suppkey)) - Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=12) (actual time=8.62..8.62 rows=1 loops=348760) Index Cond: (orders.o_orderkey = outer.l_orderkey) - Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.08..0.08 rows=1 loops=348760) Index Cond: (outer.ps_suppkey = supplier.s_suppkey) Total runtime: 6674724.23 msec (28 rows) -Original Message- From: Oleg Lebedev Sent: Wednesday, October 01, 2003 12:00 PM To: Josh Berkus; scott.marlowe Cc: [EMAIL PROTECTED] Subject: Re: [PERFORM] TPC-R benchmarks Importance: Low Sure, below is the query. I attached the plan to this posting. select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey
Re: [PERFORM] advice on raid controller
I've used the megaraid / LSI cards in the past and they were pretty good in terms of reliability, but the last one I used was the 328 model, from 4 years ago or so. That one had a battery backup option for the cache, and could go to 128 Meg. We tested it with 4/16 and 128 meg ram, and it was about the same with each, but we didn't do heavy parallel testing either. Here's the page on the megaraid cards at lsilogic.com: http://www.lsilogic.com/products/stor_prod/raid/ultra320products.html

On Sun, 28 Sep 2003, Matt Clark wrote: As others have mentioned, you really ought to get battery-backed cache if you're doing any volume of writes. The ability to do safe write-back caching makes an *insane* difference to write performance. The site you link to also has that for only 15% more money: http://uk.azzurri.com/product/product.cgi?productId=80 No experience with the card(s) I'm afraid. In general though, U320 will only be faster than U160 for large sequential reads, or when you have silly numbers of disks on a channel (i.e. more than 4/channel). If you have silly numbers of disks, then RAID5 will probably be better, if you have 4 disks total then RAID1+0 will probably be better. In between it depends on all sorts of other factors. Bear in mind though that if you *do* have silly numbers of disks then more channels and more cache will count for more than anything else, so spend the money on that rather than latest-and-greatest performance for a single channel. HTH Matt

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Richard Jones Sent: 27 September 2003 18:25 To: [EMAIL PROTECTED] Subject: [PERFORM] advice on raid controller Hi, i'm on the verge of buying a MegaRAID SCSI 320-2 raid controller. I need it to build a db server using 4x ultra320 scsi disks i'm thinking raid 1+0 but will try with raid5 too and compare Does anyone have any experience with this model, good or bad i'd like to know.. thanks :) as seen: http://uk.azzurri.com/product/product.cgi?productId=188 Regards, Richard. PS: whoever mentioned starting a site with raid controller reviews, excellent idea - it's hard to find decent info on which card to buy.

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] advice on raid controller
On 29 Sep 2003, Will LaShell wrote: On Mon, 2003-09-29 at 06:48, scott.marlowe wrote: I've used the megaraid / LSI cards in the past and they were pretty good in terms of reliability, but the last one I used was the 328 model, from 4 years ago or so. That one had a battery backup option for the cache, and could go to 128 Meg. We tested it with 4/16 and 128 meg ram, and it was about the same with each, but we didn't do heavy parallel testing either. Here's the page on the megaraid cards at lsilogic.com: http://www.lsilogic.com/products/stor_prod/raid/ultra320products.html On Sun, 28 Sep 2003, Matt Clark wrote: As others have mentioned, you really ought to get battery-backed cache if you're doing any volume of writes. The ability to do safe write-back caching makes an *insane* difference to write performance. The site you link to also has that for only 15% more money: http://uk.azzurri.com/product/product.cgi?productId=80 No experience with the card(s) I'm afraid. In general though, U320 will only be faster than U160 for large sequential reads, or when you have silly numbers of disks on a channel (i.e. more than 4/channel). If you have silly numbers of disks, then RAID5 will probably be better, if you have 4 disks total then RAID1+0 will probably be better. In between it depends on all sorts of other factors. Bear in mind though that if you *do* have silly numbers of disks then more channels and more cache will count for more than anything else, so spend the money on that rather than latest-and-greatest performance for a single channel.

Just to add my thoughts, we use the MegaRaid Elite 1650 in 3 servers here that drive our core databases. We paired up the controllers with the Seagate Cheetah 10k drives, we could have purchased the X15's which are Seagate's 15k version, but due to budget constraints and lack of true performance increase from the 10k to the 15k rpm drives we didn't opt for them. I have to say that I've been 100% pleased with the performance and reliability of the Megaraid controllers. They do everything a good controller should and they are very easy to manage. The driver is actively maintained by the guys at LSI and their tech support personnel are very good as well. If you want any specific information or have any questions about our experience or configuration please feel free to contact me.

To add one more feature the LSI/MegaRAIDs have that I find interesting: you can put two in a machine, build a RAID0 or 5 on each card, then mirror the two cards together, and should one card / RAID0 or 5 chain die, the other card will keep working. I.e. they work like one big card with failover.

---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] Is there a reason _not_ to vacuum continuously?
On Wed, 17 Sep 2003, Matt Clark wrote: *** THE QUESTION(S) *** Is there any reason for me not to run continuous sequential vacuum analyzes? At least for the 6 tables that see a lot of updates? I hear 10% of tuples updated as a good time to vac-an, but does my typical count of 3 indexes per table affect that? Generally, the only time continuous vacuuming is a bad thing is when you are I/O bound. If you are CPU bound, then continuous vacuuming is usually acceptable. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
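If you do go the continuous route, the usual form is a plain (non-FULL) vacuum of just the heavily updated tables, looped or fired from cron; the table names below are placeholders standing in for the six busy tables mentioned above:

  -- plain VACUUM ANALYZE doesn't lock out readers or writers the way VACUUM FULL does,
  -- so it can run alongside normal traffic; just keep an eye on the extra I/O it generates
  VACUUM ANALYZE orders;
  VACUUM ANALYZE order_items;
  VACUUM ANALYZE sessions;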
Re: [PERFORM] Inconsistent performance
On Mon, 15 Sep 2003, Joseph Bove wrote: Stephan, I've run explain analyze a number of times and have gotten results between 5.5 and 7.5 seconds Attached is a typical output QUERY PLAN - Aggregate (cost=9993.92..9993.92 rows=1 width=0) (actual time=7575.59..7575.59 rows=1 loops=1) - Seq Scan on vetapview (cost=0.00..9771.34 rows=89034 width=0) (actual time=0.06..7472.20 rows=88910 loops=1) Total runtime: 7575.67 msec (3 rows) The only things changing are the actual time. The costs are constant. The relpages from pg_class for vetapview (the table in question) is 8881. At the end of this message is the exhaustive contents of postgresql.conf. The only settings I have attempted tuning are as follows: tcpip_socket = true max_connections = 100 shared_buffers = 5000 sort_mem = 8192 fsync = false A couple of things. 1: Is there an index on the parts of the query used for the where clause? 2: What is your effect_cache_size set to? It needs to be set right for your postgresql server to be able to take advantage of the kernel's cache (i.e. use an index scan when the kernel is likely to have that data in memory.) ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Inconsistent performance
On Mon, 15 Sep 2003, scott.marlowe wrote: On Mon, 15 Sep 2003, Joseph Bove wrote: Stephan, I've run explain analyze a number of times and have gotten results between 5.5 and 7.5 seconds Attached is a typical output QUERY PLAN - Aggregate (cost=9993.92..9993.92 rows=1 width=0) (actual time=7575.59..7575.59 rows=1 loops=1) - Seq Scan on vetapview (cost=0.00..9771.34 rows=89034 width=0) (actual time=0.06..7472.20 rows=88910 loops=1) Total runtime: 7575.67 msec (3 rows) The only things changing are the actual time. The costs are constant. The relpages from pg_class for vetapview (the table in question) is 8881. At the end of this message is the exhaustive contents of postgresql.conf. The only settings I have attempted tuning are as follows: tcpip_socket = true max_connections = 100 shared_buffers = 5000 sort_mem = 8192 fsync = false A couple of things. 1: Is there an index on the parts of the query used for the where clause? 2: What is your effect_cache_size set to? It needs to be set right for your postgresql server to be able to take advantage of the kernel's cache (i.e. use an index scan when the kernel is likely to have that data in memory.) Sorry, that should be effective_cache_size, not effect_cache_size. It's set in 8k blocks and is usually about how much buffer / cache you have left over after the machines settles after being up and running for a while. Fer instance, on my server, I show 784992K cache, and 42976K buff under top, so, that's 827968k/8k=103496 blocks. Note that if you've recompiled you may have somehow set block size larger, but installations with postgresql block sizes ~=8k are pretty uncommon, and you'd know if you had done that, so it's probably 8k blocks. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
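Spelling that arithmetic out (the cache and buff figures are the ones from the top output quoted above; substitute your own):

  -- kB of kernel cache plus buffers, divided by the 8 kB block size, gives the value in pages
  SELECT (784992 + 42976) / 8 AS effective_cache_size_pages;  -- 103496

  -- then in postgresql.conf:
  --   effective_cache_size = 103496
  -- and after a reload, confirm the server picked it up:
  SHOW effective_cache_size;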
Re: [PERFORM] [GENERAL] how to get accurate values in pg_statistic
On Thu, 11 Sep 2003, Tom Lane wrote: Christopher Browne [EMAIL PROTECTED] writes: The right answer for most use seems likely to involve: a) Getting an appropriate number of bins (I suspect 10 is a bit small, but I can't justify that mathematically), and I suspect that also, but I don't have real evidence for it either. We've heard complaints from a number of people for whom it was indeed too small ... but that doesn't prove it's not appropriate in the majority of cases ... Does the sample size change if you increase the number of bins? Yes, read the comments in backend/commands/analyze.c. Do we also need a parameter to control sample size? Not if the paper I read before writing that code is correct.

I was just talking to a friend of mine who does statistical analysis, and he suggested a different way of looking at this. I know little of analyze.c, but I'll be reading it some today. His theory was that we can figure out the number of target bins by basically running analyze twice with two different random seeds, and initially setting the bins to 10. Then, compare the variance of the two runs. If the variance is great, increase the target by X, and run two again. Repeat, wash, rinse, until the variance drops below some threshold. I like the idea, but I'm not at all sure if it's practical for Postgresql to implement.

---(end of broadcast)--- TIP 8: explain analyze is your friend
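A manual, low-tech version of that variance experiment can be run today just by re-running ANALYZE and watching how much the stored statistics wobble between runs (each run takes a fresh random sample). The table and column names below are placeholders, and the targets 10 and 100 are only examples:

  -- two passes at a small target; compare how much the numbers move between them
  SET default_statistics_target = 10;
  ANALYZE mytable;
  SELECT n_distinct, most_common_freqs FROM pg_stats
   WHERE tablename = 'mytable' AND attname = 'mycol';
  ANALYZE mytable;
  SELECT n_distinct, most_common_freqs FROM pg_stats
   WHERE tablename = 'mytable' AND attname = 'mycol';

  -- then repeat with a bigger target; when the numbers stop jumping around, it's big enough
  SET default_statistics_target = 100;
  ANALYZE mytable;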
Re: [PERFORM] slow plan for min/max
On Sun, 7 Sep 2003, Pailloncy Jean-Gérard wrote: Asking a question about why max(id) is so much slower than select id order by id desc limit 1, Pailloncy said: I ask for the same thing. That's better! This is a frequently asked question about something that isn't likely to change any time soon. Basically, Postgresql uses an MVCC locking system that makes massively parallel operation possible, but costs in certain areas, and one of those areas is aggregate performance over large sets. MVCC makes it very hard to optimize all but the simplest of aggregates, and even those optimizations which are possible would wind up being quite ugly at the parser level. You might want to search the archives in the last couple years for this subject, as it's come up quite often. ---(end of broadcast)--- TIP 8: explain analyze is your friend
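For reference, the rewrite being discussed looks like this; mytable and id are placeholder names, and an ordinary btree index on id is assumed:

  -- under 7.x this scans every visible row, since the aggregate can't use the index
  SELECT max(id) FROM mytable;

  -- same answer, but the planner can walk the index backwards and stop at the first row
  SELECT id FROM mytable ORDER BY id DESC LIMIT 1;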
Re: [PERFORM] slow plan for min/max
On Mon, 8 Sep 2003, Neil Conway wrote: On Mon, 2003-09-08 at 11:56, scott.marlowe wrote: Basically, Postgresql uses an MVCC locking system that makes massively parallel operation possible, but costs in certain areas, and one of those areas is aggregate performance over large sets. MVCC makes it very hard to optimize all but the simplest of aggregates, and even those optimizations which are possible would wind up being quite ugly at the parser level. As was pointed out in a thread a couple days ago, MIN()/MAX() optimization has absolutely nothing to do with MVCC. It does, however, make optimizing COUNT() more difficult. Not exactly. While max(id) is easily optimized by query replacement, more complex aggregates will still have performance issues that would not be present in a row locking database. i.e. max((field1/field2)*field3) is still going to cost more to process, isn't it? ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] SELECT's take a long time compared to other DBMS
On Wed, 3 Sep 2003, Relaxin wrote: I have a table with 102,384 records in it, each record is 934 bytes. Using the following select statement: SELECT * from table PG Info: version 7.3.4 under cygwin on Windows 2000 ODBC: version 7.3.100 Machine: 500 MHz / 512MB RAM / IDE HDD Under PG: Data is returned in 26 secs!! Under SQL Server: Data is returned in 5 secs. Under SQLBase: Data is returned in 6 secs. Under SAPDB: Data is returned in 7 secs.

This is typical of postgresql under cygwin, it's much faster under a Unix OS like Linux or BSD. That said, you CAN do some things to help speed it up, the biggest being tuning the shared_buffers to be something large enough to hold a fair bit of data. Set the shared_buffers to 1000, restart, and see if things get better. Running Postgresql in a unix emulation layer is guaranteed to make it slow. If you've got a spare P100 with 128 Meg of RAM you can throw redhat 9 or FreeBSD 4.7 on and run Postgresql on it, it will likely outrun your 500 MHz cygwin box, and might even keep up with the other databases on that machine as well.

---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Selecting random rows efficiently
Can you just create an extra serial column and make sure that one is always in order and has no holes in it? (i.e. a nightly process, etc...) If so, then something like this truly flies: select * from accounts where aid = (select cast(floor(random()*100000)+1 as int)); My times on it on a 100,000 row table are 1 millisecond. Note that you have to have a hole-free sequence AND know how many rows there are, but if you can meet those needs, this is screamingly fast.

On Sat, 30 Aug 2003, Russell Garrett wrote: Considering that we'd have to index the random field too, it'd be neater in the long term to re-number the primary key. Although, being a primary key, that's foreign-keyed from absolutely everywhere, so that'd probably take an amusingly long time. ...and no we're not from Micronesia, we're from ever so slightly less exotic London. Though Micronesia might be nice... Russ (also from last.fm but without the fancy address)

[EMAIL PROTECTED] wrote: On Sat, 2003-08-30 at 09:01, Rod Taylor wrote: i was hoping there was some trickery with sequences that would allow me to easily pick a random valid sequence number..? I would suggest renumbering the data. ALTER SEQUENCE ... RESTART WITH 1; UPDATE table SET pkey = DEFAULT; Of course, PostgreSQL may have trouble with that update due to evaluation of the unique constraint immediately -- so drop the primary key first, and add it back after. And if there are child tables, they'd all have to be updated, too.

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
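A sketch of the "extra serial column" idea from the top of this message, with made-up names (pick_id, pick_seq); the 100000 literal stands in for the current row count, which you have to track yourself:

  -- one-time setup: a dedicated column for picking, plus an index so the lookup is one probe
  ALTER TABLE accounts ADD COLUMN pick_id integer;
  CREATE UNIQUE INDEX accounts_pick_id_idx ON accounts (pick_id);

  -- nightly renumbering keeps pick_id gap-free from 1..N even as rows come and go
  CREATE SEQUENCE pick_seq;
  UPDATE accounts SET pick_id = nextval('pick_seq');
  DROP SEQUENCE pick_seq;

  -- picking a random row is then a single index lookup
  SELECT * FROM accounts
   WHERE pick_id = (SELECT cast(floor(random()*100000)+1 as int));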
Re: [PERFORM] opinion on RAID choice
On Thu, 28 Aug 2003, Vivek Khera wrote: sm == scott marlowe scott.marlowe writes: sm My experience has been that once you get past 6 disks, RAID5 is faster sm than RAID1+0. Any opinion on stripe size for the RAID? That's more determined by what kind of data you're gonna be handling. If you want to do lots of little financial transactions, then 32k or less is good. If you're gonna store moderately large text fields and such, then going above 32k or 64k is usually a good idea. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Hardware recommendations to scale to silly load
On 27 Aug 2003, matt wrote: I'm wondering if the good people out there could perhaps give me some pointers on suitable hardware to solve an upcoming performance issue. I've never really dealt with these kinds of loads before, so any experience you guys have would be invaluable. Apologies in advance for the amount of info below... My app is likely to come under some serious load in the next 6 months, but the increase will be broadly predictable, so there is time to throw hardware at the problem. Currently I have a ~1GB DB, with the largest (and most commonly accessed and updated) two tables having 150,000 and 50,000 rows. A typical user interaction with the system involves about 15 single-table selects, 5 selects with joins or subqueries, 3 inserts, and 3 updates. The current hardware probably (based on benchmarking and profiling) tops out at about 300 inserts/updates *or* 2500 selects per second. There are multiple indexes on each table that updates and inserts happen on. These indexes are necessary to provide adequate select performance.

Current hardware/software:
Quad 700MHz PIII Xeon/1MB cache
3GB RAM
RAID 10 over 4 18GB/10,000rpm drives
128MB battery backed controller cache with write-back enabled
Redhat 7.3, kernel 2.4.20
Postgres 7.2.3 (stock redhat issue)

I need to increase the overall performance by a factor of 10, while at the same time the DB size increases by a factor of 50. e.g. 3000 inserts/updates or 25,000 selects per second, over a 25GB database with the most used tables having 5,000,000 and 1,000,000 rows.

It will likely take a combination of optimizing your database structure / methods and increasing your hardware / OS performance. You probably, more than anything, should look at some kind of superfast, external storage array that has dozens of drives, and a large battery backed cache. You may be able to approximate this yourself with just a few dual channel Ultra 320 SCSI cards and a couple dozen hard drives. The more spindles you throw at a database, generally speaking, the more parallel load it can handle. You may find that once you get to 10 or 20 drives, RAID 5 or 5+0 or 0+5 will be outrunning 1+0/0+1 due to fewer writes. You likely want to look at the fastest CPUs with the fastest memory you can afford. Those 700MHz xeons are likely using PC133 memory, which is painfully slow compared to newer memory that pumps data out at 4 to 8 times the rate of the older stuff. Maybe an SGI Altix could do this? Have you looked at them? They're not cheap, but they do look to be quite fast, and can scale to 64 CPUs if need be. Their interbox communication fabric is faster than most CPUs' front side busses.

Notably, the data is very time-sensitive, so the active dataset at any hour is almost certainly going to be more on the order of 5GB than 25GB (plus I'll want all the indexes in RAM of course). Also, and importantly, the load comes but one hour per week, so buying a Starfire isn't a real option, as it'd just sit idle the rest of the time. I'm particularly interested in keeping the cost down, as I'm a shareholder in the company!

Interesting. If you can't spread the load out, can you batch some parts of it? Or is the whole thing interactive, therefore needing to all be done in real time at once?

So what do I need? Whether you like it or not, you're gonna need heavy iron if you need to do this all in one hour once a week.

Can anyone who has (or has ever had) that kind of load in production offer any pointers, anecdotes, etc? Any theoretical musings also more than welcome. 
Comments upon my sanity will be referred to my doctor. If the best price/performance option is a second hand 32-cpu Alpha running VMS I'd be happy to go that way ;-) Actually, I've seen stuff like that going on Ebay pretty cheap lately. I saw a 64 CPU E10k (366 MHz CPUs) with 64 gigs ram and 20 hard drives going for $24,000 a month ago. Put Linux or BSD on it and Postgresql should fly. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Best tweak for fast results.. ?
On Tue, 26 Aug 2003, JM wrote: need input on parameter values on confs... our database is getting 1000 transactions/sec on peak periods.. sitting on RH 7.3 2.4.7-10smp RAM: 1028400 SWAP: 2040244

1: Upgrade your kernel. 2.4.7 on RH 7.3 was updated to 2.4.18-24 in March, and the 2.4.18 kernel is MUCH faster and has many bugs squashed.
2: Upgrade to the latest stable version of postgresql, 7.3.4.
3: Make sure your kernel's file-nr settings and shm settings are big enough to handle the load.
4: Edit the $PGDATA/postgresql.conf file to reflect all that extra cache you've got, etc.:
shared_buffers = 5000
sort_mem = 16384
effective_cache_size = (size of cache/buffer mem divided by 8192)
5: Look at moving WAL to its own spindle(s), as it is often the choke point when doing lots of transactions.
6: Look at using more drives in a RAID 1+0 array for the data (as well as a separate one for WAL if you can afford it.)
7: Make sure your drives are mounted noatime.
8: If you don't mind living dangerously, or the data can be reproduced from source files (i.e. catastrophic failure of your data set won't set you back), look at both mounting the drives async (the default for linux, slightly dangerous) and turning fsync off (quite dangerous; in case of crashed hardware / OS, you very well might lose data).

---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Hardware recommendations to scale to silly load
On 27 Aug 2003, matt wrote: You probably, more than anything, should look at some kind of superfast, external storage array Yeah, I think that's going to be a given. Low end EMC FibreChannel boxes can do around 20,000 IOs/sec, which is probably close to good enough. You mentioned using multiple RAID controllers as a boost - presumably the trick here is to split the various elements (WAL, tables, indexes) across different controllers using symlinks or suchlike? Can I feasibly split the DB tables across 5 or more controllers? I'm not sure I'd split the tables by hand right up front. Try getting as many hard drives as you can afford hooked up at once, and then try different ways of partitioning them. I'm guessing that making two fairly good sized 1+0 sets, one for data and one for WAL might be the best answer. Actually, I've seen stuff like that going on Ebay pretty cheap lately. I saw a 64 CPU E10k (366 MHz CPUs) with 64 gigs ram and 20 hard drives going for $24,000 a month ago. Put Linux or BSD on it and Postgresql should fly. Jeez, and I thought I was joking about the Starfire. Even Slowaris would be OK on one of them. The financial issue is that there's just not that much money in the micropayments game for bursty sales. If I was doing these loads *continuously* then I wouldn't be working, I'd be in the Maldives :-) $24,000 isn't that much for a server really, and if you can leverage this one sale to get more, then it would likely pay for itself over time. If you have problems keeping up with load, it will be harder to get more customers, so you kinda wanna do this as well as possible the first time. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] The results of my PostgreSQL/filesystem performance
On Thu, 28 Aug 2003, Sean Chittenden wrote: What it still leaves quite open is just what happens when the OS has more than one disk drive or CPU to play with. It's not clear what happens in such cases, whether FreeBSD would catch up, or be left further in the dust. The traditional propaganda has been that there are all sorts of reasons to expect PostgreSQL on FreeBSD to run a bit faster than on Linux; it is a bit unexpected for the opposite to seem true. Let me nip this in the bud before people run away with ideas that aren't correct. When the tests were performed in FreeBSD 5.1 and Linux, the hard drives were running UDMA. When running 4.8, for some reason his drives settled in on PIO mode:

ad0s1a: UDMA ICRC error writing fsbn 1458368 of 729184-729215 (ad0s1 bn 1458368; cn 241 tn 12 sn 44) retrying
ad0s1a: UDMA ICRC error writing fsbn 1458368 of 729184-729215 (ad0s1 bn 1458368; cn 241 tn 12 sn 44) retrying
ad0s1a: UDMA ICRC error writing fsbn 1458368 of 729184-729215 (ad0s1 bn 1458368; cn 241 tn 12 sn 44) retrying
ad0s1a: UDMA ICRC error writing fsbn 1458368 of 729184-729215 (ad0s1 bn 1458368; cn 241 tn 12 sn 44) falling back to PIO mode

The benchmarks were hardly conclusive, as UDMA runs vastly faster than PIO. Until we hear back as to whether cables were jarred loose between the tests or something else changed, I'd hardly consider these conclusive tests given PIO/UDMA is apples to oranges in terms of speed, and I fully expect that FreeBSD 4.8 will perform at least as fast as 5.1 (5.x is still being unwound from Giant), and should outperform Linux as well if industry experience is any indicator.

Plus, in most real servers you're gonna be running SCSI, so it might be nice to see a test with a good SCSI controller (Symbios 875 is a nice choice) and a couple hard drives, one each for WAL and data. This would more closely resemble actual usage and there are likely to be fewer issues with things like UDMA versus PIO on SCSI.

---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match