Hello all: I'm seeing some very odd query behavior on postgres 7.1.3. I know that's way out of date, and I do have a plan in place to upgrade, but any immediate help will be, well, very helpful.
I have a server running two instances of postgres: 7.1.3 and 7.4.3. Each supports a different web application; the web apps are served from a separate middleware server. Formerly both web apps ran on a single all-in-one server, on postgres 7.1.3. Since reconfiguring the servers (one web, one database) and deploying on the two-postgres config, performance of the web app running on the older pg has been terrible, despite the new db-only machine being superficially "better" (in processor and RAM) than the original.

I have enabled pretty extensive query logging on the 7.1.3 instance. What I see is that certain queries take hideously long. What is odd is that these are perfectly well-optimized queries: they represent updates to a table of web application sessions. Each user's session is written once per web page view, so the query gets run frequently. Now, this session update is not, I think, the only query that's taking long, but there are so many of these queries that they are heavily represented among the slow ones (which I'm determining somewhat laboriously by running the huge logs through an awk script -- a rough sketch of that pass is below, after the query stats).

Here's an example EXPLAIN:

datapoint=# explain UPDATE nsse_session SET expiration = 1093541296, value = 'sessIdUser|s:5:\"31991\";sessPriv|s:1:\"u\";se\
datapoint'# ssCustId|s:5:\"14688\";sessSchoolId|s:5:\"14781\";sessUserName|s:6:\"gcruse\";sessImgVersion|i:2;sessAccessPrivs|N;sessAccessSchools|a:1:{i:14781;s:\
datapoint'# 5:\"14781\";}sessSort|a:0:{}!sessBack|sessDPVersion|s:4:\"full\";sessDataSource|s:4:\"live\";sessUserMeasures|s:14:\"1948,1913,1703\";sessUserGroups\
datapoint'# |s:84:\"3074,3113,3144,3179,3223,3263,5968,5997,6011,6027,6035,6046,6063,6076,6087,6105,6116\";!sessGroupFilter|' WHERE id_session = '955af0272896ba\
datapoint'# a67956494dcb30a5fe' AND expiration > 1093441296;
NOTICE:  QUERY PLAN:

Index Scan using nsse_session_pkey on nsse_session  (cost=0.00..8.16 rows=3 width=50)

EXPLAIN

So, an index scan, as it should be. When I check the postgres log for the query stats, I find this:

QUERY STATISTICS
! system usage stats:
!       0.012477 elapsed 0.000000 user 0.000000 system sec
!       [0.070000 user 0.000000 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       3/1 [1044/309] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:      0 read,      0 written, buffer hit rate = 100.00%
!       Local  blocks:      0 read,      0 written, buffer hit rate = 0.00%
!       Direct blocks:      0 read,      0 written

QUERY STATISTICS
! system usage stats:
!       1878.708691 elapsed 194.000000 user 1.180000 system sec
!       [194.020000 user 1.190000 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       36441/3233 [37108/3496] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:  16863 read,    125 written, buffer hit rate = 99.99%
!       Local  blocks:      0 read,      0 written, buffer hit rate = 0.00%
!       Direct blocks:      0 read,      0 written

Two sets of stats, of which the first looks OK and the second is terrible. Virtually all of the long queries I've seen so far show this pattern: a double set of stats, with the second one looking like bad news. The second set shows a very elevated level of page faults. (What are these page faults against? They don't seem to be against the pg shared buffers -- are they against the OS disk cache?)
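For what it's worth, the log mining is roughly the awk pass sketched here. This is only a sketch: it assumes statements show up on lines containing "query:" (debug_print_query) shortly before their QUERY STATISTICS block, and that the elapsed line looks like the "!  1878.708691 elapsed ..." lines above; with many backends writing to one log, the query/stats pairing is only approximate.

    # Sketch: print statements whose stats block reports more than
    # <threshold> seconds of elapsed time, paired with the most recently
    # logged statement. Adjust the patterns to the actual log layout.
    awk -v threshold=5 '
        /query:/ {
            # remember the most recent statement, truncated for readability
            q = substr($0, 1, 200)
        }
        $1 == "!" && $3 == "elapsed" && $2 + 0 > threshold {
            printf "%12.6f  %s\n", $2, q
        }
    ' postgres.log | sort -rn | head -20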
I did hear a suggestion that a few very expensive queries might be blocking others (if they're writing), and that I see the effects mostly on these session updates because they're so frequent. I guess that's possible, but I would like to know more about what the above stats might mean. Some of these session writes are fairly heavy -- we carry a lot of data in the session, often multiple tens of K of text going into one column. Is this a stats problem, a symptom of insufficient vacuuming?

Relevant params: postgres 7.1.3 running on RedHat ES 2.1. SHMMAX set to 800000000. (Physical RAM is 3 GB.) Shared buffers for this install = 20000, sort_mem is 8 MB, max_connections = 50. (I don't think it's pure concurrent load; it happened during a user training session with only about 30 users.) The other pg install on the box is similarly configured.

Just to complete the data dump, here's the output of ipcs:

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x0052e6a9 3538944    postgres   600        167649280  2
0x0052e2c1 3473409    postgres   600        170762240  2

------ Semaphore Arrays --------
key        semid      owner      perms      nsems      status
0x0052e6a9 11567104   postgres   600        17
0x0052e6aa 11599873   postgres   600        17
0x0052e6ab 11632642   postgres   600        17
0x0052e6ac 11665411   postgres   600        17
0x0052e2c1 10878980   postgres   600        17
0x0052e2c2 10911749   postgres   600        17
0x0052e2c3 10944518   postgres   600        17
0x0052e2c4 10977295   postgres   600        17
0x0052e2c5 11010064   postgres   600        17
0x0052e2c6 11042835   postgres   600        17
0x0052e2c7 11075604   postgres   600        17
0x0052e2c8 11108373   postgres   600        17
0x0052e2c9 11141142   postgres   600        17
0x0052e2ca 11173911   postgres   600        17
0x0052e2cb 11206680   postgres   600        17
0x0052e2cc 11239449   postgres   600        17
0x0052e2cd 11272218   postgres   600        17
0x0052e2ce 11304987   postgres   600        17
0x0052e2cf 11337756   postgres   600        17
0x0052e2d0 11370525   postgres   600        17
0x0052e2d1 11403294   postgres   600        17

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

Sorry, I know that's a ton of data, but the problem is urgent and I'm very perplexed. For comparison, on the original unified server I had max_connections of 256 (as opposed to 50 on the new one -- could this be the problem?), sort_mem of 6000, and shared buffers of 20000. (Overall, shared buffers have doubled on the new machine because I've devoted that much to each instance.)

-- sgl
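P.S. In case the answer really is "insufficient vacuuming": 7.1 has no autovacuum, so what I'm considering is simply cron'ing a frequent vacuum of that one heavily-updated table, along the lines of the sketch below (table and database names are the ones from the query above; how often to run it is just a guess at this point).

    # Sketch: frequent VACUUM ANALYZE of the session table, run from cron.
    # Plain VACUUM is what reclaims the dead rows left behind by each
    # per-pageview UPDATE on 7.1.
    psql -d datapoint -c 'VACUUM ANALYZE nsse_session;'

    # VACUUM VERBOSE reports how many dead tuples it finds, which should
    # help confirm or rule out bloat as the culprit:
    psql -d datapoint -c 'VACUUM VERBOSE ANALYZE nsse_session;'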