On Sun, Feb 13, 2011 at 9:57 AM, Kevin Grittner <[email protected]> wrote:
> Keith Gabryelski <[email protected]> wrote:
>
> > here is an example of one process's growth over time --
> >
> > USER       PID %CPU %MEM    VSZ    RSS
> > postgres 20533  0.0  0.3 647388  52216
> > [...]
> > postgres 20533  0.0  0.9 663532 144328
>
> Let's get the non-problem out of the way first. RSS includes shared
> memory. Each process will basically be showing how much of the same
> 256MB shared memory segment it has touched, which will tend to
> increase over time without having *anything* to do with overall
> memory usage. Ignore this. It is not a problem.

Yes, RSS includes shared memory -- the numbers I showed may not map
exactly onto total memory used, but we can see from historical graphs
of used memory, and from the machine crashing when used memory reached
16GB, that this is certainly a problem. Please correct me if I'm
missing something, but I can see the postgres processes growing and
used memory growing until I reset the connections by hand. Is there
any way I can control a postgres process's use of memory?

> > the server is a 16GB ram, 4-processor x64 bit centos machine
> >
> > there can be up to (about) 750 connections to the machine
> >
> > work_mem = 256MB
>
> Now this, on the other hand, is a potential problem. Each
> connection which is running a query might reserve one or more
> work_mem allocations. 256MB * 750 = 187.5GB. You have 16GB.
> Now you describe your workload as heavy inserts, so perhaps this
> isn't (currently) happening to you, but it's certainly something you
> want to watch.

There are a small number of clients that connect and run interesting
queries -- they do not amount to a potential problem. I'm not worried
about work_mem for the insert clients (which seem to be the problem)
because work_mem is not used during these simple inserts.

> You didn't describe your storage environment, but let's assume that
> your effective_io_concurrency is on target. The benchmarking I've
> done of throughput and latency (response time) has shown best
> performance at about ((2 * cores) + effective spindle count). You
> have four cores and (apparently) four "effective spindles" (which is
> a complex topic in itself). So, if your environment performs like
> mine, you will see your best performance if you funnel those 750
> client-side connections down to about 12 database connections, with
> requests queued by the pooler when all 12 connections are busy.
> With tens of thousands of concurrent clients hitting our web site,
> we were able to improve throughput and response time by cutting our
> connection pool from 60 connections to 30. (Of course, this is on a
> much larger system than you describe.)

Interesting. Can you point me to something that will help me
understand this performance behavior and how I can apply it to my
situation? (I've also put a rough sketch of how I read the pooling
suggestion in a P.S. below -- please correct me if that's off.)

> General comments on your config:
>
> > max_connections = 1000
>
> Far too high; you need to use your connection pooler better, or use
> a better connection pooler.
>
> > shared_buffers = 512MB
>
> Not insane, but possibly a little on the low side.

Probably -- but it's difficult to understand what this number actually
does for my system and what influence it has on postgres.

> > maintenance_work_mem = 1024MB
>
> OK
>
> > max_stack_depth = 9MB
>
> I've never adjusted this. I'm a bit curious why you did.

I followed the instructions: the value of ulimit -s minus 1MB. It
probably doesn't matter.
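(Concretely, that arithmetic was just the following -- a sketch which
assumes, for illustration, a stock stack limit of 10240 kB; the real
input is whatever ulimit -s reports on the server.)

    $ ulimit -s
    10240
    # 10240 kB is 10MB; subtracting the recommended 1MB safety margin
    # gives the value used in postgresql.conf:
    max_stack_depth = 9MB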
> > synchronous_commit = off
>
> So you're OK with not necessarily having all transactions which were
> successfully committed represented in the database if there is a
> crash? (That's not rhetorical -- for some applications that's OK;
> others, not so much.)

Yes, I'm OK with this.

> > commit_delay = 10
> > commit_siblings = 2
>
> Have you confirmed, through testing with your real workload, that
> these settings are helping? (I can see where they might, but
> sometimes people adjust these without testing and actually make
> things worse.)

Yes, I've tested this.

> > effective_cache_size = 1024MB
>
> On a 16GB machine, that's probably too low to get the best plans on
> some complex queries. I'd probably be using something in the
> neighborhood of 14GB. For the insert load it won't make any
> difference; but when it comes to querying all that data, it might.

I'll consider this, thank you.

> One setting you didn't override which almost certainly would help
> your insert performance is wal_buffers. Try setting that to 16MB.

Good point, thank you.

> -Kevin

If I understand your position, though, it is: reduce the number of
concurrent connections to increase the overall throughput of inserts
on the table (because there will be fewer collisions on table
locks??). That reduction in connections will also reduce the maximum
total memory they can use -- but that is still just postponing the
inevitable, right? If postgres processes can grow larger than total
memory, I'll eventually have the same problem with 30 postgres
processes as I do with 750.

Thank you for your response; any pointers to help me educate myself on
these things are greatly appreciated (a book? google fu? web links?).

Pax, Keith
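P.S. Here is how I read the pooling suggestion -- a minimal sketch
only, assuming pgbouncer sits in front of the database; the host,
port, database name, and auth file path are placeholders, and the pool
size of 12 is the ((2 * cores) + spindles) figure from above:

    ; /etc/pgbouncer/pgbouncer.ini (hypothetical names and paths)
    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    pool_mode = transaction
    ; the ~750 client-side connections connect here ...
    max_client_conn = 1000
    ; ... and get funneled into about a dozen real database connections
    default_pool_size = 12

The insert clients would then point at port 6432 instead of 5432, and
excess requests would queue in the pooler rather than each holding a
backend.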
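P.P.S. For my own notes, the postgresql.conf changes discussed in this
thread amount to roughly the following (a sketch of intent only; none
of these values have been re-benchmarked yet):

    # postgresql.conf -- tentative changes collected from this thread
    effective_cache_size = 14GB   # was 1024MB; suggested for a 16GB machine
    wal_buffers = 16MB            # was the default; suggested for the insert load
    # max_connections = 1000 should come down once the pooler is in place,
    # and shared_buffers = 512MB may go up somewhat.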
