[PERFORM] Why is my stats collector so busy?
On an 8-processor system, my stats collector is always at 100% CPU, while disk I/O is very low. We have many databases and they are accessed frequently. Sometimes there are big table updates, but most of the time only simple queries are run against the databases, returning just a few records. Out of the maximum possible system load of 8.0, the average load is always above 1.1, and of that, 1.0 is the stats collector and 0.1 is the rest of the system. If I restart the PostgreSQL server, the stats collector uses 0% CPU for about 10 minutes, then goes back up to 100%. Is there a way to tell why it is working so much? I asked about this problem some months ago on a different mailing list. I was asked to provide tracebacks of the stats collector, but due to a bug in the FreeBSD ppid() function, I'm not able to trace it. Thank you, Laszlo -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Why is my stats collector so busy?
Laszlo Nagy wrote: On an 8-processor system, my stats collector is always at 100% CPU, while disk I/O is very low. We have many databases and they are accessed frequently. Sometimes there are big table updates, but most of the time only simple queries are run against the databases, returning just a few records. Out of the maximum possible system load of 8.0, the average load is always above 1.1, and of that, 1.0 is the stats collector and 0.1 is the rest of the system. If I restart the PostgreSQL server, the stats collector uses 0% CPU for about 10 minutes, then goes back up to 100%. Is there a way to tell why it is working so much? I asked about this problem some months ago on a different mailing list. I was asked to provide tracebacks of the stats collector, but due to a bug in the FreeBSD ppid() function, I'm not able to trace it. What version of Postgres are you using? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [PERFORM] Why is my stats collector so busy?
Laszlo Nagy gand...@shopzeus.com writes: On an 8-processor system, my stats collector is always at 100% CPU. What platform? What Postgres version? regards, tom lane
Re: [PERFORM] Why is my stats collector so busy?
On Fri, Jun 5, 2009 at 9:38 AM, Bruce Momjian br...@momjian.us wrote: Laszlo Nagy wrote: On an 8-processor system, my stats collector is always at 100% CPU, while disk I/O is very low. We have many databases and they are accessed frequently. Sometimes there are big table updates, but most of the time only simple queries are run against the databases, returning just a few records. Out of the maximum possible system load of 8.0, the average load is always above 1.1, and of that, 1.0 is the stats collector and 0.1 is the rest of the system. If I restart the PostgreSQL server, the stats collector uses 0% CPU for about 10 minutes, then goes back up to 100%. Is there a way to tell why it is working so much? I asked about this problem some months ago on a different mailing list. I was asked to provide tracebacks of the stats collector, but due to a bug in the FreeBSD ppid() function, I'm not able to trace it. What version of Postgres are you using? A little context here. The stats collector is really version-dependent... it gets tweaked in just about every version of Postgres... it is more or less unrecognizable since the 8.0 version of PostgreSQL, where I would simply turn it off and run ANALYZE myself. Be prepared for the advice to consider upgrading to help deal with this issue; 8.4 in fact has some enhancements that will help with situations like this. merlin
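For reference, the "turn it off" approach Merlin describes for the 8.0 era was a postgresql.conf setting; the parameter names changed across releases, so treat this as a version-dependent sketch rather than a recipe:

```
# postgresql.conf -- parameter names vary by major version
stats_start_collector = off   # 8.0 - 8.2; the setting was removed in 8.3
#track_counts = off           # 8.3+: disables row-level stats (note: autovacuum requires it on)
#track_activities = off       # 8.3+: disables current-query reporting
```

With the collector off you have to schedule ANALYZE (and, pre-autovacuum, VACUUM) yourself, which is exactly the trade-off Merlin mentions.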
Re: [PERFORM] Scalability in postgres
Mark Mielke m...@mark.mielke.cc wrote: Kevin Grittner wrote: James Mansion ja...@mansionfamily.plus.com wrote: Kevin Grittner wrote: Sure, but the architecture of those products is based around all the work being done by engines which try to establish affinity to different CPUs, and loop through the various tasks to be done. You don't get a context switch storm because you normally have the number of engines set at or below the number of CPUs. This is just misleading at best. What part? Last I checked, Sybase ASE and SQL Server worked as I described. Those are the products I was describing. Or is it misleading to say that you aren't likely to get a context switch storm if you keep your active thread count at or below the number of CPUs? A context switch storm is about how the application and runtime implement concurrent access to shared resources, not about the potentials of the operating system. I'm really not following how that's responsive to my questions or points, at all. You're making pretty basic and obvious points about other ways to avoid the problem, but the fact is that the other databases people point to as examples of handling large numbers of connections have (so far at least) been ones which solve the problems in other ways than what people seem to be proposing. That doesn't mean that the techniques used by these other products are the only way to solve the issue, or even that they are the best ways; but it does mean that pointing to those other products doesn't prove anything relative to what lock optimization is likely to buy us. For example, if threads all spin every time a condition or event is raised, then yes, a context storm probably occurs if there are thousands of threads. But it doesn't have to work that way. At its very simplest, this is the difference between "wake one thread" (which is then responsible for waking the next thread) vs. "wake all threads". This isn't necessarily the best solution - but it is one alternative.
Other solutions might involve waking the *right* thread. For example, if I know that a particular thread is waiting on my change and it has the highest priority - perhaps I only need to wake that one thread. Or, if I know that 10 threads are waiting on my results and can act on it, I only need to wake those specific 10 threads. Any system which actually wakes all threads will probably exhibit scaling limitations. I would be surprised if any of this is not obvious to all on the list. I'm sorry, but (in particular) UNIX systems have routinely managed large numbers of runnable processes where the run queue lengths are long without such an issue. Well, the OP is looking at tens of thousands of connections. If we have a process per connection, how many tens of thousands can we handle before we get into problems with exhausting possible pid numbers (if nothing else)? This depends on whether pid numbers are 16-bit or 32-bit. I believe Linux supports 32-bit pid numbers, although I'm not up to date on what the default configurations are for all systems in use today. In particular, Linux 2.6 added the O(1) task scheduler, with the express requirement of supporting hundreds of thousands of (mostly idle) threads. The support exists. Is it activated or in proper use? I don't know. Interesting. I'm running the latest SuSE Enterprise on a 64-bit system with 128 GB RAM and 16 CPUs, yet my pids and port numbers are 16-bit. Since I only use a tiny fraction of the available numbers using current techniques, I don't need to look at this yet, but I'll keep it in mind. I know that if you do use a large number of threads, you have to be pretty adaptive. In our Java app that pulls data from 72 sources and replicates it to eight, plus feeding it to filters which determine which publishers for interfaces might be interested, the Sun JVM does very poorly, but the IBM JVM handles it nicely.
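On Linux, the pid ceiling being debated here is easy to check; the /proc path below is Linux-specific:

```shell
# Default pid_max is 32768 (the historical short-pid limit); 64-bit kernels
# accept values up to 2^22, so hundreds of thousands of processes are possible.
cat /proc/sys/kernel/pid_max

# To raise it (as root), something like:
#   sysctl -w kernel.pid_max=4194304
```

So pid exhaustion is configurable away on modern Linux; whether tens of thousands of backends perform well is a separate question.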
It seems they use very different techniques for the monitors on objects which synchronize the activity of the threads, and the IBM technique does well when no one monitor is dealing with a very large number of blocking threads. They got complaints from people who had thousands of threads blocking on one monitor, so they now keep a count and switch techniques for an individual monitor if the count gets too high. Could be, and if so then the Sun JVM should really address the problem. I wish they would. However, having thousands of threads waiting on one monitor probably isn't a scalable solution, regardless of whether the JVM is able to optimize around your usage pattern or not. Why have thousands of threads waiting on one monitor? That's a bit insane. :-) Agreed. We weren't the ones complaining to IBM. :-) Perhaps something like that (or some other new approach) might mitigate the effects of tens of thousands of processes competing for a few resources, but it
Re: [PERFORM] Scalability in postgres
Scott Carey sc...@richrelevance.com wrote: If you wake up 10,000 threads, and they all can get significant work done before yielding no matter what order they run, the system will scale extremely well. But with roughly twice the average response time you would get by throttling active requests to the minimum needed to keep all resources busy. (Admittedly a hard point to find with precision.) I would think that the 4 or 5 most important locks or concurrency coordination points in Postgres have very specific, unique properties. Given the wide variety of uses I'd be cautious about such assumptions. In particular, these are interesting references (not only for Java): With this wealth of opinion, perhaps they can soon approach IBM's JVM in their ability to support a large number of threads. I'm rooting for them. -Kevin
Re: [PERFORM] Scalability in postgres
On Fri, Jun 5, 2009 at 12:33 AM, da...@lang.hm wrote: On Fri, 5 Jun 2009, Greg Smith wrote: On Thu, 4 Jun 2009, Robert Haas wrote: That's because this thread has altogether too much theory and altogether too little gprof. But running benchmarks and profiling is actual work; that's so much less fun than just speculating about what's going on! This thread reminds me of Jignesh's "Proposal of tunable fix for scalability of 8.4" thread from March, except with only a fraction of the real-world detail. There are multiple high-profile locks causing scalability concerns at quadruple-digit user counts in the PostgreSQL code base; finding them is easy. Shoot, I know exactly where a couple are, and I didn't have to think about it at all--just talking with Jignesh a couple of times led me right to them. Fixing them without causing regressions in low client count cases, now that's the hard part. No amount of theoretical discussion advances any of that until you're at least staring at a very specific locking problem you've already characterized extensively via profiling. And even then, profiling trumps theory every time. This is why I stay out of these discussions and work on boring benchmark tools instead. Actually, as I see it, we are a step before that. It seems that people are arguing that there is no need to look for and fix this sort of thing, on the basis that anyone who trips over these problems is doing something wrong to start with and needs to change the behavior of their app. I have a slightly different take on that. I don't think there's actually resistance to improving this situation if someone (or some group of people) comes up with a good proposal for doing it and writes a patch and tests it and shows that it helps that case without hurting other cases that people care about. And there is clearly great willingness to tell people what to do until that happens: use connection pooling.
But if you come back and say, well, I shouldn't have to use connection pooling because it should work without connection pooling, well, OK, but... ...Robert
Re: [PERFORM] Scalability in postgres
Greg Smith wrote: No amount of theoretical discussion advances any of that until you're at least staring at a very specific locking problem you've already characterized extensively via profiling. And even then, profiling trumps theory every time. In theory, there is no difference between theory and practice. In practice, there is a great deal of difference. Craig
Re: [PERFORM] Scalability in postgres
On Fri, 5 Jun 2009, Mark Mielke wrote: I disagree that profiling trumps theory every time. That's an interesting theory. Unfortunately, profiling shows it doesn't work that way. Let's see if I can summarize the state of things a bit better here:
1) PostgreSQL stops working as efficiently with 1000 active connections
2) Profiling suggests the first barrier that needs to be resolved to fix that is how the snapshots needed to support MVCC are derived
3) There are multiple patches around that aim to improve that specific situation, but they are only being tested aggressively by one contributor so far (that I'm aware of)
4) Those patches might cause a regression for other workloads, and the section of code involved was very hard to get working well initially. Before any change here will be accepted, there needs to be a lot of data proving it both does what is expected and doesn't introduce a regression.
5) Few people are motivated to get their hands dirty doing the boring benchmarking work to resolve this specific problem because "use a connection pool" is quite a good workaround
6) Many other database vendors admit this problem is so hard to solve that they also just suggest using a connection pool
If anyone wants to work on improving things here, (4) is the sticking point that could use more hands. Not much theory involved, but there is a whole lot of profiling. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
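The "use a connection pool" workaround in (5) usually means something like pgbouncer in front of the server; a minimal illustrative pgbouncer.ini (database name, sizes, and paths are made up for the example):

```ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_port = 6432
auth_type = md5
auth_file = userlist.txt
pool_mode = transaction    ; a server connection is reused per transaction
default_pool_size = 20     ; actual backends, sized near the CPU/disk capacity
max_client_conn = 10000    ; thousands of clients share those few backends
```

The point of the design is that the thousands of mostly-idle client connections never become PostgreSQL backends, so the locking hot spots in (2) are simply never stressed.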
Re: [PERFORM] Scalability in postgres
On Thu, 4 Jun 2009, Mark Mielke wrote: At its very simplest, this is the difference between "wake one thread" (which is then responsible for waking the next thread) vs. "wake all threads". Any system which actually wakes all threads will probably exhibit scaling limitations. The prototype patch we got from Jignesh improved his specific workload by waking more waiting processes than are notified in the current code. The bottleneck that's been best examined so far at high client counts is not caused by too much waking; it's caused by not enough. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] Why is my stats collector so busy?
On Fri, Jun 5, 2009 at 9:38 AM, Bruce Momjian br...@momjian.us wrote: Laszlo Nagy wrote: On an 8-processor system, my stats collector is always at 100% CPU, while disk I/O is very low. We have many databases and they are accessed frequently. Sometimes there are big table updates, but most of the time only simple queries are run against the databases, returning just a few records. Out of the maximum possible system load of 8.0, the average load is always above 1.1, and of that, 1.0 is the stats collector and 0.1 is the rest of the system. If I restart the PostgreSQL server, the stats collector uses 0% CPU for about 10 minutes, then goes back up to 100%. Is there a way to tell why it is working so much? I asked about this problem some months ago on a different mailing list. I was asked to provide tracebacks of the stats collector, but due to a bug in the FreeBSD ppid() function, I'm not able to trace it. I've been having the same problem for several months now. I posted something to the novice list back in January, but it never really went anywhere, so I dropped it. Formalities... v8.3.7 build 1400, Windows XP 64-bit, two Opteron 2218s. This is my personal database. It runs a single database locally on my box and I'm the only person that ever accesses it. From a fresh start of the server I get one postgres process that will run at 100% of a CPU with no I/O essentially forever. If I use Process Explorer to identify the process and attach the debugger, it will terminate and then restart with another process id. When I saw the previous post I looked at the process a bit closer, and below is what Process Explorer lists for the problem process: \BaseNamedObjects\pgident(3432): postgres: stats collector process What I have resorted to is just suspending this process so it's not wasting one of my CPUs, and everything seems to be working fine. I realize this is just a bandage, but it works for me.
I'm just a novice, so if anyone has suggestions on what I can do to provide more information to try and track this down, I'd appreciate it. I figured it was just something I had screwed up, but now that someone else is seeing the same problem I know it's not just mine. Bob
Re: [PERFORM] GiST index performance
On Thu, Jun 4, 2009 at 12:33 PM, Matthew Wakeling matt...@flymine.org wrote: Do you have a recommendation for how to go about profiling Postgres, what profiler to use, etc? I'm running on Debian Linux x86_64. I mostly compile with --enable-profiling and use gprof. I know Tom Lane has had success with oprofile for quick-and-dirty measurements, but I haven't quite figured out how to make all the bits work for that yet. ...Robert
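A sketch of the gprof workflow Robert describes (paths and the backend pid are illustrative; when built with --enable-profiling, each backend writes a gmon.out under $PGDATA/gprof/<pid>/ as it exits):

```
./configure --enable-profiling
make && make install

# ...run the workload, then disconnect so the backend exits
# and flushes its profile data...

gprof /usr/local/pgsql/bin/postgres "$PGDATA"/gprof/12345/gmon.out > profile.txt
head -40 profile.txt    # flat profile: heaviest functions first
```

The per-backend output is why you want a single dedicated connection for the workload you are measuring; otherwise the profile you care about is scattered across many pid directories.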
Re: [PERFORM] Scalability in postgres
On Fri, Jun 5, 2009 at 1:02 PM, Greg Smith gsm...@gregsmith.com wrote: On Fri, 5 Jun 2009, Mark Mielke wrote: I disagree that profiling trumps theory every time. That's an interesting theory. Unfortunately, profiling shows it doesn't work that way. I had a laugh when I read this, but I can see someone being offended by it. Hopefully no one took it that way. Let's see if I can summarize the state of things a bit better here: 1) PostgreSQL stops working as efficiently with 1000 active connections 2) Profiling suggests the first barrier that needs to be resolved to fix that is how the snapshots needed to support MVCC are derived 3) There are multiple patches around that aim to improve that specific situation, but only being tested aggressively by one contributor so far (that I'm aware of) I am actually aware of only two forays into this area that have been reduced to code. I am pretty much convinced that Jignesh's wake-all-waiters patch is fundamentally - dare I say theoretically - unsound, however much it may improve performance for his particular workload. The other is Simon's patch which, AIUI, is a fast path for the case where nothing has changed. Are you aware of any others? Thanks for the summary. ...Robert
Re: [PERFORM] Bad Plan for Questionnaire-Type Query
On Mon, May 25, 2009 at 11:22 AM, David Blewett da...@dawninglight.net wrote: On Sun, May 24, 2009 at 2:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: It still feels like this schema design is obscuring correlations that the planner needs to know about in order to make decent estimates. I'm not sure how to make the planner aware of these correlations. Is there something inherently flawed with this design? It seems pretty close to the one on the Varlena website [1]. You mentioned earlier that the seemingly unrelated question_ids were linked via a common submission_id. I wonder whether it's possible to query using the submission_id instead? Well, I do join the different response tables [text/date/etc] together via the submission_id. However, in order to be able to apply the where clauses appropriately, I have to limit the responses to the appropriate question_ids. Would it matter to push that requirement down to the where clause instead of making it part of the join clause? David 1. http://www.varlena.com/GeneralBits/110.php Anyone have thoughts on this? Bueller? David
Re: [PERFORM] Bad Plan for Questionnaire-Type Query
David, My first thought would be to increase statistics dramatically on the filtered columns in hopes of making PG realize there are a lot of rows there; it's off by 8x. Correlation stats are an ongoing issue in PostgreSQL. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Re: [PERFORM] Bad Plan for Questionnaire-Type Query
On Fri, Jun 5, 2009 at 7:32 PM, Josh Berkus j...@agliodbs.com wrote: My first thought would be to increase statistics dramatically on the filtered columns in hopes of making PG realize there are a lot of rows there; it's off by 8x. Correlation stats are an ongoing issue in PostgreSQL. I started at a stats_target of 250, then tried 500, and finally the plan that I pasted before resorting to disabling nestloops was at 1000 (re-analyzing in between, of course). Will a CLUSTER or REINDEX help at all? David
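For reference, the knob being discussed can be raised per-column rather than globally; a sketch (the table and column names are hypothetical, and 1000 is the upper limit in the 8.3 era):

```sql
-- Raise the sample size only for the columns the planner misestimates,
-- then re-gather statistics for the table.
ALTER TABLE response ALTER COLUMN question_id SET STATISTICS 1000;
ANALYZE response;
```

Per-column targets keep ANALYZE and planning cheap for the columns that were already estimated well.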
Re: [PERFORM] degenerate performance on one server of 3
See ALTER TABLE and CREATE TABLE (and the index variants). ALTER TABLE foo SET (fillfactor=90); I'll try that. This will leave, on average, 10% of every 8k block empty and make it more likely that updates to columns stay within the same block. Good for the items table. Probably bad for the cookies table, with 6 million rows and thousands of inserts and deletes every day, but few updates. Maybe I should have another way of doing it. That table gets bloated fast. A vacuum full takes three and a half hours - which would be an unacceptable amount of downtime if I didn't have working mirrors of everything. Creating a new table as a select from the old and renaming, OR doing a CLUSTER and REINDEX, is almost always faster than VACUUM FULL for such large tables. But there are different implications for how long other queries are locked out of access to the table. CLUSTER will generally lock out other queries for a long time, but the end result (especially combined with a reasonable fillfactor setting) ends up best for long-term performance and reduction in bloat. I'll try it on the other mirror server, which has the same specs and size, and see if CLUSTER/REINDEX is faster. - copy all rows to a new table - lock for a millisecond while renaming tables - drop the old table. Locking a whole table for a very long time is scary for admins. You can do the above manually in a single transaction; however, any updates or inserts during that time may be lost. Postgres can keep multiple row versions around for transactions, so for a lockless vacuum full to work, some row versions would have to be in the new table. I think that could be done at the expense of some performance degradation, as you'd have to figure out which table to look at (for reads... nothing in the new one, OK, then the old one; for copies... there's an update there, put the copy under it), some wacky logic like that. I don't know Postgres's internals well enough to do it for all cases, but I know my own DB well enough to get it to work for me.
Have two tables with triggered timestamps, then juggle the queries that hit the tables (check table a and table b, use the row with the newer timestamp for reads; meanwhile a is copying to b, but not overwriting newer rows... something like that). Not sure whether I'd rather have a 7-hour performance-degraded table copy (which would reindex and recluster too) or a 3.5-hour table-locked vacuum (which doesn't reindex or re-cluster).
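The copy-and-rename approach sketched above looks roughly like this in SQL (table names are illustrative, and as noted, writes arriving during the copy are lost):

```sql
-- One-transaction table swap; faster than VACUUM FULL for a badly bloated table.
BEGIN;
CREATE TABLE cookies_new (LIKE cookies INCLUDING DEFAULTS);
INSERT INTO cookies_new SELECT * FROM cookies;
ALTER TABLE cookies RENAME TO cookies_old;
ALTER TABLE cookies_new RENAME TO cookies;
COMMIT;
-- Recreate indexes and constraints on the new table, then:
DROP TABLE cookies_old;
```

The rename itself only holds an exclusive lock for an instant; the cost is that concurrent writers either block or write into the doomed old table, which is exactly the consistency problem the two-table/timestamp juggling tries to paper over.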
Re: [PERFORM] Bad Plan for Questionnaire-Type Query
On Fri, Jun 5, 2009 at 8:29 PM, David Blewett da...@dawninglight.net wrote: On Fri, Jun 5, 2009 at 7:32 PM, Josh Berkus j...@agliodbs.com wrote: My first thought would be to increase statistics dramatically on the filtered columns in hopes of making PG realize there are a lot of rows there; it's off by 8x. Correlation stats are an ongoing issue in PostgreSQL. I started at a stats_target of 250, then tried 500, and finally the plan that I pasted before resorting to disabling nestloops was at 1000 (re-analyzing in between, of course). Will a CLUSTER or REINDEX help at all? Probably not. Your problem is similar to the one Anne Rosset was complaining about on -performance a couple of days ago, though your case appears to be more complex. http://archives.postgresql.org/pgsql-performance/2009-06/msg00023.php It's really not clear what to do about this problem. In Anne's case, it would probably be enough to gather MCVs over the product space of her folder_id and is_deleted columns, but I'm not certain that would help you. It almost seems like we need a way to say "for every distinct value that appears in column X, you need to gather separate statistics for the other columns of the table". But that could make statistics gathering and query planning very expensive. Another angle of attack, which we've talked about before, is to teach the executor that when a nestloop with a hash-joinable condition executes too many times, it should hash the inner side on the next pass and then switch to a hash join. But none of this helps you very much right now... ...Robert
Re: [PERFORM] degenerate performance on one server of 3
On Thu, Jun 4, 2009 at 7:31 AM, Erik Aronesty e...@q32.com wrote: is there some way to view the level of bloat that needs full in each table, so i could write a script that alerts me to the need of a vacuum full without waiting for random queries to get slow? looking at the results of the bloat query, i still can't see how to know whether bloat is getting bad in an objective manner. One other thought on this... I think the main thing to consider is bloat as a percentage of table size. When you sequentially scan the table, a table with as much bloat as data will take twice as long to scan, one with twice as much bloat as data will take three times as long to scan, and so on. If you're only ever doing index scans, the effect will be less noticeable, but in round figures comparing the amount of bloat to the amount of data is a good place to start. I usually find 3x is about where the pain starts to hit. Also, small tables can sometimes tolerate a higher percentage of bloat than large ones, because those table scans tend to be fast anyway. A lot of times bloat happens at one particular time and just never goes away. Leaving an open transaction around for an hour or two can bloat all of your tables, and they'll never get de-bloated on their own without help. It would be nice if VACUUM had even a little bit of capability for incrementally improving this situation, but currently it doesn't. So when you mention running for a year, it's not unlikely that you had one bad day (or several days in a row) when you collected all of that bloat, rather than accumulating it gradually over time. ...Robert
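A cheap way to get the objective alert Erik asks for is the dead-row fraction per table from the stats collector (available in pg_stat_user_tables from 8.3 on; this isn't the same thing as on-disk bloat, since already-reclaimed-but-not-returned space doesn't show up, but it's a reasonable first alarm):

```sql
-- Dead-row fraction per table; the stats collector must be enabled.
SELECT relname, n_live_tup, n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1)
         AS pct_dead
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

A monitoring script can fire on pct_dead crossing a threshold, which lines up with the "bloat as a percentage of table size" rule of thumb above.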