On 12/08/2011 09:48 AM, Satoshi Nagayasu wrote:
> For examples, I've been working on investigating PostgreSQL LWLock behaviors
> precisely for a few weeks, and it could not be obtained within PostgreSQL
> itself, therefore, I picked up SystemTap. However, SystemTap could not be
> used in a production system, because it often kills the target processes. :(
> How can I observe LWLocks in the production system?

I decided about a year ago that further work on using SystemTap was a black hole: time goes in, nothing really usable on any production server seems to come out. It can be useful for collecting data in a developer context. But the problems people are most interested in all come down to "why is the production server doing this?", and as you've also discovered, the only reasonable answer so far doesn't involve SystemTap; it involves DTrace and either Solaris or FreeBSD (or Mac OS, for smaller server hardware deployments). Since those platforms are problematic to run database servers on in many cases, that doesn't help very much.

I'm planning to put that instrumentation into the database directly, which is what people with an Oracle background are asking for. There are two underlying low-level problems to solve before even starting that:

-How can the overhead of collecting the timing data be kept down? It's really high in some places. This is being worked out right now on pgsql-hackers; see "Timing overhead and Linux clock sources".

-How do you log the potentially large amount of data collected without killing server performance? Initial discussions are also happening right now; see "logging in high performance systems".
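
To give a rough feel for the first problem, here is a minimal sketch that estimates what a single clock read costs on a given machine. It is not code from the pgsql-hackers thread: the function name timing_overhead_ns is made up for illustration, and it uses Python's time.perf_counter_ns rather than anything the server calls. The interpreter's loop overhead is included in the result, so treat the number as an upper bound on the cost of the clock read itself.

```python
import time


def timing_overhead_ns(samples=100_000):
    """Estimate the average cost of one clock read, in nanoseconds.

    Hypothetical helper for illustration only. The figure includes
    Python's loop and call overhead, so it overstates the raw cost
    of the underlying clock source.
    """
    read = time.perf_counter_ns  # monotonic, nanosecond-resolution clock
    start = read()
    for _ in range(samples):
        read()  # the operation being measured
    end = read()
    return (end - start) / samples


if __name__ == "__main__":
    print(f"~{timing_overhead_ns():.0f} ns per clock read (upper bound)")
```

Running this on the same hardware under different clock sources should show whether timing every lock acquisition is affordable there. An operation that takes well under a microsecond cannot be timed cheaply if each clock read costs about as much.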

I feel this will increasingly be the top blocker for performance-sensitive deployments in the coming year; people used to having these tools in Oracle cannot imagine how they would operate without them. One of my big-picture goals is to have this available as a compile-time option starting in PostgreSQL 9.3 in 2013, piggybacked off the existing DTrace support. And the earlier the better--since many migrations have a long lead time, just knowing it's coming in the next version would be good enough for some people who are blocked right now to start working on theirs.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
