2012/10/19 23:48, Fujii Masao wrote:
On Wed, Oct 17, 2012 at 12:31 AM, Satoshi Nagayasu <sn...@uptime.jp> wrote:
2012/10/16 2:40, Jeff Janes wrote:
On Sun, Oct 14, 2012 at 9:43 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
Satoshi Nagayasu <sn...@uptime.jp> writes:
(2012/10/14 13:26), Fujii Masao wrote:
Tracing lwlock usage still seems to cause a small performance
overhead even if reporting is disabled. I believe some users would
prefer to avoid such overhead even if pg_stat_lwlocks is not available.
It should be up to a user to decide whether to trace lwlock usage,
e.g., by using a trace_lwlock parameter, I think.
Frankly speaking, I do not agree with disabling performance
instrumentation to improve performance. A DBA must *always* monitor
the performance metrics when running such a heavy workload.
This brings up a question that I don't think has been honestly
considered, which is exactly whom a feature like this is targeted at.
TBH I think it's of about zero use to DBAs (making the above argument
bogus). It is potentially of use to developers, but a DBA is unlikely
to be able to do anything about lwlock-level contention even if he has
the knowledge to interpret the data.
Waiting on BufFreelistLock suggests increasing shared_buffers.
Waiting on ProcArrayLock perhaps suggests use of a connection pooler
(or does it?)
WALWriteLock suggests doing something about IO, either moving logs to
different disks, or getting BBU, or something.
WALInsertLock suggests trying to adapt your data loading process so it
can take advantage of the bulk, or maybe increasing wal_buffers.
And a lot of waiting on any of the locks gives a piece of information
the DBA can use when asking the mailing lists for help, even if it
doesn't allow him to take unilateral action.
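As a rough sketch of the mapping above: the lock names and hints are
taken from the list just given (including its "perhaps" and "maybe"),
while the table name, the function names, and the shape of the input
are made up for illustration and are not part of PostgreSQL or of the
pg_stat_lwlocks patch.

```python
# Hypothetical lookup table; hints mirror the list above and are
# suggestions only, not authoritative tuning advice.
LWLOCK_HINTS = {
    "BufFreelistLock": "consider increasing shared_buffers",
    "ProcArrayLock": "perhaps use a connection pooler (uncertain)",
    "WALWriteLock": "look at WAL I/O: move logs to different disks or get a BBU",
    "WALInsertLock": "adapt the data loading process (bulk), or maybe increase wal_buffers",
}

# Fallback for locks with no known DBA-level action.
FALLBACK_HINT = "no direct action known; include the wait data when asking the mailing lists"


def hint_for(lock_name):
    """Return the hint for a lock name, or the fallback if it is unknown."""
    return LWLOCK_HINTS.get(lock_name, FALLBACK_HINT)


def top_hints(wait_counts, n=3):
    """Given {lock_name: wait_count}, return (name, count, hint) tuples
    for the n most-waited-on locks, highest count first."""
    ranked = sorted(wait_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, count, hint_for(name)) for name, count in ranked[:n]]
```

For example, top_hints({"WALWriteLock": 120, "SomeOtherLock": 5}, n=1)
would return only the WALWriteLock entry with its I/O hint, and an
unknown lock name falls back to the "ask the mailing lists" hint.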
So I feel it isn't something that should be turned on in production
builds. I'd vote for enabling it by a non-default configure option,
and making sure that it doesn't introduce any overhead when the option
is off.
I think hackers would benefit from getting reports from DBAs in the
field with concrete data on bottlenecks.
If the only way to get this is to do some non-standard compile and
deploy it to production, or to create a "benchmarking" copy of the
production database system including a realistic work-load driver and
run the non-standard compile there; either of those is going to
dramatically cut down on the participation.
Agreed.
The hardest part of investigating a performance issue is
reproducing the situation in an environment different
from the production environment.
I often see people struggling to reproduce a situation
with different hardware and (similar but) different
workload. It is very time consuming, and also it often
fails.
So, we need to collect any piece of information that
would help us understand what's going on within
the production PostgreSQL, without any changes to
binaries or configuration in the production environment.
That's the reason why I stick to a "built-in" instrument,
and I disagree with disabling such an instrument even if it
has a minor performance overhead.
A flight-recorder must not be disabled. Collecting
performance data must be a top priority for a DBA.
pg_stat_lwlocks does not seem to be an adequate 'flight-recorder'.
It collects only narrow performance data concerning lwlocks. What we
should have as a 'flight-recorder' is something like Oracle's wait
events, I think. Not only lwlocks but all wait events should be
collected for a DBA to investigate performance bottlenecks.
That's the reason why I said "I accept that it's not enough
for DBA", and I'm going to work on other lock stats.
This idea was
proposed by Itagaki-san before. Though he implemented the
sampling-profiler patch, it failed to be committed. I'm not sure why
not.
Yeah, I know the previous patch posted by Itagaki-san.
So, I'm asking why (again) this time.
I think this is a very important question because it would
be critical for attracting new DBAs to PostgreSQL.
Anyway, I think that this would be a better approach to
providing a 'flight-recorder' to DBAs.
Ok, I guess we have reached a consensus to have
"some flight-recorder". Right?
Actually, it seems like great progress from my point of view. :)
Regards,
--
Satoshi Nagayasu <sn...@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp