Hi, I just reviewed this patch.
https://commitfest.postgresql.org/action/patch_view?id=1035

2013/1/13 Greg Smith <g...@2ndquadrant.com>:
> On 12/26/12 7:23 PM, Greg Stark wrote:
>>
>> It's also possible it's a bad cpu, not bad memory. If it affects
>> decrement or increment in particular it's possible that the pattern of
>> usage on LocalRefCount is particularly prone to triggering it.
>
>
> This looks to be the winning answer. It turns out that under extended
> multi-hour loads at high concurrency, something related to CPU overheating
> was occasionally flipping a bit. One round of compressed air for all the
> fans/vents, a little tweaking of the fan controls, and now the system
> goes >24 hours with no problems.
>
> Sorry about all the noise over this. I do think the improved warning
> messages that came out of the diagnosis ideas are useful. The reworked code
> must slows down the checking a few cycles, but if you care about performance
> these assertions are tacked onto the biggest pig around.
>
> I added the patch to the January CF as "Improve buffer refcount leak warning
> messages". The sample I showed with the patch submission was a simulated
> one. Here's the output from the last crash before resolving the issue,
> where the assertion really triggered:
>
> WARNING: buffer refcount leak: [170583] (rel=base/16384/16578,
> blockNum=302295, flags=0x106, refcount=0 1073741824)
>
> WARNING: buffers with non-zero refcount is 1
> TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line:
> 1712)

This patch is intended to improve the warning message in
AtEOXact_Buffers(), but I guess another function, AtProcExit_Buffers(),
needs to be modified as well. Right?

With this additional fix, the patch applies to the current git master
and compiles with the --enable-cassert option.

Now I need some suggestions from hackers to continue this review.
How should I reproduce this message for review?

This is a debug warning message that only appears when a buffer refcount
leak actually occurs, so it's not easy for me to reproduce.
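
As a rough idea of what I would need to simulate: the local refcount in
the quoted warning, 1073741824, is exactly 2^30, i.e. a single set bit,
which fits the bit-flip explanation. Below is a minimal standalone toy
model of that situation. It is not PostgreSQL code; every name in it
(TOY_NBUFFERS, toy_local_refcount, toy_flip_bit, toy_count_leaks) is
hypothetical and only illustrates "flip one bit, then scan for non-zero
counters".

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_NBUFFERS 8          /* hypothetical size, unrelated to shared_buffers */

/* Toy stand-in for a backend's private per-buffer pin counters. */
static int32_t toy_local_refcount[TOY_NBUFFERS];

/* Simulate a hardware fault by flipping one bit (0..30) of one counter. */
static void
toy_flip_bit(int buf, int bit)
{
    toy_local_refcount[buf] ^= (int32_t) (UINT32_C(1) << bit);
}

/* Scan all counters, warn about each non-zero one, return how many leaked. */
static int
toy_count_leaks(void)
{
    int         errors = 0;

    for (int i = 0; i < TOY_NBUFFERS; i++)
    {
        if (toy_local_refcount[i] != 0)
        {
            fprintf(stderr,
                    "WARNING: toy refcount leak: [%d] (refcount=%" PRId32 ")\n",
                    i, toy_local_refcount[i]);
            errors++;
        }
    }
    return errors;
}
```

Calling toy_flip_bit(3, 30) and then toy_count_leaks() reports one leaked
counter holding 1073741824. Doing the equivalent inside a real backend
would need some way to leave a buffer pinned at end of transaction, which
is the part I don't know how to do cleanly.
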
Any suggestion?

-- 
Satoshi Nagayasu <sn...@uptime.jp>
Uptime Technologies, LLC
http://www.uptime.jp/