subject:"Re\: \[HACKERS\] hung backends stuck in spinlock heavy endless loop"

On Thu, Jan 15, 2015 at 5:10 PM, Peter Geoghegan p...@heroku.com wrote:
 On Thu, Jan 15, 2015 at 3:00 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Running this test on another set of hardware to verify -- if this
 turns out to be a false alarm which it may very well be, I can only
 offer my apologies!  I've never had a new drive fail like that, in
 that manner.  I'll burn the other hardware in overnight and report
 back.

huh -- well possibly. not.  This is on a virtual machine attached to a
SAN.  It ran clean for several (this is 9.4 vanilla, asserts off,
checksums on) hours then the starting having issues:

[cds2 21952 2015-01-15 22:54:51.833 CST 5502]WARNING:  page
verification failed, calculated checksum 59143 but expected 59137 at
character 20
[cds2 21952 2015-01-15 22:54:51.852 CST 5502]QUERY:
  DELETE FROM onesitepmc.propertyguestcard t
  WHERE EXISTS
  (
SELECT 1 FROM propertyguestcard_insert d
WHERE (t.prptyid, t.gcardid) = (d.prptyid, d.gcardid)
  )

[cds2 21952 2015-01-15 22:54:51.852 CST 5502]CONTEXT:  PL/pgSQL
function cdsreconciletable(text,text,text,text,boolean) line 197 at
EXECUTE statement
SQL statement SELECT* FROM CDSReconcileTable(
  t.CDSServer,
  t.CDSDatabase,
  t.SchemaName,
  t.TableName)
PL/pgSQL function cdsreconcileruntable(bigint) line 35 at SQL statement

After that, several hours of clean running, followed by:

[cds2 32353 2015-01-16 04:40:57.814 CST 7549]WARNING:  did not find
subXID 7553 in MyProc
[cds2 32353 2015-01-16 04:40:57.814 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.018 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.018 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]LOG:  could not send data
to client: Broken pipe
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]STATEMENT:  SELECT
CDSReconcileRunTable(1160)
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
lock of type ShareLock
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]ERROR:  failed to re-find
shared lock object
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]STATEMENT:  SELECT
CDSReconcileRunTable(1160)
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
AbortSubTransaction while in ABORT state
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  did not find
subXID 7553 in MyProc
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
lock of type RowExclusiveLock
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Fri, Jan 16, 2015 at 8:05 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Thu, Jan 15, 2015 at 5:10 PM, Peter Geoghegan p...@heroku.com wrote:
 On Thu, Jan 15, 2015 at 3:00 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Running this test on another set of hardware to verify -- if this
 turns out to be a false alarm which it may very well be, I can only
 offer my apologies!  I've never had a new drive fail like that, in
 that manner.  I'll burn the other hardware in overnight and report
 back.

 huh -- well possibly. not.  This is on a virtual machine attached to a
 SAN.  It ran clean for several (this is 9.4 vanilla, asserts off,
 checksums on) hours then the starting having issues:

I'm going to bisect this down...FYI.  I'll start with the major
releases first.   This is not going to be a fast process...I'm out of
pocket for the weekend and have a lot of other stuff going on.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-16 Thread Heikki Linnakangas


On 01/16/2015 04:05 PM, Merlin Moncure wrote:

On Thu, Jan 15, 2015 at 5:10 PM, Peter Geoghegan p...@heroku.com wrote:

On Thu, Jan 15, 2015 at 3:00 PM, Merlin Moncure mmonc...@gmail.com wrote:

Running this test on another set of hardware to verify -- if this
turns out to be a false alarm which it may very well be, I can only
offer my apologies!  I've never had a new drive fail like that, in
that manner.  I'll burn the other hardware in overnight and report
back.


huh -- well possibly. not.  This is on a virtual machine attached to a
SAN.  It ran clean for several (this is 9.4 vanilla, asserts off,
checksums on) hours then the starting having issues:

[cds2 21952 2015-01-15 22:54:51.833 CST 5502]WARNING:  page
verification failed, calculated checksum 59143 but expected 59137 at
character 20


The calculated checksum is suspiciously close to to the expected one. It 
could be coincidence, but the previous checksum warning you posted was 
also quite close:



[cds2 18347 2015-01-15 15:58:29.955 CST 1779]WARNING:  page
verification failed, calculated checksum 28520 but expected 28541


I believe the checksum algorithm is supposed to mix the bits quite 
thoroughly, so that a difference in a single byte in the input will lead 
to a completely different checksum. However, we add the block number to 
the checksum last:



/* Mix in the block number to detect transposed pages */
checksum ^= blkno;

/*
 * Reduce to a uint16 (to fit in the pd_checksum field) with an offset 
of
 * one. That avoids checksums of zero, which seems like a good idea.
 */
return (checksum % 65535) + 1;


It looks very much like that a page has for some reason been moved to a 
different block number. And that's exactly what Peter found out in his 
investigation too; an index page was mysteriously copied to a different 
block with identical content.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-16 Thread Andres Freund

Hi,

On 2015-01-16 08:05:07 -0600, Merlin Moncure wrote:
 On Thu, Jan 15, 2015 at 5:10 PM, Peter Geoghegan p...@heroku.com wrote:
  On Thu, Jan 15, 2015 at 3:00 PM, Merlin Moncure mmonc...@gmail.com wrote:
  Running this test on another set of hardware to verify -- if this
  turns out to be a false alarm which it may very well be, I can only
  offer my apologies!  I've never had a new drive fail like that, in
  that manner.  I'll burn the other hardware in overnight and report
  back.
 
 huh -- well possibly. not.  This is on a virtual machine attached to a
 SAN.  It ran clean for several (this is 9.4 vanilla, asserts off,
 checksums on) hours then the starting having issues:

Damn.

Is there any chance you can package this somehow so that others can run
it locally? It looks hard to find the actual bug here without adding
instrumentation to to postgres.

 [cds2 21952 2015-01-15 22:54:51.833 CST 5502]WARNING:  page
 verification failed, calculated checksum 59143 but expected 59137 at
 character 20
 [cds2 21952 2015-01-15 22:54:51.852 CST 5502]QUERY:
   DELETE FROM onesitepmc.propertyguestcard t
   WHERE EXISTS
   (
 SELECT 1 FROM propertyguestcard_insert d
 WHERE (t.prptyid, t.gcardid) = (d.prptyid, d.gcardid)
   )

 [cds2 21952 2015-01-15 22:54:51.852 CST 5502]CONTEXT:  PL/pgSQL
 function cdsreconciletable(text,text,text,text,boolean) line 197 at
 EXECUTE statement
 SQL statement SELECT* FROM CDSReconcileTable(
   t.CDSServer,
   t.CDSDatabase,
   t.SchemaName,
   t.TableName)
 PL/pgSQL function cdsreconcileruntable(bigint) line 35 at SQL statement


This was the first error? None of the 'could not find subXID' errors
beforehand?


 [cds2 32353 2015-01-16 04:40:57.814 CST 7549]WARNING:  did not find
 subXID 7553 in MyProc
 [cds2 32353 2015-01-16 04:40:57.814 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.018 CST 7549]WARNING:  you don't own a
 lock of type AccessShareLock
 [cds2 32353 2015-01-16 04:40:58.018 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]LOG:  could not send data
 to client: Broken pipe
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]STATEMENT:  SELECT
 CDSReconcileRunTable(1160)
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:
 ReleaseLockIfHeld: failed??
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:  you don't own a
 lock of type AccessShareLock
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:
 ReleaseLockIfHeld: failed??
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:  you don't own a
 lock of type AccessShareLock
 [cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
 ReleaseLockIfHeld: failed??
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
 lock of type AccessShareLock
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
 ReleaseLockIfHeld: failed??
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
 lock of type ShareLock
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
 ReleaseLockIfHeld: failed??
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]ERROR:  failed to re-find
 shared lock object
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
 function cdsreconcileruntable(bigint) line 35 during exception cleanup
 [cds2 32353 2015-01-16 04:40:58.027 CST 7549]STATEMENT:  SELECT
 CDSReconcileRunTable(1160)
 [cds2 32353 2015-01-16 04:40:58.027 CST

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Fri, Jan 16, 2015 at 8:22 AM, Andres Freund and...@2ndquadrant.com wrote:
 Is there any chance you can package this somehow so that others can run
 it locally? It looks hard to find the actual bug here without adding
 instrumentation to to postgres.

That's possible but involves a lot of complexity in the setup because
of the source database (SQL Server) dependency..

Thinking outside the box here I'm going to migrate the source to
postgres.   This will rule out pl/sh which is the only non-core
dependency but will take some setup work on my end first.  If I can
still reproduce the error at that point, maybe we can go in this
direction, and it it would make local reproduction easier anyways.

 [cds2 21952 2015-01-15 22:54:51.833 CST 5502]WARNING:  page

 This was the first error? None of the 'could not find subXID' errors
 beforehand?

yep.  no caught exceptions or anything interesting in the log before that.

 Could you add a EmitErrorReport(); before the FlushErrorState() in
 pl_exec.c's exec_stmt_block()?

will do.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Fri, Jan 16, 2015 at 8:22 AM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2015-01-16 08:05:07 -0600, Merlin Moncure wrote:
 On Thu, Jan 15, 2015 at 5:10 PM, Peter Geoghegan p...@heroku.com wrote:
  On Thu, Jan 15, 2015 at 3:00 PM, Merlin Moncure mmonc...@gmail.com wrote:
  Running this test on another set of hardware to verify -- if this
  turns out to be a false alarm which it may very well be, I can only
  offer my apologies!  I've never had a new drive fail like that, in
  that manner.  I'll burn the other hardware in overnight and report
  back.

 huh -- well possibly. not.  This is on a virtual machine attached to a
 SAN.  It ran clean for several (this is 9.4 vanilla, asserts off,
 checksums on) hours then the starting having issues:

 Damn.

 Is there any chance you can package this somehow so that others can run
 it locally? It looks hard to find the actual bug here without adding
 instrumentation to to postgres.

FYI, a two hour burn in on my workstation on 9.3 ran with no issues.
An overnight run would probably be required to prove it, ruling out
both hardware and pl/sh.   If proven, it's possible we may be facing a
regression, perhaps a serious one.

ISTM the next step is to bisect the problem down over the weekend in
order to to narrow the search.  If that doesn't turn up anything
productive I'll look into taking other steps.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-16 Thread Peter Geoghegan

On Fri, Jan 16, 2015 at 10:33 AM, Merlin Moncure mmonc...@gmail.com wrote:
 ISTM the next step is to bisect the problem down over the weekend in
 order to to narrow the search.  If that doesn't turn up anything
 productive I'll look into taking other steps.

That might be the quickest way to do it, provided you can isolate the
bug fairly reliably. It might be a bit tricky to write a shell script
that assumes a certain amount of time having passed without the bug
tripping indicates that it doesn't exist, and have that work
consistently. I'm slightly concerned that you'll hit other bugs that
have since been fixed, given the large number of possible symptoms
here.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-16 Thread Peter Geoghegan

On Fri, Jan 16, 2015 at 6:21 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 It looks very much like that a page has for some reason been moved to a
 different block number. And that's exactly what Peter found out in his
 investigation too; an index page was mysteriously copied to a different
 block with identical content.

What I found suspicious about that was that the spuriously identical
pages were not physically adjacent, but logically adjacent (i.e. the
bad page was considered the B-Tree right link of the good page by the
good, spuriously-copied-by-bad page). It also seems likely that that
small catalog index on pg_class(oid) was well cached in
shared_buffers. So I agree that it's unlikely that this is actually a
hardware or filesystem problem. Beyond that, if I had to guess, I'd
say that the problem is more likely to be in the B-Tree code than it
is in the buffer manager or whatever (so the logically adjacent
thing is probably not an artifact of the order that the pages were
accessed, since it appears there was a downlink to the bad page. This
downlink was not added recently. Also, this logical adjacency is
unlikely to be mere coincidence - Postgres seemed to fairly
consistently break this way).

Does anyone have a better developed sense of where the ultimate
problem here is than I do? I guess I've never thought too much about
how the system fails when a catalog index is this thoroughly broken.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-15 Thread Peter Geoghegan

On Thu, Jan 15, 2015 at 6:02 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Question: Coming in this morning I did an immediate restart and logged
 into the database and queried pg_class via index.   Everything was
 fine, and the leftright verify returns nothing.  How did it repair
 itself without a reindex?

Maybe the relevant WAL records are more or less correct, and recovery
restores the database to its correct state, as opposed to the state it
was actually in at the time of the crash due to a race condition or
whatever. Could you possibly send bt_page_items() against block 9 of
the index after recovery? It probably looks totally sane, but it would
be nice to be sure, if you happen to still have the data.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Thu, Jan 15, 2015 at 8:02 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Thu, Jan 15, 2015 at 6:04 AM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 On 01/15/2015 03:23 AM, Peter Geoghegan wrote:

 So now the question is: how did that inconsistency arise? It didn't
 necessarily arise at the time of the (presumed) split of block 2 to
 create 9. It could be that the opaque area was changed by something
 else, some time later. I'll investigate more.


 Merlin, could you re-run the test with a WAL archive (if you don't have one
 already), and then run pg_xlogdump, filtering it to show only the changes to
 the index? That should show us how the index got to be the way it is. Also,
 if you could post a copy of the raw relation file for pg_class_oid_index; I
 assume it's not too large.

 Something like:

 pg_xlogdump -r Btree -p walarchive/ -s 0/20035D0 | grep 11917

 11917 is the relfilenode of pg_class_oid_index on a freshly initdb'd
 cluster. In case it's not the same on your system, you can use oid2name to
 find it out.

 I'm on it.  Will try this first, then patch removal.

 Question: Coming in this morning I did an immediate restart and logged
 into the database and queried pg_class via index.   Everything was
 fine, and the leftright verify returns nothing.  How did it repair
 itself without a reindex?

The plot thickens!  I looped the test, still stock 9.4 as of this time
and went to lunch. When I came back, the database was in recovery
mode.  Here is the rough sequence of events.

1) REINDEXED pg_class (wanted clean slate for full reproduction)
2) before too long (unfortunately did not configure timestamps in the
log) starting seeing:

ERROR:  root page 3 of index pg_class_oid_index has level 0,
expected 1 at character 8
QUERY:  UPDATE CDSRunTable SET
  Finished = clock_timestamp(),
  DidSucceed = _DidSucceed,
  ErrorMessage = _ErrorMessage
WHERE CDSRunTableId = _CdsRunTableId
CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 82 at
SQL statement
STATEMENT:  SELECT CDSReconcileRunTable(2020)

..and again with a FATAL
FATAL:  root page 3 of index pg_class_oid_index has level 0, expected 1
ERROR:  root page 3 of index pg_class_oid_index has level 0, expected 1
CONTEXT:  SQL statement UPDATE CDSRunTable SET
  Finished = clock_timestamp(),
  DidSucceed = _DidSucceed,
  ErrorMessage = _ErrorMessage
WHERE CDSRunTableId = _CdsRunTableId
PL/pgSQL function cdsreconcileruntable(bigint) line 82 at SQL statement


3) shortly (?) after that, I got:

WARNING:  did not find subXID 14955 in MyProc
CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
during exception cleanup
WARNING:  you don't own a lock of type RowExclusiveLock
CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
during exception cleanup
LOG:  could not send data to client: Broken pipe
CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
during exception cleanup
STATEMENT:  SELECT CDSReconcileRunTable(2151)
WARNING:  ReleaseLockIfHeld: failed??
CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
during exception cleanup
ERROR:  failed to re-find shared proclock object
CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
during exception cleanup
STATEMENT:  SELECT CDSReconcileRunTable(2151)
WARNING:  AbortSubTransaction while in ABORT state
WARNING:  did not find subXID 14955 in MyProc
WARNING:  you don't own a lock of type AccessShareLock
WARNING:  ReleaseLockIfHeld: failed??
ERROR:  failed to re-find shared proclock object
WARNING:  AbortSubTransaction while in ABORT state
WARNING:  did not find subXID 14955 in MyProc
WARNING:  you don't own a lock of type AccessShareLock
WARNING:  ReleaseLockIfHeld: failed??
WARNING:  you don't own a lock of type ShareLock
TRAP: FailedAssertion(!(FastPathStrongRelationLocks-count[fasthashcode]
 0), File: lock.c, Line: 1240)
LOG:  server process (PID 10117) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes

4) while I was still at lunch, the database started up again, and
quickly fell over:
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
ERROR:  could not open relation with OID 3597
STATEMENT:  SELECT CDSReconcileRunTable(2268)
ERROR:  could not open relation with OID 3597
STATEMENT:  SELECT CDSReconcileRunTable(2270)
ERROR:  could not open relation with OID 3597
STATEMENT:  SELECT CDSReconcileRunTable(2272)
ERROR:  could not open relation with OID 3597
STATEMENT:  SELECT CDSReconcileRunTable(2273)
TRAP: FailedAssertion(!(( ((void) ((bool) ((! assert_enabled) || !
(!(((bool) (((const void*)(htup-t_ctid) != ((void *)0)) 
((htup-t_ctid)-ip_posid != 0)
LOG:  server process (PID 30520) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes

this repeated a very large number of times.   Occasionally, but
infrequently, I'd see during the database startup something

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Thu, Jan 15, 2015 at 1:15 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 The plot thickens!  I looped the test, still stock 9.4 as of this time
 and went to lunch. When I came back, the database was in recovery
 mode.  Here is the rough sequence of events.


 Whoa. That looks scary. Did you see (some of) those errors before? Most
 of them should have been emitted independently of being built with
 assertions.

Nope.

 1) REINDEXED pg_class (wanted clean slate for full reproduction)
 2) before too long (unfortunately did not configure timestamps in the
 log) starting seeing:

 ERROR:  root page 3 of index pg_class_oid_index has level 0,
 expected 1 at character 8
 QUERY:  UPDATE CDSRunTable SET
   Finished = clock_timestamp(),
   DidSucceed = _DidSucceed,
   ErrorMessage = _ErrorMessage
 WHERE CDSRunTableId = _CdsRunTableId
 CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 82 at
 SQL statement
 STATEMENT:  SELECT CDSReconcileRunTable(2020)

 ..and again with a FATAL
 FATAL:  root page 3 of index pg_class_oid_index has level 0, expected 1
 ERROR:  root page 3 of index pg_class_oid_index has level 0, expected 1
 CONTEXT:  SQL statement UPDATE CDSRunTable SET
   Finished = clock_timestamp(),
   DidSucceed = _DidSucceed,
   ErrorMessage = _ErrorMessage
 WHERE CDSRunTableId = _CdsRunTableId
 PL/pgSQL function cdsreconcileruntable(bigint) line 82 at SQL statement

 Just to be absolutely clear, those happened *before* the following
 errors? And there were no 'during exception cleanup' like errors before?

correct.  unfortunately, I wasn't logging times so I'm not 100% sure
if the 'root page' error happened during the event or sometime before
it.  I'm addressing that with the log prefix.

 Were there any 'LOG: Handling deadlock detected on CdsRunTableId' log
 entries before? It's hard to know from here, but the 'during exception
 cleanup' indicates a problem in abort handling.  Were there any deadlock
 detected errors closeby?

 You're catching deadlock errors in a subtransaction. Hm.

yup, exactly 1:
WARNING:  LOG: Handling deadlock detected on CdsRunTableId 1626

Based on the id that is proven to be before the crashing started,
although at least one manual server restart happened in the interim.

Since it's possible the database is a loss, do you see any value in
bootstrappinng it again with checksums turned on?  One point of note
is that this is a brand spanking new SSD, maybe we nee to rule out
hardware based corruption?

 lternatively were there any 'LOG: CdsRunTableId %s Failed' messages? If
so, what was the cause?

There was one, but unfortunately since the error was trapped the
relevant detail to catch the other side of the deadlock is missed.
Barring this analysis, I'm inclined to remove that guard: it was
mostly there to deal with what turned out to be some bugs in the code
(specifically nonconstrained update on CDSTable).

 * Do you also use lock_timeout/statement_timeout?
not on this database -- this is a coding environment and not set up
with standard configuration

  were any processes killed at that time?
later on, yes, by manual restart, and the query was cancelled first: to wit:

WARNING:  LOG: Handling deadlock detected on CdsRunTableId 1626
ERROR:  canceling statement due to user request
STATEMENT:  SELECT CDSReconcileRunTable(1626)

followed by restart -m fast

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-15 Thread Andres Freund

On 2015-01-15 20:15:42 +0100, Andres Freund wrote:
  WARNING:  did not find subXID 14955 in MyProc
  CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
  during exception cleanup
  WARNING:  you don't own a lock of type RowExclusiveLock
  CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
  during exception cleanup
  LOG:  could not send data to client: Broken pipe
  CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
  during exception cleanup
  STATEMENT:  SELECT CDSReconcileRunTable(2151)
  WARNING:  ReleaseLockIfHeld: failed??
  CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
  during exception cleanup
  ERROR:  failed to re-find shared proclock object
  CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
  during exception cleanup
  STATEMENT:  SELECT CDSReconcileRunTable(2151)
  WARNING:  AbortSubTransaction while in ABORT state
  WARNING:  did not find subXID 14955 in MyProc
  WARNING:  you don't own a lock of type AccessShareLock
  WARNING:  ReleaseLockIfHeld: failed??
  ERROR:  failed to re-find shared proclock object
  WARNING:  AbortSubTransaction while in ABORT state
  WARNING:  did not find subXID 14955 in MyProc
  WARNING:  you don't own a lock of type AccessShareLock
  WARNING:  ReleaseLockIfHeld: failed??
  WARNING:  you don't own a lock of type ShareLock
  TRAP: FailedAssertion(!(FastPathStrongRelationLocks-count[fasthashcode]
   0), File: lock.c, Line: 1240)
  LOG:  server process (PID 10117) was terminated by signal 6: Aborted
  LOG:  terminating any other active server processes
 
 Ick.
 
 Were there any 'LOG: Handling deadlock detected on CdsRunTableId' log
 entries before? It's hard to know from here, but the 'during exception
 cleanup' indicates a problem in abort handling.  Were there any deadlock
 detected errors closeby?

Alternatively were there any 'LOG: CdsRunTableId %s Failed' messages? If
so, what was the cause?

 You're catching deadlock errors in a subtransaction. Hm.

A couple questions:
* Do you also use lock_timeout/statement_timeout? If so, what are their
  settings + deadlock_timeout?
* were any processes killed at that time?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Thu, Jan 15, 2015 at 1:32 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Since it's possible the database is a loss, do you see any value in
 bootstrappinng it again with checksums turned on?  One point of note
 is that this is a brand spanking new SSD, maybe we nee to rule out
 hardware based corruption?

hm!  I bootstrapped a new database with checksums on, and lo,

[cds2 18347 2015-01-15 15:58:29.955 CST 1779]WARNING:  page
verification failed, calculated checksum 28520 but expected 28541
[cds2 18347 2015-01-15 15:58:29.955 CST 1779]CONTEXT:  SQL statement
COPY lease_delete FROM
'/tmp/C1188732_onesiteproperty.lease_delete.csv' CSV HEADER

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Thu, Jan 15, 2015 at 4:03 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Thu, Jan 15, 2015 at 1:32 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Since it's possible the database is a loss, do you see any value in
 bootstrappinng it again with checksums turned on?  One point of note
 is that this is a brand spanking new SSD, maybe we nee to rule out
 hardware based corruption?

 hm!  I bootstrapped a new database with checksums on, and lo,

 [cds2 18347 2015-01-15 15:58:29.955 CST 1779]WARNING:  page
 verification failed, calculated checksum 28520 but expected 28541
 [cds2 18347 2015-01-15 15:58:29.955 CST 1779]CONTEXT:  SQL statement
 COPY lease_delete FROM
 '/tmp/C1188732_onesiteproperty.lease_delete.csv' CSV HEADER

Running this test on another set of hardware to verify -- if this
turns out to be a false alarm which it may very well be, I can only
offer my apologies!  I've never had a new drive fail like that, in
that manner.  I'll burn the other hardware in overnight and report
back.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-15 Thread Peter Geoghegan

On Thu, Jan 15, 2015 at 3:00 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Running this test on another set of hardware to verify -- if this
 turns out to be a false alarm which it may very well be, I can only
 offer my apologies!  I've never had a new drive fail like that, in
 that manner.  I'll burn the other hardware in overnight and report
 back.

Thanks.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-15 Thread Peter Geoghegan

On Wed, Jan 14, 2015 at 8:50 PM, Peter Geoghegan p...@heroku.com wrote:
I am mistaken on one detail here - blocks 2 and 9 are actually fully
identical. I still have no idea why, though.

So, I've looked at it in more detail and it appears that the page of
block 2 split at some point, thereby creating a new page (the block 9
page). There is a sane downlink in the root page for the new rightlink
page. The root page looks totally sane, as does every other page - as
I said, the problem is only that block 9 is spuriously identical to
block 2. So the (correct) downlink in the root page, to block 9, is
the same as the (incorrect) high key value in block 9 - Oid value
69924. To be clear: AFAICT everything is perfect except block 9, which
is bizarrely identical to block 2.

Now, since the sane page downlink located in the root (like every
downlink, a lower bound on items in its child) is actually a copy of
the high key on the page that is the child's left link (that is to
say, it comes from the original target of a page split - it shares the
target's high key value, Oid value 69924), there may have never been
sane data in block 9, even though its downlink is sane (so maybe the
page split patch is implicated). But it's hard to see how that could
be true. The relevant code wasn't really what was changed about page
splits in 9.4 anyway (plus this wasn't a non-leaf split, since there
aren't enough pages for those to be a factor). There just isn't that
many items on page 2 (or its bizarre identical twin, page 9), so a
recent split seems unlikely. And, the target and new right page are
locked together throughout both the split and down link insertion
(even though there are two atomic operations/WAL inserts). So to
reiterate, a close by page split that explains the problem seems
unlikely.

I'm going to focus on the page deletion patch for the time being.
Merlin - it would be great if you could revert all the page split
commits (which came after the page deletion fix). All the follow-up
page split commits [1] were fairly straightforward bugs with recovery,
so it should be easy enough to totally remove the page split stuff
from 9.4 for the purposes of isolating the bug.

[1]
http://www.postgresql.org/message-id/cam3swzspj6m9hfhksjuiuof30auwxyyb56fjbw1_dogqkbe...@mail.gmail.com
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-15 Thread Heikki Linnakangas


On 01/15/2015 03:23 AM, Peter Geoghegan wrote:

So now the question is: how did that inconsistency arise? It didn't
necessarily arise at the time of the (presumed) split of block 2 to
create 9. It could be that the opaque area was changed by something
else, some time later. I'll investigate more.


Merlin, could you re-run the test with a WAL archive (if you don't have 
one already), and then run pg_xlogdump, filtering it to show only the 
changes to the index? That should show us how the index got to be the 
way it is. Also, if you could post a copy of the raw relation file for 
pg_class_oid_index; I assume it's not too large.


Something like:

pg_xlogdump -r Btree -p walarchive/ -s 0/20035D0 | grep 11917

11917 is the relfilenode of pg_class_oid_index on a freshly initdb'd 
cluster. In case it's not the same on your system, you can use oid2name 
to find it out.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Thu, Jan 15, 2015 at 6:04 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 01/15/2015 03:23 AM, Peter Geoghegan wrote:

 So now the question is: how did that inconsistency arise? It didn't
 necessarily arise at the time of the (presumed) split of block 2 to
 create 9. It could be that the opaque area was changed by something
 else, some time later. I'll investigate more.


 Merlin, could you re-run the test with a WAL archive (if you don't have one
 already), and then run pg_xlogdump, filtering it to show only the changes to
 the index? That should show us how the index got to be the way it is. Also,
 if you could post a copy of the raw relation file for pg_class_oid_index; I
 assume it's not too large.

 Something like:

 pg_xlogdump -r Btree -p walarchive/ -s 0/20035D0 | grep 11917

 11917 is the relfilenode of pg_class_oid_index on a freshly initdb'd
 cluster. In case it's not the same on your system, you can use oid2name to
 find it out.

I'm on it.  Will try this first, then patch removal.

Question: Coming in this morning I did an immediate restart and logged
into the database and queried pg_class via index.   Everything was
fine, and the leftright verify returns nothing.  How did it repair
itself without a reindex?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-15 Thread Andres Freund

Hi,

 The plot thickens!  I looped the test, still stock 9.4 as of this time
 and went to lunch. When I came back, the database was in recovery
 mode.  Here is the rough sequence of events.


Whoa. That looks scary. Did you see (some of) those errors before? Most
of them should have been emitted independently of being built with
assertions.

 1) REINDEXED pg_class (wanted clean slate for full reproduction)
 2) before too long (unfortunately did not configure timestamps in the
 log) starting seeing:
 
 ERROR:  root page 3 of index pg_class_oid_index has level 0,
 expected 1 at character 8
 QUERY:  UPDATE CDSRunTable SET
   Finished = clock_timestamp(),
   DidSucceed = _DidSucceed,
   ErrorMessage = _ErrorMessage
 WHERE CDSRunTableId = _CdsRunTableId
 CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 82 at
 SQL statement
 STATEMENT:  SELECT CDSReconcileRunTable(2020)
 
 ..and again with a FATAL
 FATAL:  root page 3 of index pg_class_oid_index has level 0, expected 1
 ERROR:  root page 3 of index pg_class_oid_index has level 0, expected 1
 CONTEXT:  SQL statement UPDATE CDSRunTable SET
   Finished = clock_timestamp(),
   DidSucceed = _DidSucceed,
   ErrorMessage = _ErrorMessage
 WHERE CDSRunTableId = _CdsRunTableId
 PL/pgSQL function cdsreconcileruntable(bigint) line 82 at SQL statement

Just to be absolutely clear, those happened *before* the following
errors? And there were no 'during exception cleanup' like errors before?

 3) shortly (?) after that, I got:
 
 WARNING:  did not find subXID 14955 in MyProc
 CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
 during exception cleanup
 WARNING:  you don't own a lock of type RowExclusiveLock
 CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
 during exception cleanup
 LOG:  could not send data to client: Broken pipe
 CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
 during exception cleanup
 STATEMENT:  SELECT CDSReconcileRunTable(2151)
 WARNING:  ReleaseLockIfHeld: failed??
 CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
 during exception cleanup
 ERROR:  failed to re-find shared proclock object
 CONTEXT:  PL/pgSQL function cdsreconcileruntable(bigint) line 35
 during exception cleanup
 STATEMENT:  SELECT CDSReconcileRunTable(2151)
 WARNING:  AbortSubTransaction while in ABORT state
 WARNING:  did not find subXID 14955 in MyProc
 WARNING:  you don't own a lock of type AccessShareLock
 WARNING:  ReleaseLockIfHeld: failed??
 ERROR:  failed to re-find shared proclock object
 WARNING:  AbortSubTransaction while in ABORT state
 WARNING:  did not find subXID 14955 in MyProc
 WARNING:  you don't own a lock of type AccessShareLock
 WARNING:  ReleaseLockIfHeld: failed??
 WARNING:  you don't own a lock of type ShareLock
 TRAP: FailedAssertion(!(FastPathStrongRelationLocks-count[fasthashcode]
  0), File: lock.c, Line: 1240)
 LOG:  server process (PID 10117) was terminated by signal 6: Aborted
 LOG:  terminating any other active server processes

Ick.

Were there any 'LOG: Handling deadlock detected on CdsRunTableId' log
entries before? It's hard to know from here, but the 'during exception
cleanup' indicates a problem in abort handling.  Were there any deadlock
detected errors closeby?

You're catching deadlock errors in a subtransaction. Hm.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 9:49 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-01-14 09:47:19 -0600, Merlin Moncure wrote:
 On Wed, Jan 14, 2015 at 9:30 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  If you gdb in, and type 'fin' a couple times, to wait till the function
  finishes, is there actually any progress? I'm wondering whether it's
  just many catalog accesses + contention, or some other
  problem. Alternatively set a breakpoint on ScanPgRelation() or so and
  see how often it's hit.

 well, i restarted the database, forgetting my looper was running which
 immediately spun up and it got stuck again with a similar profile
 (lots of cpu in spinlock):

 Samples: 3K of event 'cycles', Event count (approx.): 2695723228
  31.16%  postgres[.] s_lock
  22.32%  postgres[.] tas
  12.13%  postgres[.] tas
   5.93%  postgres[.] spin_delay
   5.69%  postgres[.] LWLockRelease
   3.75%  postgres[.] LWLockAcquireCommon
   3.61%  perf[.] 0x000526c4
   2.51%  postgres[.] FunctionCall2Coll
   1.48%  libc-2.17.so[.] 0x0016a132

  If you gdb in, and type 'fin' a couple times,
 (gdb) fin
 Run till exit from #0  0x7ff4c63f7a97 in semop () from
 /lib/x86_64-linux-gnu/libc.so.6
 0x006de073 in PGSemaphoreLock ()
 (gdb) fin
 Run till exit from #0  0x006de073 in PGSemaphoreLock ()

 It returned once.  Second time, it didn't at least so far (minute or so).

 Hm, that's autovac though, not the normal user backends that actually do
 stuff, right? If you could additionally check those, it'd be great.


got it to reproduce.  had to take a break to get some work done.


(gdb) fin
Run till exit from #0  0x7f7b07099dc3 in select ()
   from /lib/x86_64-linux-gnu/libc.so.6
0x008d3ac4 in pg_usleep ()
(gdb) fin
Run till exit from #0  0x008d3ac4 in pg_usleep ()
0x00750b69 in s_lock ()
(gdb) fin
Run till exit from #0  0x00750b69 in s_lock ()
0x00750844 in LWLockRelease ()
(gdb) fin
Run till exit from #0  0x00750844 in LWLockRelease ()
0x0073 in LockBuffer ()
(gdb) fin
Run till exit from #0  0x0073 in LockBuffer ()
0x004b2db4 in _bt_relandgetbuf ()
(gdb) fin
Run till exit from #0  0x004b2db4 in _bt_relandgetbuf ()
0x004b7116 in _bt_moveright ()
(gdb) fin
Run till exit from #0  0x004b7116 in _bt_moveright ()



so it looks like nobody ever exits from _bt_moveright.   any last
requests before I start bisecting down?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 4:53 PM, Merlin Moncure mmonc...@gmail.com wrote:
 yeah.  via:
 cds2=# \copy (select s as page, (bt_page_items('pg_class_oid_index',
 s)).* from generate_series(1,12) s) to '/tmp/page_items.csv' csv
 header;

My immediate observation here is that blocks 2 and 9 have identical
metadata (from their page opaque area), but partially non-matching
data items (however, the number of items on each block is consistent
and correct according to that metadata, as far as it goes). I think
block 9 is supposed to have a right-link to block 5 (block 5 seems to
think so, at least -- its left link is 9).

So now the question is: how did that inconsistency arise? It didn't
necessarily arise at the time of the (presumed) split of block 2 to
create 9. It could be that the opaque area was changed by something
else, some time later. I'll investigate more.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 6:50 PM, Peter Geoghegan p...@heroku.com wrote:
 This is great, but it's not exactly clear which bt_page_items() page
 is which - some are skipped, but I can't be sure which. Would you mind
 rewriting that query to indicate which block is under consideration by
 bt_page_items()?

yeah.  via:
cds2=# \copy (select s as page, (bt_page_items('pg_class_oid_index',
s)).* from generate_series(1,12) s) to '/tmp/page_items.csv' csv
header;

merlin
page,itemoffset,ctid,itemlen,nulls,vars,data
1,1,(0,9),16,f,f,16 0b 00 00 00 00 00 00
1,2,(2,50),16,f,f,70 00 00 00 00 00 00 00
1,3,(2,52),16,f,f,71 00 00 00 00 00 00 00
1,4,(2,24),16,f,f,ae 00 00 00 00 00 00 00
1,5,(2,25),16,f,f,af 00 00 00 00 00 00 00
1,6,(2,51),16,f,f,24 02 00 00 00 00 00 00
1,7,(2,53),16,f,f,25 02 00 00 00 00 00 00
1,8,(4,66),16,f,f,3a 03 00 00 00 00 00 00
1,9,(2,55),16,f,f,3b 03 00 00 00 00 00 00
1,10,(2,56),16,f,f,3c 03 00 00 00 00 00 00
1,11,(4,15),16,f,f,70 04 00 00 00 00 00 00
1,12,(1,31),16,f,f,71 04 00 00 00 00 00 00
1,13,(4,14),16,f,f,bd 04 00 00 00 00 00 00
1,14,(4,17),16,f,f,be 04 00 00 00 00 00 00
1,15,(1,34),16,f,f,d0 04 00 00 00 00 00 00
1,16,(1,35),16,f,f,d1 04 00 00 00 00 00 00
1,17,(0,2),16,f,f,df 04 00 00 00 00 00 00
1,18,(3,10),16,f,f,e1 04 00 00 00 00 00 00
1,19,(3,11),16,f,f,e7 04 00 00 00 00 00 00
1,20,(7,6),16,f,f,eb 04 00 00 00 00 00 00
1,21,(2,23),16,f,f,ec 04 00 00 00 00 00 00
1,22,(4,16),16,f,f,ed 04 00 00 00 00 00 00
1,23,(4,12),16,f,f,ee 04 00 00 00 00 00 00
1,24,(4,64),16,f,f,89 05 00 00 00 00 00 00
1,25,(3,31),16,f,f,8a 05 00 00 00 00 00 00
1,26,(1,48),16,f,f,8b 08 00 00 00 00 00 00
1,27,(4,63),16,f,f,18 09 00 00 00 00 00 00
1,28,(0,64),16,f,f,20 09 00 00 00 00 00 00
1,29,(0,65),16,f,f,21 09 00 00 00 00 00 00
1,30,(4,18),16,f,f,5c 09 00 00 00 00 00 00
1,31,(1,11),16,f,f,5d 09 00 00 00 00 00 00
1,32,(4,40),16,f,f,28 0a 00 00 00 00 00 00
1,33,(3,47),16,f,f,29 0a 00 00 00 00 00 00
1,34,(3,48),16,f,f,2a 0a 00 00 00 00 00 00
1,35,(3,49),16,f,f,2b 0a 00 00 00 00 00 00
1,36,(3,41),16,f,f,2c 0a 00 00 00 00 00 00
1,37,(4,45),16,f,f,2d 0a 00 00 00 00 00 00
1,38,(3,42),16,f,f,2e 0a 00 00 00 00 00 00
1,39,(4,48),16,f,f,2f 0a 00 00 00 00 00 00
1,40,(4,49),16,f,f,30 0a 00 00 00 00 00 00
1,41,(4,44),16,f,f,31 0a 00 00 00 00 00 00
1,42,(3,43),16,f,f,32 0a 00 00 00 00 00 00
1,43,(7,26),16,f,f,33 0a 00 00 00 00 00 00
1,44,(3,50),16,f,f,34 0a 00 00 00 00 00 00
1,45,(4,71),16,f,f,35 0a 00 00 00 00 00 00
1,46,(4,47),16,f,f,37 0a 00 00 00 00 00 00
1,47,(3,46),16,f,f,38 0a 00 00 00 00 00 00
1,48,(3,44),16,f,f,39 0a 00 00 00 00 00 00
1,49,(4,41),16,f,f,3a 0a 00 00 00 00 00 00
1,50,(0,1),16,f,f,3b 0a 00 00 00 00 00 00
1,51,(4,42),16,f,f,3c 0a 00 00 00 00 00 00
1,52,(0,61),16,f,f,5a 0a 00 00 00 00 00 00
1,53,(0,54),16,f,f,5b 0a 00 00 00 00 00 00
1,54,(0,55),16,f,f,5c 0a 00 00 00 00 00 00
1,55,(0,56),16,f,f,5d 0a 00 00 00 00 00 00
1,56,(0,57),16,f,f,5e 0a 00 00 00 00 00 00
1,57,(0,59),16,f,f,5f 0a 00 00 00 00 00 00
1,58,(0,50),16,f,f,60 0a 00 00 00 00 00 00
1,59,(7,14),16,f,f,61 0a 00 00 00 00 00 00
1,60,(0,42),16,f,f,62 0a 00 00 00 00 00 00
1,61,(0,43),16,f,f,63 0a 00 00 00 00 00 00
1,62,(0,68),16,f,f,64 0a 00 00 00 00 00 00
1,63,(0,69),16,f,f,65 0a 00 00 00 00 00 00
1,64,(7,3),16,f,f,66 0a 00 00 00 00 00 00
1,65,(7,4),16,f,f,67 0a 00 00 00 00 00 00
1,66,(0,53),16,f,f,68 0a 00 00 00 00 00 00
1,67,(7,22),16,f,f,69 0a 00 00 00 00 00 00
1,68,(7,23),16,f,f,6a 0a 00 00 00 00 00 00
1,69,(7,24),16,f,f,6b 0a 00 00 00 00 00 00
1,70,(1,72),16,f,f,6c 0a 00 00 00 00 00 00
1,71,(1,73),16,f,f,6d 0a 00 00 00 00 00 00
1,72,(1,74),16,f,f,6e 0a 00 00 00 00 00 00
1,73,(1,6),16,f,f,6f 0a 00 00 00 00 00 00
1,74,(1,7),16,f,f,70 0a 00 00 00 00 00 00
1,75,(1,75),16,f,f,71 0a 00 00 00 00 00 00
1,76,(1,76),16,f,f,72 0a 00 00 00 00 00 00
1,77,(1,67),16,f,f,73 0a 00 00 00 00 00 00
1,78,(0,40),16,f,f,74 0a 00 00 00 00 00 00
1,79,(0,41),16,f,f,75 0a 00 00 00 00 00 00
1,80,(1,50),16,f,f,76 0a 00 00 00 00 00 00
1,81,(1,51),16,f,f,77 0a 00 00 00 00 00 00
1,82,(1,49),16,f,f,78 0a 00 00 00 00 00 00
1,83,(1,58),16,f,f,79 0a 00 00 00 00 00 00
1,84,(1,59),16,f,f,7a 0a 00 00 00 00 00 00
1,85,(1,89),16,f,f,7b 0a 00 00 00 00 00 00
1,86,(1,70),16,f,f,7c 0a 00 00 00 00 00 00
1,87,(1,71),16,f,f,7d 0a 00 00 00 00 00 00
1,88,(1,56),16,f,f,7e 0a 00 00 00 00 00 00
1,89,(1,57),16,f,f,7f 0a 00 00 00 00 00 00
1,90,(1,52),16,f,f,80 0a 00 00 00 00 00 00
1,91,(1,53),16,f,f,81 0a 00 00 00 00 00 00
1,92,(1,43),16,f,f,82 0a 00 00 00 00 00 00
1,93,(7,1),16,f,f,83 0a 00 00 00 00 00 00
1,94,(1,61),16,f,f,84 0a 00 00 00 00 00 00
1,95,(1,62),16,f,f,85 0a 00 00 00 00 00 00
1,96,(0,29),16,f,f,86 0a 00 00 00 00 00 00
1,97,(0,30),16,f,f,87 0a 00 00 00 00 00 00
1,98,(6,35),16,f,f,88 0a 00 00 00 00 00 00
1,99,(1,36),16,f,f,89 0a 00 00 00 00 00 00
1,100,(1,37),16,f,f,8a 0a 00 00 00 00 00 00
1,101,(1,63),16,f,f,8b 0a 00 00 00 00 00 00
1,102,(1,64),16,f,f,8d 0a 00 00 00 00 00 00
1,103,(7,61),16,f,f,8e 0a 00 00 00 00 00 00
1,104,(2,19),16,f,f,8f 0a 00 00 00 00 00 00
1,105,(2,22),16,f,f,90 0a

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

This is great, but it's not exactly clear which bt_page_items() page
is which - some are skipped, but I can't be sure which. Would you mind
rewriting that query to indicate which block is under consideration by
bt_page_items()?

Thanks
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 2:32 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Jan 14, 2015 at 12:24 PM, Peter Geoghegan p...@heroku.com wrote:
 Could you write some code to print out the block number (i.e.
 BlockNumber blkno) if there are more than, say, 5 retries within
 _bt_moveright()?

 Obviously I mean that the block number should be printed, no matter
 whether or not the P_INCOMPLETE_SPLIT() path is taken or not. So you
 should just move this to the top of the for(;;) loop, so it's always
 available to print:

 BlockNumber blkno = BufferGetBlockNumber(buf);

(gdb) print  BufferGetBlockNumber(buf)
$15 = 9

..and it stays 9, continuing several times having set breakpoint.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 4:26 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The index is the oid index on pg_class.  Some more info:

 *) temp table churn is fairly high.  Several dozen get spawned and
 destroted at the start of a replication run, all at once, due to some
 dodgy coding via dblink.  During the replication run, the temp table
 churn rate drops.

 *)  running btreecheck, I see:

I knew I wrote that tool for a reason.   :-)

It would be great if I could get a raw dump of the index using
contrib/pageinspect at this juncture too, as already described.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 5:39 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
 (gdb) print  BufferGetBlockNumber(buf)
 $15 = 9

 ..and it stays 9, continuing several times having set breakpoint.


 And the index involved? I'm pretty sure that this in an internal page, no?

The index is the oid index on pg_class.  Some more info:

*) temp table churn is fairly high.  Several dozen get spawned and
destroted at the start of a replication run, all at once, due to some
dodgy coding via dblink.  During the replication run, the temp table
churn rate drops.

*)  running btreecheck, I see:
cds2=# select bt_index_verify('pg_class_oid_index');
NOTICE:  page 7 of index pg_class_oid_index is deleted
NOTICE:  page 10 of index pg_class_oid_index is deleted
NOTICE:  page 12 of index pg_class_oid_index is deleted
 bt_index_verify
─


cds2=# select bt_leftright_verify('pg_class_oid_index');
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 2, current left: 4
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
[repeat infinity until cancel]

which looks like the index is corrupted?  ISTM _bt_moveright is
hanging because it's trying to move from block 9 to block 9 and so
loops forever.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
 (gdb) print  BufferGetBlockNumber(buf)
 $15 = 9

 ..and it stays 9, continuing several times having set breakpoint.


And the index involved? I'm pretty sure that this in an internal page, no?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 6:26 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Jan 14, 2015 at 5:39 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
 (gdb) print  BufferGetBlockNumber(buf)
 $15 = 9

 ..and it stays 9, continuing several times having set breakpoint.


 And the index involved? I'm pretty sure that this in an internal page, no?

 The index is the oid index on pg_class.  Some more info:

 *) temp table churn is fairly high.  Several dozen get spawned and
 destroted at the start of a replication run, all at once, due to some
 dodgy coding via dblink.  During the replication run, the temp table
 churn rate drops.

 *)  running btreecheck, I see:
 cds2=# select bt_index_verify('pg_class_oid_index');
 NOTICE:  page 7 of index pg_class_oid_index is deleted
 NOTICE:  page 10 of index pg_class_oid_index is deleted
 NOTICE:  page 12 of index pg_class_oid_index is deleted
  bt_index_verify
 ─


 cds2=# select bt_leftright_verify('pg_class_oid_index');
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 2, current left: 4
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 9, current left: 4
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 9, current left: 4
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 9, current left: 4
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 9, current left: 4
 [repeat infinity until cancel]

 which looks like the index is corrupted?  ISTM _bt_moveright is
 hanging because it's trying to move from block 9 to block 9 and so
 loops forever.

per Peter the following might be useful:

cds2=# select * from bt_metap('pg_class_oid_index');
 magic  │ version │ root │ level │ fastroot │ fastlevel
┼─┼──┼───┼──┼───
 340322 │   2 │3 │ 1 │3 │ 1

cds2=# select (bt_page_stats('pg_class_oid_index', s)).* from
generate_series(1,12) s;
 blkno │ type │ live_items │ dead_items │ avg_item_size │ page_size │
free_size │ btpo_prev │ btpo_next │ btpo  │ btpo_flags
───┼──┼┼┼───┼───┼───┼───┼───┼───┼
 1 │ l│119 │  0 │16 │  8192 │
5768 │ 0 │ 4 │ 0 │  1
 2 │ l│ 25 │  0 │16 │  8192 │
7648 │ 4 │ 9 │ 0 │  1
 3 │ r│  8 │  0 │15 │  8192 │
7996 │ 0 │ 0 │ 1 │  2
 4 │ l│178 │  0 │16 │  8192 │
4588 │ 1 │ 2 │ 0 │  1
 5 │ l│  7 │  0 │16 │  8192 │
8008 │ 9 │11 │ 0 │  1
 6 │ l│  5 │  0 │16 │  8192 │
8048 │11 │ 8 │ 0 │  1
 7 │ d│  0 │  0 │ 0 │  8192 │
   0 │-1 │-1 │ 12366 │  0
 8 │ l│187 │  0 │16 │  8192 │
4408 │ 6 │ 0 │ 0 │  1
 9 │ l│ 25 │  0 │16 │  8192 │
7648 │ 4 │ 9 │ 0 │  1
10 │ d│  0 │  0 │ 0 │  8192 │
   0 │-1 │-1 │ 12366 │  0
11 │ l│  6 │  0 │16 │  8192 │
8028 │ 5 │ 6 │ 0 │  1
12 │ d│  0 │  0 │ 0 │  8192 │
   0 │-1 │-1 │ 10731 │  0


merlin
cds2=# select (bt_page_items('pg_class_oid_index', s)).* from generate_series(1,12) s;
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
 itemoffset │  ctid  │ itemlen │ nulls │ vars │  data   
┼┼─┼───┼──┼─
  1 │ (0,9)  │  16 │ f │ f│ 16 0b 00 00 00 00 00 00
  2 │ (2,50) │  16 │ f │ f│ 70 00 00 00 00 00 00 00
  3 │ (2,52) │  16 │ f │ f│ 71 00 00 00 00 00 00 00
  4 │ (2,24) │  16 │ f │ f│ ae 00 00 00 00 00 00 00
  5 │ (2,25) │  16 │ f │ f│ af 00 00 00 00 00 00 00
  6 │ (2,51) │  16 │ f │ f│ 24 02 00 00

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 5:23 PM, Peter Geoghegan p...@heroku.com wrote:
 My immediate observation here is that blocks 2 and 9 have identical
 metadata (from their page opaque area), but partially non-matching
 data items (however, the number of items on each block is consistent
 and correct according to that metadata, as far as it goes). I think
 block 9 is supposed to have a right-link to block 5 (block 5 seems to
 think so, at least -- its left link is 9).

I am mistaken on one detail here - blocks 2 and 9 are actually fully
identical. I still have no idea why, though.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 5:23 PM, Peter Geoghegan p...@heroku.com wrote:
 My immediate observation here is that blocks 2 and 9 have identical
 metadata (from their page opaque area), but partially non-matching
 data items (however, the number of items on each block is consistent
 and correct according to that metadata, as far as it goes). I think
 block 9 is supposed to have a right-link to block 5 (block 5 seems to
 think so, at least -- its left link is 9).

For the convenience of others, here is a version with a normalized
data column added (interpreting the datums as little-endian Oids).

-- 
Peter Geoghegan
page,itemoffset,ctid,itemlen,nulls,vars,data,normalized data
1,1,(0,9),16,f,f,16 0b 00 00 00 00 00 00,2838
1,2,(2,50),16,f,f,70 00 00 00 00 00 00 00,112
1,3,(2,52),16,f,f,71 00 00 00 00 00 00 00,113
1,4,(2,24),16,f,f,ae 00 00 00 00 00 00 00,174
1,5,(2,25),16,f,f,af 00 00 00 00 00 00 00,175
1,6,(2,51),16,f,f,24 02 00 00 00 00 00 00,548
1,7,(2,53),16,f,f,25 02 00 00 00 00 00 00,549
1,8,(4,66),16,f,f,3a 03 00 00 00 00 00 00,826
1,9,(2,55),16,f,f,3b 03 00 00 00 00 00 00,827
1,10,(2,56),16,f,f,3c 03 00 00 00 00 00 00,828
1,11,(4,15),16,f,f,70 04 00 00 00 00 00 00,1136
1,12,(1,31),16,f,f,71 04 00 00 00 00 00 00,1137
1,13,(4,14),16,f,f,bd 04 00 00 00 00 00 00,1213
1,14,(4,17),16,f,f,be 04 00 00 00 00 00 00,1214
1,15,(1,34),16,f,f,d0 04 00 00 00 00 00 00,1232
1,16,(1,35),16,f,f,d1 04 00 00 00 00 00 00,1233
1,17,(0,2),16,f,f,df 04 00 00 00 00 00 00,1247
1,18,(3,10),16,f,f,e1 04 00 00 00 00 00 00,1249
1,19,(3,11),16,f,f,e7 04 00 00 00 00 00 00,1255
1,20,(7,6),16,f,f,eb 04 00 00 00 00 00 00,1259
1,21,(2,23),16,f,f,ec 04 00 00 00 00 00 00,1260
1,22,(4,16),16,f,f,ed 04 00 00 00 00 00 00,1261
1,23,(4,12),16,f,f,ee 04 00 00 00 00 00 00,1262
1,24,(4,64),16,f,f,89 05 00 00 00 00 00 00,1417
1,25,(3,31),16,f,f,8a 05 00 00 00 00 00 00,1418
1,26,(1,48),16,f,f,8b 08 00 00 00 00 00 00,2187
1,27,(4,63),16,f,f,18 09 00 00 00 00 00 00,2328
1,28,(0,64),16,f,f,20 09 00 00 00 00 00 00,2336
1,29,(0,65),16,f,f,21 09 00 00 00 00 00 00,2337
1,30,(4,18),16,f,f,5c 09 00 00 00 00 00 00,2396
1,31,(1,11),16,f,f,5d 09 00 00 00 00 00 00,2397
1,32,(4,40),16,f,f,28 0a 00 00 00 00 00 00,2600
1,33,(3,47),16,f,f,29 0a 00 00 00 00 00 00,2601
1,34,(3,48),16,f,f,2a 0a 00 00 00 00 00 00,2602
1,35,(3,49),16,f,f,2b 0a 00 00 00 00 00 00,2603
1,36,(3,41),16,f,f,2c 0a 00 00 00 00 00 00,2604
1,37,(4,45),16,f,f,2d 0a 00 00 00 00 00 00,2605
1,38,(3,42),16,f,f,2e 0a 00 00 00 00 00 00,2606
1,39,(4,48),16,f,f,2f 0a 00 00 00 00 00 00,2607
1,40,(4,49),16,f,f,30 0a 00 00 00 00 00 00,2608
1,41,(4,44),16,f,f,31 0a 00 00 00 00 00 00,2609
1,42,(3,43),16,f,f,32 0a 00 00 00 00 00 00,2610
1,43,(7,26),16,f,f,33 0a 00 00 00 00 00 00,2611
1,44,(3,50),16,f,f,34 0a 00 00 00 00 00 00,2612
1,45,(4,71),16,f,f,35 0a 00 00 00 00 00 00,2613
1,46,(4,47),16,f,f,37 0a 00 00 00 00 00 00,2615
1,47,(3,46),16,f,f,38 0a 00 00 00 00 00 00,2616
1,48,(3,44),16,f,f,39 0a 00 00 00 00 00 00,2617
1,49,(4,41),16,f,f,3a 0a 00 00 00 00 00 00,2618
1,50,(0,1),16,f,f,3b 0a 00 00 00 00 00 00,2619
1,51,(4,42),16,f,f,3c 0a 00 00 00 00 00 00,2620
1,52,(0,61),16,f,f,5a 0a 00 00 00 00 00 00,2650
1,53,(0,54),16,f,f,5b 0a 00 00 00 00 00 00,2651
1,54,(0,55),16,f,f,5c 0a 00 00 00 00 00 00,2652
1,55,(0,56),16,f,f,5d 0a 00 00 00 00 00 00,2653
1,56,(0,57),16,f,f,5e 0a 00 00 00 00 00 00,2654
1,57,(0,59),16,f,f,5f 0a 00 00 00 00 00 00,2655
1,58,(0,50),16,f,f,60 0a 00 00 00 00 00 00,2656
1,59,(7,14),16,f,f,61 0a 00 00 00 00 00 00,2657
1,60,(0,42),16,f,f,62 0a 00 00 00 00 00 00,2658
1,61,(0,43),16,f,f,63 0a 00 00 00 00 00 00,2659
1,62,(0,68),16,f,f,64 0a 00 00 00 00 00 00,2660
1,63,(0,69),16,f,f,65 0a 00 00 00 00 00 00,2661
1,64,(7,3),16,f,f,66 0a 00 00 00 00 00 00,2662
1,65,(7,4),16,f,f,67 0a 00 00 00 00 00 00,2663
1,66,(0,53),16,f,f,68 0a 00 00 00 00 00 00,2664
1,67,(7,22),16,f,f,69 0a 00 00 00 00 00 00,2665
1,68,(7,23),16,f,f,6a 0a 00 00 00 00 00 00,2666
1,69,(7,24),16,f,f,6b 0a 00 00 00 00 00 00,2667
1,70,(1,72),16,f,f,6c 0a 00 00 00 00 00 00,2668
1,71,(1,73),16,f,f,6d 0a 00 00 00 00 00 00,2669
1,72,(1,74),16,f,f,6e 0a 00 00 00 00 00 00,2670
1,73,(1,6),16,f,f,6f 0a 00 00 00 00 00 00,2671
1,74,(1,7),16,f,f,70 0a 00 00 00 00 00 00,2672
1,75,(1,75),16,f,f,71 0a 00 00 00 00 00 00,2673
1,76,(1,76),16,f,f,72 0a 00 00 00 00 00 00,2674
1,77,(1,67),16,f,f,73 0a 00 00 00 00 00 00,2675
1,78,(0,40),16,f,f,74 0a 00 00 00 00 00 00,2676
1,79,(0,41),16,f,f,75 0a 00 00 00 00 00 00,2677
1,80,(1,50),16,f,f,76 0a 00 00 00 00 00 00,2678
1,81,(1,51),16,f,f,77 0a 00 00 00 00 00 00,2679
1,82,(1,49),16,f,f,78 0a 00 00 00 00 00 00,2680
1,83,(1,58),16,f,f,79 0a 00 00 00 00 00 00,2681
1,84,(1,59),16,f,f,7a 0a 00 00 00 00 00 00,2682
1,85,(1,89),16,f,f,7b 0a 00 00 00 00 00 00,2683
1,86,(1,70),16,f,f,7c 0a 00 00 00 00 00 00,2684
1,87,(1,71),16,f,f,7d 0a 00 00 00 00 00 00,2685
1,88,(1,56),16,f,f,7e 0a 00 00 00 00 00 00,2686
1,89,(1,57),16,f,f,7f 0a 00 00 00 00 00 00,2687
1,90,(1,52),16,f,f,80 0a 00 00 00 00 00 00,2688
1,91,(1,53),16,f,f,81 0a

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

Merlin Moncure mmonc...@gmail.com writes:
 On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 What are the autovac processes doing (according to pg_stat_activity)?

 pid,running,waiting,query
 7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class

Hah, I suspected as much.  Is that the one that's stuck in
LockBufferForCleanup, or the other one that's got a similar
backtrace to all the user processes?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

Merlin Moncure mmonc...@gmail.com writes:
 There were seven process with that backtrace exact backtrace (except
 that randomly they are sleeping in the spinloop).   Something else
 interesting: autovacuum has been running all night as well.  Unlike
 the other process however, cpu utilization does not register on top.
 I backtraced them as well.  They are not identical.  One of them looks
 like this:

What are the autovac processes doing (according to pg_stat_activity)?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 9:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 What are the autovac processes doing (according to pg_stat_activity)?

 pid,running,waiting,query
 7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class

 Hah, I suspected as much.  Is that the one that's stuck in
 LockBufferForCleanup, or the other one that's got a similar
 backtrace to all the user processes?

Yes, it is pg_class is coming from  LockBufferForCleanup ().  As you
can see above, it has a shorter runtime.   So it was killed off once
about a half hour ago which did not free up the logjam.  However, AV
spawned it again and now it does not respond to cancel.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On 2015-01-14 09:22:45 -0600, Merlin Moncure wrote:
 On Wed, Jan 14, 2015 at 9:11 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2015-01-14 10:05:01 -0500, Tom Lane wrote:
  Merlin Moncure mmonc...@gmail.com writes:
   On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
   What are the autovac processes doing (according to pg_stat_activity)?
 
   pid,running,waiting,query
   7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class
 
  It'd be interesting to know whether that vacuum gets very frequent
  semaphore wakeups. Could you strace it for a second or three?
 
 for 30 seconds+ it just looks like this:
 mmoncure@mernix2 ~ $ sudo strace -p 7105
 Process 7105 attached
 semop(5701638, {{4, -1, 0}}, 1

Ok. So that explains why it's not interruptible.

 all of other processes are yielding out of the spinlock, for example:
 select(0, NULL, NULL, NULL, {0, 1408})  = 0 (Timeout)

Note the above isn't the spinlock, it's the process's semaphore. It'll
only get set if the refcount ever indicates that nobody but autovac is
holding the lock.

  How did this perform  9.4?
 this is a new project. However, I can run it vs earlier version.
 
 Can you guess how many times these dynamic
  statements are planned? How many different relations are accessed in the
  dynamically planned queries?
 
 only once or twice, and only a couple of tables.

Hm. Odd. The first -g profile seemed to indicate a hell of a lot time
was spent in LWLockRelease() - indicating that there's actually
progress. Later profiles/backtraces were less clear.

If you gdb in, and type 'fin' a couple times, to wait till the function
finishes, is there actually any progress? I'm wondering whether it's
just many catalog accesses + contention, or some other
problem. Alternatively set a breakpoint on ScanPgRelation() or so and
see how often it's hit.

 I can send the code off-list if you guys think it'd help.

Might be interesting.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 There were seven process with that backtrace exact backtrace (except
 that randomly they are sleeping in the spinloop).   Something else
 interesting: autovacuum has been running all night as well.  Unlike
 the other process however, cpu utilization does not register on top.
 I backtraced them as well.  They are not identical.  One of them looks
 like this:

 What are the autovac processes doing (according to pg_stat_activity)?

pid,running,waiting,query
27885,15:40:46.904039,f,autovacuum: VACUUM ANALYZE onesitepmc.propertyguestcard
7209,00:24:44.162684,f,SELECT n.nspname as Schema,
  c.relname as Name,
  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm'
THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN
'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END
as Type,
  pg_catalog.pg_get_userbyid(c.relowner) as Owner
FROM pg_catalog.pg_class c
 LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r','v','m','S','f','')
  AND n.nspname  'pg_catalog'
  AND n.nspname  'information_schema'
  AND n.nspname !~ '^pg_toast'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class
31282,15:28:29.070942,f,SELECT CDSReconcileRunTable(867)
31792,15:27:09.483365,f,SELECT CDSReconcileRunTable(874)
7135,00:26:08.977953,t,vacuum VERBOSE pg_class;
32725,15:24:46.008341,f,SELECT CDSReconcileRunTable(884)
32367,15:25:44.4206,f,SELECT CDSReconcileRunTable(881)
32492,15:25:31.27934,f,SELECT CDSReconcileRunTable(883)
31899,15:26:57.246415,f,SELECT CDSReconcileRunTable(875)
32368,15:25:44.418445,f,SELECT CDSReconcileRunTable(880)
7270,00:21:42.274104,f,SELECT n.nspname as Schema,
 c.relname as Name,
  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm'
THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN
'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END
as Type,
  pg_catalog.pg_get_userbyid(c.relowner) as Owner
FROM pg_catalog.pg_class c
 LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r','v','m','S','f','')
  AND n.nspname  'pg_catalog'
  AND n.nspname  'information_schema'
  AND n.nspname !~ '^pg_toast'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
7403,00:00:00,f,COPY  ( select pid, now ( ) - query_start as running,
waiting, query from pg_stat_activity ) TO STDOUT csv header ;

notes:
*) SELECT CDSReconcileRunTable(xxx) are the replication processes.
they are burning cpu.
*) SELECT n.nspname ... is via psql \d.  they stick when invoked and
do not respond to cancel. they are also burning cpu
*) the autovacum processes do not respond to cancel. however, one of
them did earlier (tracing in from lazy_vacuum_index ()) did respond.
However, now, it doesn't (upon being respawned via AV) with the same
backtrace.  The blocked manual vacuum verbose did cancel

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On 2015-01-14 10:05:01 -0500, Tom Lane wrote:
 Merlin Moncure mmonc...@gmail.com writes:
  On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  What are the autovac processes doing (according to pg_stat_activity)?
 
  pid,running,waiting,query
  7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class

It'd be interesting to know whether that vacuum gets very frequent
semaphore wakeups. Could you strace it for a second or three?

How did this perform  9.4?  Can you guess how many times these dynamic
statements are planned? How many different relations are accessed in the
dynamically planned queries?

 Hah, I suspected as much.  Is that the one that's stuck in
 LockBufferForCleanup, or the other one that's got a similar backtrace
 to all the user processes?

Do you have a theory? Right now it primarily looks like contention on a
single buffer due to the high number of dynamic statements, possibly
made worse by the signalling between normal pinners and vacuum waiting
for cleanup.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On 2015-01-14 10:13:32 -0500, Tom Lane wrote:
 Merlin Moncure mmonc...@gmail.com writes:
  Yes, it is pg_class is coming from  LockBufferForCleanup ().  As you
  can see above, it has a shorter runtime.   So it was killed off once
  about a half hour ago which did not free up the logjam.  However, AV
  spawned it again and now it does not respond to cancel.
 
 Interesting.  That seems like there might be two separate issues at
 play.  It's plausible that LockBufferForCleanup might be interfering
 with other attempts to scan the index, but then why wouldn't killing
 the AV have unstuck things?

LockBufferForCleanup() unfortunately isn't interruptible. I've every now
and then seen vacuums being stuck for a long while, trying to acquire
cleanup locks - IIRC I complained about that on list even.  So autovac
will only be cancelled when going to the next page.  And even if the
page ever gets a zero (well, one) refcount, by the time autovac is woken
up via the semaphore, it'll often end up being used again.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

Andres Freund and...@2ndquadrant.com writes:
 On 2015-01-14 10:05:01 -0500, Tom Lane wrote:
 Hah, I suspected as much.  Is that the one that's stuck in
 LockBufferForCleanup, or the other one that's got a similar backtrace
 to all the user processes?

 Do you have a theory? Right now it primarily looks like contention on a
 single buffer due to the high number of dynamic statements, possibly
 made worse by the signalling between normal pinners and vacuum waiting
 for cleanup.

The readers shouldn't be contending with each other.  It's certainly
plausible enough that there's a steady stream of readers that would
prevent LockBufferForCleanup from getting in, but that should not
result in any readers getting blocked.  So something's gotten broken
in that area.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 9:11 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-01-14 10:05:01 -0500, Tom Lane wrote:
 Merlin Moncure mmonc...@gmail.com writes:
  On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  What are the autovac processes doing (according to pg_stat_activity)?

  pid,running,waiting,query
  7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class

 It'd be interesting to know whether that vacuum gets very frequent
 semaphore wakeups. Could you strace it for a second or three?

for 30 seconds+ it just looks like this:
mmoncure@mernix2 ~ $ sudo strace -p 7105
Process 7105 attached
semop(5701638, {{4, -1, 0}}, 1

all of other processes are yielding out of the spinlock, for example:
select(0, NULL, NULL, NULL, {0, 1408})  = 0 (Timeout)

 How did this perform  9.4?
this is a new project. However, I can run it vs earlier version.

Can you guess how many times these dynamic
 statements are planned? How many different relations are accessed in the
 dynamically planned queries?

only once or twice, and only a couple of tables. This is an
operation that should only take few seconds (inserting a few 10s of
thousands of rows), that has blocked for many hours now.  Usually it
runs through taking a few seconds.  This is either a deadlock or a
deadlock emulating sequence of operations.

I'll try to pull commits that Peter suggested and see if that helps
(I'm getting ready to bring the database down).  I can send the code
off-list if you guys think it'd help.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

Merlin Moncure mmonc...@gmail.com writes:
 Yes, it is pg_class is coming from  LockBufferForCleanup ().  As you
 can see above, it has a shorter runtime.   So it was killed off once
 about a half hour ago which did not free up the logjam.  However, AV
 spawned it again and now it does not respond to cancel.

Interesting.  That seems like there might be two separate issues at
play.  It's plausible that LockBufferForCleanup might be interfering
with other attempts to scan the index, but then why wouldn't killing
the AV have unstuck things?

Anyway it's now seeming that Peter may have the right idea about
where to look.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 9:30 AM, Andres Freund and...@2ndquadrant.com wrote:
 If you gdb in, and type 'fin' a couple times, to wait till the function
 finishes, is there actually any progress? I'm wondering whether it's
 just many catalog accesses + contention, or some other
 problem. Alternatively set a breakpoint on ScanPgRelation() or so and
 see how often it's hit.

well, i restarted the database, forgetting my looper was running which
immediately spun up and it got stuck again with a similar profile
(lots of cpu in spinlock):

Samples: 3K of event 'cycles', Event count (approx.): 2695723228
 31.16%  postgres[.] s_lock
 22.32%  postgres[.] tas
 12.13%  postgres[.] tas
  5.93%  postgres[.] spin_delay
  5.69%  postgres[.] LWLockRelease
  3.75%  postgres[.] LWLockAcquireCommon
  3.61%  perf[.] 0x000526c4
  2.51%  postgres[.] FunctionCall2Coll
  1.48%  libc-2.17.so[.] 0x0016a132

 If you gdb in, and type 'fin' a couple times,
(gdb) fin
Run till exit from #0  0x7ff4c63f7a97 in semop () from
/lib/x86_64-linux-gnu/libc.so.6
0x006de073 in PGSemaphoreLock ()
(gdb) fin
Run till exit from #0  0x006de073 in PGSemaphoreLock ()

It returned once.  Second time, it didn't at least so far (minute or so).

 I can send the code off-list if you guys think it'd help.

 Might be interesting.

sent (off-list).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On 2015-01-14 09:47:19 -0600, Merlin Moncure wrote:
 On Wed, Jan 14, 2015 at 9:30 AM, Andres Freund and...@2ndquadrant.com wrote:
  If you gdb in, and type 'fin' a couple times, to wait till the function
  finishes, is there actually any progress? I'm wondering whether it's
  just many catalog accesses + contention, or some other
  problem. Alternatively set a breakpoint on ScanPgRelation() or so and
  see how often it's hit.
 
 well, i restarted the database, forgetting my looper was running which
 immediately spun up and it got stuck again with a similar profile
 (lots of cpu in spinlock):
 
 Samples: 3K of event 'cycles', Event count (approx.): 2695723228
  31.16%  postgres[.] s_lock
  22.32%  postgres[.] tas
  12.13%  postgres[.] tas
   5.93%  postgres[.] spin_delay
   5.69%  postgres[.] LWLockRelease
   3.75%  postgres[.] LWLockAcquireCommon
   3.61%  perf[.] 0x000526c4
   2.51%  postgres[.] FunctionCall2Coll
   1.48%  libc-2.17.so[.] 0x0016a132
 
  If you gdb in, and type 'fin' a couple times,
 (gdb) fin
 Run till exit from #0  0x7ff4c63f7a97 in semop () from
 /lib/x86_64-linux-gnu/libc.so.6
 0x006de073 in PGSemaphoreLock ()
 (gdb) fin
 Run till exit from #0  0x006de073 in PGSemaphoreLock ()
 
 It returned once.  Second time, it didn't at least so far (minute or so).

Hm, that's autovac though, not the normal user backends that actually do
stuff, right? If you could additionally check those, it'd be great.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 7:22 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I'll try to pull commits that Peter suggested and see if that helps
 (I'm getting ready to bring the database down).  I can send the code
 off-list if you guys think it'd help.

Thanks for the code!

I think it would be interesting to see if the bug reproduces without
the page split commit (40dae7ec537c5619fc93ad602c62f37be786d161) first
(which was applied afterwards). Then, without the page deletion commit
(efada2b8e920adfdf7418862e939925d2acd1b89). But without the page split
commit seems most interesting at the moment. It would be easier to
diagnose the issue if the bug still reproduces without that commit -
that way, we can be reasonably confident that there are no confounding
factors from the new page split code.

There were some bugfixes to those two during the 9.4 beta cycle, too.
It might be a bit tricky to generate a 9.4 that lacks one or both of
those commits.

Bugfix commits in reverse chronological order:

c73669c0e0168923e3f9e787beec980f55af2bd8 (deletion)

c91a9b5a285e20e54cf90f3660ce51ce3a5c2ef4 (incomplete split)

4fafc4ecd9e4d224d92c4a8549c5646860787a5d (deletion)

4a5d55ec2b711e13438a32d119a809a22ced410b (incomplete split)

77fe2b6d795f3f4ed282c9c980920e128a57624e (deletion)

7d98054f0dd115f57ad0ec1f424a66c13459013b (deletion)

954523cdfe229f1cb99a43a19e291a557ae2822d (incomplete split)

I think that's all of them (apart from the original commits, of course).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 7:24 PM, Peter Geoghegan p...@heroku.com wrote:
 On Tue, Jan 13, 2015 at 3:54 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Some more information what's happening:
 This is a ghetto logical replication engine that migrates data from
 sql sever to postgres, consolidating a sharded database into a single
 set of tables (of which there are only two). There is only one index
 on the destination table, and it's composite int,int in both cases.

 Does the logical replication engine perform dynamic DDL at all? Does
 it dynamically add columns to the table that everything is being
 consolidated to?

I let the test burn in all night.  It's still exactly where it
was...this is not slow performance, it's stuck.

It works like this:
Dynamically create staging table for inserts/change
Dynamically create staging table for deletes
Dynamically create master table if not present (in this last run, it
was was present)
Dynamically delete from master via delete table
Dynamically delete from master via insert/change table (based on int/int key)
Dynamically insert/select from inserts/change to permanent

There is also some log/tracking information going on in other tables.


Here's a backtrace:

#0  0x00750a97 in spin_delay ()
#1  0x00750b19 in s_lock ()
#2  0x00750844 in LWLockRelease ()
#3  0x0073 in LockBuffer ()
#4  0x004b2db4 in _bt_relandgetbuf ()
#5  0x004b7116 in _bt_moveright ()
#6  0x004b6e02 in _bt_search ()
#7  0x004b8006 in _bt_first ()
#8  0x004b5aad in btgettuple ()
#9  0x00896656 in FunctionCall2Coll ()
#10 0x004acd55 in index_getnext_tid ()
#11 0x004acfad in index_getnext ()
#12 0x004ac120 in systable_getnext ()
#13 0x00878ce4 in ScanPgRelation ()
#14 0x00879eb1 in RelationBuildDesc ()
#15 0x0087bb52 in RelationIdGetRelation ()
#16 0x00495115 in relation_open ()
#17 0x004952f9 in relation_openrv_extended ()
#18 0x0049555e in heap_openrv_extended ()
#19 0x0057ef0b in parserOpenTable ()
#20 0x0056776f in setTargetTable ()
#21 0x0053628f in transformDeleteStmt ()
#22 0x0053603c in transformStmt ()
#23 0x00535fdb in transformTopLevelStmt ()
#24 0x00535db8 in parse_analyze ()
#25 0x0075dd4f in pg_analyze_and_rewrite ()
#26 0x006472a5 in _SPI_execute_plan ()
#27 0x00644551 in SPI_execute ()
#28 0x7f89b83a8ce7 in exec_stmt_dynexecute () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#29 0x7f89b83a4c43 in exec_stmt () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#30 0x7f89b83a495f in exec_stmts () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#31 0x7f89b83a480a in exec_stmt_block () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#32 0x7f89b83a2802 in plpgsql_exec_function () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#33 0x7f89b839cd6a in plpgsql_call_handler () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#34 0x00618805 in ExecMakeTableFunctionResult ()
#35 0x006376ba in FunctionNext ()
#36 0x0061f868 in ExecScanFetch ()
#37 0x0061f8d7 in ExecScan ()
#38 0x00637a1d in ExecFunctionScan ()
#39 0x00614cfd in ExecProcNode ()
#40 0x00612c32 in ExecutePlan ()
#41 0x006110c3 in standard_ExecutorRun ()
---Type return to continue, or q return to quit---
#42 0x00610fc5 in ExecutorRun ()
#43 0x00647b02 in _SPI_pquery ()
#44 0x006475e8 in _SPI_execute_plan ()
#45 0x0064471d in SPI_execute_plan_with_paramlist ()
#46 0x7f89b83a85a5 in exec_stmt_execsql () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#47 0x7f89b83a4c28 in exec_stmt () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#48 0x7f89b83a495f in exec_stmts () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#49 0x7f89b83a4429 in exec_stmt_block () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#50 0x7f89b83a4a5d in exec_stmt () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#51 0x7f89b83a495f in exec_stmts () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#52 0x7f89b83a54a5 in exec_stmt_loop () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#53 0x7f89b83a4aff in exec_stmt () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#54 0x7f89b83a495f in exec_stmts () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#55 0x7f89b83a5241 in exec_stmt_if () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#56 0x7f89b83a4ac9 in exec_stmt () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#57 0x7f89b83a495f in exec_stmts () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#58 0x7f89b83a480a in exec_stmt_block () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#59 0x7f89b83a2802 in plpgsql_exec_function () from
/home/mmoncure/pg94/lib/postgresql/plpgsql.so
#60 0x7f89b839cd6a in

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Wed, Jan 14, 2015 at 8:03 AM, Merlin Moncure mmonc...@gmail.com wrote:
 Here's a backtrace:

 #0  0x00750a97 in spin_delay ()
 #1  0x00750b19 in s_lock ()
 #2  0x00750844 in LWLockRelease ()
 #3  0x0073 in LockBuffer ()
 #4  0x004b2db4 in _bt_relandgetbuf ()
 #5  0x004b7116 in _bt_moveright ()
 #6  0x004b6e02 in _bt_search ()
 #7  0x004b8006 in _bt_first ()
 #8  0x004b5aad in btgettuple ()
 #9  0x00896656 in FunctionCall2Coll ()
 #10 0x004acd55 in index_getnext_tid ()
 #11 0x004acfad in index_getnext ()

There were seven process with that backtrace exact backtrace (except
that randomly they are sleeping in the spinloop).   Something else
interesting: autovacuum has been running all night as well.  Unlike
the other process however, cpu utilization does not register on top.
I backtraced them as well.  They are not identical.  One of them looks
like this:

#0  0x7f89c1b21a97 in semop () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x006de073 in PGSemaphoreLock ()
#2  0x0074e398 in ProcWaitForSignal ()
#3  0x007302cb in LockBufferForCleanup ()
#4  0x004b699e in btvacuumpage ()
#5  0x004b6627 in btvacuumscan ()
#6  0x004b6389 in btbulkdelete ()
#7  0x008968d7 in FunctionCall4Coll ()
#8  0x004ad196 in index_bulk_delete ()
#9  0x0060b9ed in lazy_vacuum_index ()
#10 0x0060b019 in lazy_scan_heap ()
#11 0x00609836 in lazy_vacuum_rel ()
#12 0x00609383 in vacuum_rel ()
#13 0x00607e4a in vacuum ()
#14 0x006e2d40 in autovacuum_do_vac_analyze ()
#15 0x006e2127 in do_autovacuum ()
#16 0x006e0f19 in AutoVacWorkerMain ()
#17 0x006e0b06 in StartAutoVacWorker ()
#18 0x006f4fac in StartAutovacuumWorker ()
#19 0x006f4816 in sigusr1_handler ()
#20 signal handler called
#21 0x7f89c1b17dc3 in select () from /lib/x86_64-linux-gnu/libc.so.6
#22 0x006f0619 in ServerLoop ()
#23 0x006efdec in PostmasterMain ()
#24 0x006594ec in main ()


The other autovacuum process (0 cpu usage) looks like this:
#0  0x7f89c1b17dc3 in select () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x008d3ac4 in pg_usleep ()
#2  0x00750b69 in s_lock ()
#3  0x00750844 in LWLockRelease ()
#4  0x0073 in LockBuffer ()
#5  0x004b2db4 in _bt_relandgetbuf ()
#6  0x004b7116 in _bt_moveright ()
#7  0x004b6e02 in _bt_search ()
#8  0x004b8006 in _bt_first ()
#9  0x004b5aad in btgettuple ()
#10 0x00896656 in FunctionCall2Coll ()
#11 0x004acd55 in index_getnext_tid ()
#12 0x004acfad in index_getnext ()
#13 0x004ac120 in systable_getnext ()
#14 0x00878ce4 in ScanPgRelation ()
#15 0x0087bcc4 in RelationReloadIndexInfo ()
#16 0x0087c1f1 in RelationClearRelation ()
#17 0x0087ca37 in RelationCacheInvalidate ()
#18 0x00875fce in InvalidateSystemCaches ()
#19 0x00741d81 in ReceiveSharedInvalidMessages ()
#20 0x00876080 in AcceptInvalidationMessages ()
#21 0x00745ce3 in LockRelationOid ()
#22 0x0049510b in relation_open ()
#23 0x00495359 in heap_open ()
#24 0x00608694 in vac_update_relstats ()
#25 0x0060bb98 in lazy_cleanup_index ()
#26 0x0060b0b6 in lazy_scan_heap ()
#27 0x00609836 in lazy_vacuum_rel ()
#28 0x00609383 in vacuum_rel ()
#29 0x00607e4a in vacuum ()
#30 0x006e2d40 in autovacuum_do_vac_analyze ()
#31 0x006e2127 in do_autovacuum ()
#32 0x006e0f19 in AutoVacWorkerMain ()
#33 0x006e0b06 in StartAutoVacWorker ()
#34 0x006f4fac in StartAutovacuumWorker ()
#35 0x006f4816 in sigusr1_handler ()
#36 signal handler called
#37 0x7f89c1b17dc3 in select () from /lib/x86_64-linux-gnu/libc.so.6
#38 0x006f0619 in ServerLoop ()
#39 0x006efdec in PostmasterMain ()
#40 0x006594ec in main ()

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 3:54 PM, Andres Freund and...@2ndquadrant.com wrote:
 I don't remember seeing _bt_moveright()  or _bt_compare() figuring so
 prominently, where _bt_binsrch() is nowhere to be seen. I can't see a
 reference to _bt_binsrch() in either profile.

 Well, we do a _bt_moveright pretty early on, so that actually might be
 cache misses we're primarily seing.

Still, I see zero references to _bt_binsrch(). Not one. Even if there
was only one non-meta page (i.e. pre-first-root-split), and regardless
of whether this was a read or a write to the B-Tree, there'd still be
some call there as the B-Tree was initially scanned (before _bt_next()
calls, of which I also see none).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 3:54 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Some more information what's happening:
 This is a ghetto logical replication engine that migrates data from
 sql sever to postgres, consolidating a sharded database into a single
 set of tables (of which there are only two). There is only one index
 on the destination table, and it's composite int,int in both cases.

Does the logical replication engine perform dynamic DDL at all? Does
it dynamically add columns to the table that everything is being
consolidated to?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 4:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not convinced that Peter is barking up the right tree.  I'm noticing
 that the profiles seem rather skewed towards parser/planner work; so I
 suspect the contention is probably on access to system catalogs.  No
 idea exactly why though.

I see no int4cmp() calls at all, but plenty of _bt_compare(), and some
FunctionCall2Coll(). And yet, no _bt_binsrch(). I can see btoidcmp(),
so if I'm right then I guess it's a system catalog index.

I too would like to see a stack trace.

Trivia: pg_attribute_relid_attnam_index has deleted B-Tree pages after
a fresh initdb.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

Hi,

On 2015-01-13 16:29:51 -0600, Merlin Moncure wrote:
 On my workstation today (running vanilla 9.4.0) I was testing some new
 code that does aggressive parallel loading to a couple of tables. It
 ran ok several dozen times and froze up with no external trigger.
 There were at most 8 active backends that were stuck (the loader is
 threaded to a cap) -- each query typically resolves in a few seconds
 but they were hung for 30 minutes+.

Interesting.

 Had to do restart immediate as
 backends were not responding to cancel...but I snapped a 'perf top'
 before I did so.  The results were interesting so I'm posting them
 here.  So far I have not been able to reproduce...FYI

Can you compile postgres with -fno-omit-frame-pointer? Then, when this
happens the next time, you can take a perf record -g, which will tell us
which lock the contention is at.

  61.03%  postgres [.] s_lock
  13.56%  postgres [.] LWLockRelease
  10.11%  postgres [.] LWLockAcquire

That profile looks like it might end up being helped by the lwlock
and/or freelist changes in 9.5.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 4:33 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2015-01-13 16:29:51 -0600, Merlin Moncure wrote:
 On my workstation today (running vanilla 9.4.0) I was testing some new
 code that does aggressive parallel loading to a couple of tables. It
 ran ok several dozen times and froze up with no external trigger.
 There were at most 8 active backends that were stuck (the loader is
 threaded to a cap) -- each query typically resolves in a few seconds
 but they were hung for 30 minutes+.

 Interesting.

 Had to do restart immediate as
 backends were not responding to cancel...but I snapped a 'perf top'
 before I did so.  The results were interesting so I'm posting them
 here.  So far I have not been able to reproduce...FYI

 Can you compile postgres with -fno-omit-frame-pointer? Then, when this
 happens the next time, you can take a perf record -g, which will tell us
 which lock the contention is at.

will do, and I'll loop it for a while and see if I can get it to re-occur.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 2:29 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On my workstation today (running vanilla 9.4.0) I was testing some new
 code that does aggressive parallel loading to a couple of tables.

Could you give more details, please? For example, I'd like to see
representative data, or at least the table definitions involved. Do
you have any idea what index is involved with the _bt_compare() calls
above? I'm curious as to what its definition is.

You might also consider using my B-Tree verification tool on a
preserved data directory:
http://www.postgresql.org/message-id/cam3swzrk2yxkegj94_2wks_vbemjehc1ye23ecss01wrakz...@mail.gmail.com

I'm inclined to think that this is a livelock, and so the problem
isn't evident from the structure of the B-Tree, but it can't hurt to
check. The tool is bleeding edge, so don't use it in production.

Thanks
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On 2015-01-13 15:17:15 -0800, Peter Geoghegan wrote:
 I'm inclined to think that this is a livelock, and so the problem
 isn't evident from the structure of the B-Tree, but it can't hurt to
 check.

My guess is rather that it's contention on the freelist lock via
StrategyGetBuffer's. I've seen profiles like this due to exactly that
before - and it fits to parallel loading quite well.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 5:21 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-01-13 15:17:15 -0800, Peter Geoghegan wrote:
 I'm inclined to think that this is a livelock, and so the problem
 isn't evident from the structure of the B-Tree, but it can't hurt to
 check.

 My guess is rather that it's contention on the freelist lock via
 StrategyGetBuffer's. I've seen profiles like this due to exactly that
 before - and it fits to parallel loading quite well.

I think I've got it to pop again. s_lock is only showing 35%
(increasing very slowly if at all) but performance is mostly halted.
Frame pointer is compiled out.  perf report attached.

merlin

 35.82%  postgres[.] s_lock
 23.71%  postgres[.] tas
 14.01%  postgres[.] tas
  6.82%  postgres[.] spin_delay
  5.93%  postgres[.] LWLockRelease
  4.36%  postgres[.] LWLockAcquireCommon
  2.34%  postgres[.] FunctionCall2Coll
  1.79%  postgres[.] _bt_compare
  0.80%  postgres[.] LockBuffer
  0.52%  postgres[.] btoidcmp
  0.49%  postgres[.] ReleaseAndReadBuffer
  0.47%  postgres[.] _bt_moveright
  0.36%  postgres[.] _bt_checkpage
  0.36%  postgres[.] _bt_relandgetbuf
  0.20%  perf[.] 0x0004312a
  0.19%  postgres[.] LWLockAcquire
  0.13%  libv8.so[.] 0x001bbbe0
  0.11%  libc-2.17.so[.] 0x00151134
  0.09%  libwebkit.so[.] 0x0106ccb7
  0.05%  libgdk_pixbuf-2.0.so.0.2800.1   [.] 0x000139c7
  0.04%  Xorg[.] 0x000efb00
  0.03%  libglib-2.0.so.0.3800.1 [.] 0x000876a2
  0.03%  [kernel][k] __ticket_spin_lock
  0.02%  perf-12966.map  [.] 0x0625db944621
  0.02%  libskia.so  [.] 0x0021d15f
  0.02%  libcairo.so.2.11200.16  [.] 0x0002440b
  0.02%  libpthread-2.17.so  [.]
__pthread_mutex_unlock_usercnt
  0.02%  [kernel][k]
copy_user_enhanced_fast_string
  0.02%  radeon_drv.so   [.] 0x00045002
  0.02%  libpthread-2.17.so  [.] pthread_mutex_lock
  0.02%  [kernel][k] fget_light
  0.01%  [kernel][k] __schedule
  0.01%  libexa.so   [.] 0x6e1d
  0.01%  libgdk-x11-2.0.so.0.2400.20 [.] 0x0002f0b0
  0.01%  libvte.so.9.2800.2  [.] 0x00039028
  0.01%  [radeon][k] r100_mm_rreg
  0.01%  [kernel][k] xhci_irq
  0.01%  [kernel][k] native_write_msr_safe
  0.01%  [kernel][k]
update_cfs_rq_blocked_load
  0.01%  libglib-2.0.so.0.3800.1 [.] g_hash_table_lookup
  0.01%  libgobject-2.0.so.0.3800.1  [.]
g_type_check_instance_is_a
Press '?' for help on key bindings


perf.report.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On 2015-01-13 17:39:09 -0600, Merlin Moncure wrote:
 On Tue, Jan 13, 2015 at 5:21 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2015-01-13 15:17:15 -0800, Peter Geoghegan wrote:
  I'm inclined to think that this is a livelock, and so the problem
  isn't evident from the structure of the B-Tree, but it can't hurt to
  check.
 
  My guess is rather that it's contention on the freelist lock via
  StrategyGetBuffer's. I've seen profiles like this due to exactly that
  before - and it fits to parallel loading quite well.
 
 I think I've got it to pop again. s_lock is only showing 35%
 (increasing very slowly if at all) but performance is mostly halted.
 Frame pointer is compiled out.  perf report attached.

  35.82%  postgres[.] s_lock
  23.71%  postgres[.] tas
  14.01%  postgres[.] tas
   6.82%  postgres[.] spin_delay
   5.93%  postgres[.] LWLockRelease
   4.36%  postgres[.] LWLockAcquireCommon

Interesting. This profile looks quite different?

What kind of hardware is this on?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 3:21 PM, Andres Freund and...@2ndquadrant.com wrote:
 My guess is rather that it's contention on the freelist lock via
 StrategyGetBuffer's. I've seen profiles like this due to exactly that
 before - and it fits to parallel loading quite well.

I'm not saying you're wrong, but the breakdown of _bt_moveright()
relative to _bt_relandgetbuf() calls seems a bit fishy to me.

I don't remember seeing _bt_moveright()  or _bt_compare() figuring so
prominently, where _bt_binsrch() is nowhere to be seen. I can't see a
reference to _bt_binsrch() in either profile.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 5:49 PM, Peter Geoghegan p...@heroku.com wrote:
 On Tue, Jan 13, 2015 at 3:21 PM, Andres Freund and...@2ndquadrant.com wrote:
 My guess is rather that it's contention on the freelist lock via
 StrategyGetBuffer's. I've seen profiles like this due to exactly that
 before - and it fits to parallel loading quite well.

 I'm not saying you're wrong, but the breakdown of _bt_moveright()
 relative to _bt_relandgetbuf() calls seems a bit fishy to me.

 I don't remember seeing _bt_moveright()  or _bt_compare() figuring so
 prominently, where _bt_binsrch() is nowhere to be seen. I can't see a
 reference to _bt_binsrch() in either profile.

hm, this is hand compiled now, I bet the symbols are wrong.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On 2015-01-13 15:49:33 -0800, Peter Geoghegan wrote:
 On Tue, Jan 13, 2015 at 3:21 PM, Andres Freund and...@2ndquadrant.com wrote:
  My guess is rather that it's contention on the freelist lock via
  StrategyGetBuffer's. I've seen profiles like this due to exactly that
  before - and it fits to parallel loading quite well.
 
 I'm not saying you're wrong, but the breakdown of _bt_moveright()
 relative to _bt_relandgetbuf() calls seems a bit fishy to me.

The hierarchical profile definitely doesn't look like my guess was
right.

 I don't remember seeing _bt_moveright()  or _bt_compare() figuring so
 prominently, where _bt_binsrch() is nowhere to be seen. I can't see a
 reference to _bt_binsrch() in either profile.

Well, we do a _bt_moveright pretty early on, so that actually might be
cache misses we're primarily seing.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 3:50 PM, Merlin Moncure mmonc...@gmail.com wrote:
 I don't remember seeing _bt_moveright()  or _bt_compare() figuring so
 prominently, where _bt_binsrch() is nowhere to be seen. I can't see a
 reference to _bt_binsrch() in either profile.

 hm, this is hand compiled now, I bet the symbols are wrong.

In case it isn't clear, I think that the proximate cause here may well
be either one (or both) of commits
efada2b8e920adfdf7418862e939925d2acd1b89 and/or
40dae7ec537c5619fc93ad602c62f37be786d161. Probably the latter. I think
that the profile is roughly consistent with that, although I may well
be wrong.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 5:54 PM, Peter Geoghegan p...@heroku.com wrote:
 On Tue, Jan 13, 2015 at 3:50 PM, Merlin Moncure mmonc...@gmail.com wrote:
 I don't remember seeing _bt_moveright()  or _bt_compare() figuring so
 prominently, where _bt_binsrch() is nowhere to be seen. I can't see a
 reference to _bt_binsrch() in either profile.

 hm, this is hand compiled now, I bet the symbols are wrong.

 In case it isn't clear, I think that the proximate cause here may well
 be either one (or both) of commits
 efada2b8e920adfdf7418862e939925d2acd1b89 and/or
 40dae7ec537c5619fc93ad602c62f37be786d161. Probably the latter. I think
 that the profile is roughly consistent with that, although I may well
 be wrong.

I'm out of time for the day, but I'm fairly confident I can reproduce.
I'll see if I can reverse those commits tomorrow and retest (I'm on
development box).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

On Tue, Jan 13, 2015 at 5:42 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-01-13 17:39:09 -0600, Merlin Moncure wrote:
 On Tue, Jan 13, 2015 at 5:21 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2015-01-13 15:17:15 -0800, Peter Geoghegan wrote:
  I'm inclined to think that this is a livelock, and so the problem
  isn't evident from the structure of the B-Tree, but it can't hurt to
  check.
 
  My guess is rather that it's contention on the freelist lock via
  StrategyGetBuffer's. I've seen profiles like this due to exactly that
  before - and it fits to parallel loading quite well.

 I think I've got it to pop again. s_lock is only showing 35%
 (increasing very slowly if at all) but performance is mostly halted.
 Frame pointer is compiled out.  perf report attached.

  35.82%  postgres[.] s_lock
  23.71%  postgres[.] tas
  14.01%  postgres[.] tas
   6.82%  postgres[.] spin_delay
   5.93%  postgres[.] LWLockRelease
   4.36%  postgres[.] LWLockAcquireCommon

 Interesting. This profile looks quite different?

yep, it's very stable, and the database is approximately frozen.

 What kind of hardware is this on?

linux on crap workstation box, except I have good ssd (intel 3500).

 model name : Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz


top - 17:44:00 up 11 days,  6:25, 15 users,  load average: 6.94, 6.97, 5.73
Tasks: 259 total,   8 running, 250 sleeping,   0 stopped,   1 zombie
%Cpu0  : 95.7 us,  0.0 sy,  0.0 ni,  4.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 93.5 us,  2.2 sy,  0.0 ni,  4.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 95.7 us,  0.0 sy,  0.0 ni,  4.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  : 93.6 us,  2.1 sy,  0.0 ni,  4.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8131028 total,  7977608 used,   153420 free,21424 buffers
KiB Swap:  8340476 total,  1884900 used,  6455576 free,  5201188 cached

  PID USER  PR  NI  VIRT  RES  SHR S  %CPU %MEMTIME+  COMMAND
32492 mmoncure  20   0  175m  12m 9040 R  64.0  0.2   9:28.42 postgres
32367 mmoncure  20   0  175m  12m 9128 R  57.3  0.2   9:35.87 postgres
31792 mmoncure  20   0  175m  92m  88m R  52.9  1.2   9:57.50 postgres
32725 mmoncure  20   0  175m  13m 9852 R  52.9  0.2   9:08.50 postgres
31899 mmoncure  20   0  179m 104m  97m R  50.7  1.3   9:49.51 postgres
32368 mmoncure  20   0  175m  12m 9200 R  50.7  0.2   9:33.77 postgres
31282 mmoncure  20   0  184m 152m 140m R  46.3  1.9  10:26.29 postgres
  666 mmoncure  20   0 20640 1684 1116 R   2.2  0.0   0:00.01 top

Some more information what's happening:
This is a ghetto logical replication engine that migrates data from
sql sever to postgres, consolidating a sharded database into a single
set of tables (of which there are only two). There is only one index
on the destination table, and it's composite int,int in both cases.

A single source table is replicated in a single transaction, and it's
staged to a set of two temp tables, one for deletes, one for inserts.
I log each step in the database.  Here are the times:

cds2=# select cdsruntableid, finished - started, inserted from
cdsruntable where cdsrunid = 62 order by started;
 cdsruntableid │?column? │ inserted
───┼─┼──
   833 │ 00:00:33.24044  │11860
   834 │ 00:00:35.44981  │19099
   835 │ 00:02:01.725186 │   530923
   836 │ 00:01:29.578811 │   211737
   838 │ 00:01:17.393461 │64258
   837 │ 00:00:56.849106 │   227844
   839 │ 00:02:12.016504 │   630309
   840 │ 00:00:00.00111  │
   841 │ 00:01:09.058806 │   155309
   842 │ 00:01:54.723747 │   704422
   843 │ 00:01:14.965304 │
   844 │ 00:00:45.410953 │59934
   845 │ 00:00:34.302632 │14848
   846 │ 00:00:46.913832 │89893
   848 │ 00:05:22.155189 │  2410622
   847 │ 00:01:25.199141 │   268157
   849 │ 00:01:16.168352 │   117511
   850 │ 00:00:34.809862 │ 1175
   851 │ 00:01:12.565397 │67214
   852 │ 00:01:03.734776 │20129
   853 │ 00:00:41.038456 │62765
   854 │ 00:01:03.478917 │14967
   855 │ 00:00:33.88803  │ 6901
   856 │ 00:00:36.381886 │ 6670
   857 │ 00:00:36.626623 │ 8268
   858 │ 00:01:14.391584 │   363448
   859 │ 00:01:44.619005 │   533395
   860 │ 00:01:02.042255 │   212202
   861 │ 00:00:00.001065 │
   863 │ 00:00:58.265506 │   215234
   862 │ 00:02:45.572635 │   827941
   865 │ 00:01:28.049165 │   241020
   864 │ 00:01:51.465643 │   531012
   866 │ 00:01:20.273391 │   419357
   868 │ 00:01:25.479443 │   294262
   869 │ 00:00:46.400825 │46136
   870 │ 00:00:42.337286 │25934

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-13 Thread Tom Lane

Merlin Moncure mmonc...@gmail.com writes:
 On Tue, Jan 13, 2015 at 5:54 PM, Peter Geoghegan p...@heroku.com wrote:
 In case it isn't clear, I think that the proximate cause here may well
 be either one (or both) of commits
 efada2b8e920adfdf7418862e939925d2acd1b89 and/or
 40dae7ec537c5619fc93ad602c62f37be786d161. Probably the latter. I think
 that the profile is roughly consistent with that, although I may well
 be wrong.

 I'm out of time for the day, but I'm fairly confident I can reproduce.
 I'll see if I can reverse those commits tomorrow and retest (I'm on
 development box).

I'm not convinced that Peter is barking up the right tree.  I'm noticing
that the profiles seem rather skewed towards parser/planner work; so I
suspect the contention is probably on access to system catalogs.  No
idea exactly why though.

It would be good to collect a few stack traces from the hung backends,
rather than relying on perf statistics.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hung backends stuck in spinlock heavy endless loop