Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-09 Thread Richard Frith-Macdonald
On 5 Dec 2014, at 22:41, Jim Nasby jim.na...@bluetreble.com wrote:
 
 
 Perhaps we should also officially recommend production servers be set up to 
 create core files. AFAIK the only downside is the time it would take to write 
 a core that's huge because of shared buffers, but perhaps there's some way to 
 avoid writing those? (That means the core won't help if the bug is due to 
 something in a buffer, but that seems unlikely enough that the tradeoff is 
 worth it...)

Good idea.  It seems the madvise() system call (with MADV_DONTDUMP) is exactly 
what's needed to avoid dumping shared buffers.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Josh Berkus
Hackers,

This is not a complete enough report for a diagnosis.  I'm posting it
here just in case someone else sees something like it, and having an
additional report will help figure out the underlying issue.

* 700GB database with around 5,000 writes per second
* 8 replicas handling around 10,000 read queries per second each
* replicas are slammed (40-70% utilization)
* replication produces lots of replication query cancels

In this scenario, a specific query against some of the less busy and
fairly small tables would produce a segfault (signal 11) randomly, once
every 1-4 days.  This query could have hundreds of successful runs for
every segfault.  It was not reproducible manually, and the segfaults
never happened on the master.  Nor did we ever see a segfault from any
other query, including queries against the tables which were generally
the source of the query cancels.

In case it's relevant, the query included use of regexp_split_to_array()
and ORDER BY random(), neither of which are generally used in the user's
other queries.

We made some changes which decreased query cancel (optimizing queries,
turning on hot_standby_feedback) and we haven't seen a segfault since
then.  As far as the user is concerned, this solves the problem, so I'm
never going to get a trace or a core dump file.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Josh Berkus
On 12/05/2014 12:54 PM, Josh Berkus wrote:
 Hackers,
 
 This is not a complete enough report for a diagnosis.  I'm posting it
 here just in case someone else sees something like it, and having an
 additional report will help figure out the underlying issue.
 
 * 700GB database with around 5,000 writes per second
 * 8 replicas handling around 10,000 read queries per second each
 * replicas are slammed (40-70% utilization)
 * replication produces lots of replication query cancels
 
 In this scenario, a specific query against some of the less busy and
 fairly small tables would produce a segfault (signal 11) randomly, once
 every 1-4 days.  This query could have hundreds of successful runs for
 every segfault.  It was not reproducible manually, and the segfaults
 never happened on the master.  Nor did we ever see a segfault from any
 other query, including queries against the tables which were generally
 the source of the query cancels.
 
 In case it's relevant, the query included use of regexp_split_to_array()
 and ORDER BY random(), neither of which are generally used in the user's
 other queries.
 
 We made some changes which decreased query cancel (optimizing queries,
 turning on hot_standby_feedback) and we haven't seen a segfault since
 then.  As far as the user is concerned, this solves the problem, so I'm
 never going to get a trace or a core dump file.

Forgot a major piece of evidence for why I think this is related to
query cancel:  in each case, the segfault was preceded by a
multi-backend query cancel 3ms to 30ms beforehand.  It is possible that
the backend running the query which segfaulted was the only backend
*not* cancelled concurrently due to the query conflict.
Contradicting this, there are other multi-backend query cancels in the
logs which did NOT produce a segfault.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Peter Geoghegan
On Fri, Dec 5, 2014 at 1:29 PM, Josh Berkus j...@agliodbs.com wrote:
 We made some changes which decreased query cancel (optimizing queries,
 turning on hot_standby_feedback) and we haven't seen a segfault since
 then.  As far as the user is concerned, this solves the problem, so I'm
 never going to get a trace or a core dump file.

 Forgot a major piece of evidence for why I think this is related to
 query cancel:  in each case, the segfault was preceded by a
 multi-backend query cancel 3ms to 30ms beforehand.  It is possible that
 the backend running the query which segfaulted was the only backend
 *not* cancelled concurrently due to the query conflict.
 Contradicting this, there are other multi-backend query cancels in the
 logs which did NOT produce a segfault.

I wonder if it would be useful to add additional instrumentation so
that even without a core dump, there was some cursory information
about the nature of a segfault.

Yes, doing something with a SIGSEGV handler is very scary, and there
are major portability concerns (e.g.
https://bugs.ruby-lang.org/issues/9654), but I believe it can be made
robust on Linux. For what it's worth, this open source project offers
that kind of functionality in the form of a library:
https://github.com/vmarkovtsev/DeathHandler

-- 
Peter Geoghegan




Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Jim Nasby

On 12/5/14, 4:11 PM, Peter Geoghegan wrote:

On Fri, Dec 5, 2014 at 1:29 PM, Josh Berkus j...@agliodbs.com wrote:

We made some changes which decreased query cancel (optimizing queries,
turning on hot_standby_feedback) and we haven't seen a segfault since
then.  As far as the user is concerned, this solves the problem, so I'm
never going to get a trace or a core dump file.


Forgot a major piece of evidence for why I think this is related to
query cancel:  in each case, the segfault was preceded by a
multi-backend query cancel 3ms to 30ms beforehand.  It is possible that
the backend running the query which segfaulted was the only backend
*not* cancelled concurrently due to the query conflict.
Contradicting this, there are other multi-backend query cancels in the
logs which did NOT produce a segfault.


I wonder if it would be useful to add additional instrumentation so
that even without a core dump, there was some cursory information
about the nature of a segfault.

Yes, doing something with a SIGSEGV handler is very scary, and there
are major portability concerns (e.g.
https://bugs.ruby-lang.org/issues/9654), but I believe it can be made
robust on Linux. For what it's worth, this open source project offers
that kind of functionality in the form of a library:
https://github.com/vmarkovtsev/DeathHandler


Perhaps we should also officially recommend production servers be set up to 
create core files. AFAIK the only downside is the time it would take to write a 
core that's huge because of shared buffers, but perhaps there's some way to 
avoid writing those? (That means the core won't help if the bug is due to 
something in a buffer, but that seems unlikely enough that the tradeoff is 
worth it...)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Peter Geoghegan
On Fri, Dec 5, 2014 at 2:41 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 Perhaps we should also officially recommend production servers be set up to
 create core files. AFAIK the only downside is the time it would take to
 write a core that's huge because of shared buffers

I don't think that's ever going to be practical.

-- 
Peter Geoghegan




Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Fri, Dec 5, 2014 at 2:41 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 Perhaps we should also officially recommend production servers be set up to
 create core files. AFAIK the only downside is the time it would take to
 write a core that's huge because of shared buffers

 I don't think that's ever going to be practical.

I'm fairly sure that on some distros (Red Hat, at least) there is distro
policy against having daemons produce core dumps by default, for multiple
reasons including possible disk space consumption and leakage of secure
information.  So even if we recommended this, the recommendation would be
overridden by some/many packagers.

There is much to be said though for trying to emit at least a minimal
stack trace into the postmaster log file.  I'm pretty sure glibc has a
function for that; dunno if it's going to be practical on other platforms.

regards, tom lane




Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Josh Berkus
On 12/05/2014 02:41 PM, Jim Nasby wrote:
 Perhaps we should also officially recommend production servers be set up
 to create core files. AFAIK the only downside is the time it would take
 to write a core that's huge because of shared buffers, but perhaps
 there's some way to avoid writing those? (That means the core won't help
 if the bug is due to something in a buffer, but that seems unlikely
 enough that the tradeoff is worth it...)

Not practical in a lot of cases.  For example, this user was unwilling
to enable core dumps on the production replicas because writing out the
16GB of shared buffers they had took over 10 minutes in a test.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Peter Geoghegan
On Fri, Dec 5, 2014 at 3:49 PM, Josh Berkus j...@agliodbs.com wrote:
 to enable core dumps on the production replicas because writing out the
 16GB of shared buffers they had took over 10 minutes in a test.

No one ever thinks it'll happen to them anyway - recommending enabling
core dumps seems like a waste of time since, as Tom mentioned, packagers
shouldn't be expected to get on board with that plan. I think a
zero-overhead backtrace feature from within a SIGSEGV handler (with
appropriate precautions around corrupt or exhausted call stacks) using
glibc is the right thing here.

Indeed, glibc does have infrastructure that can be used to get a
backtrace [1], which is probably what we'd end up using, but even
POSIX has infrastructure like sigaltstack(). It can be done.

[1] https://www.gnu.org/software/libc/manual/html_node/Backtraces.html
-- 
Peter Geoghegan




Re: [HACKERS] Elusive segfault with 9.3.5 query cancel

2014-12-05 Thread Jim Nasby

On 12/5/14, 5:49 PM, Josh Berkus wrote:

On 12/05/2014 02:41 PM, Jim Nasby wrote:

Perhaps we should also officially recommend production servers be set up
to create core files. AFAIK the only downside is the time it would take
to write a core that's huge because of shared buffers, but perhaps
there's some way to avoid writing those? (That means the core won't help
if the bug is due to something in a buffer, but that seems unlikely
enough that the tradeoff is worth it...)


Not practical in a lot of cases.  For example, this user was unwilling
to enable core dumps on the production replicas because writing out the
16GB of shared buffers they had took over 10 minutes in a test.


Which is why I wondered if there's a way to avoid writing out shared buffers...

But at least getting a stack trace would be a big start.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

