Re: backtrace_on_internal_error

2023-12-30 Thread Peter Eisentraut
On 19.12.23 17:29, Tom Lane wrote: IMO, we aren't really going to get a massive payoff from this with the current backtrace output; it's just not detailed enough. It's better than nothing certainly, but to really move the goalposts we'd need something approaching gdb's "bt full" output. I

Re: backtrace_on_internal_error

2023-12-20 Thread Jelte Fennema-Nio
On Wed, 20 Dec 2023 at 11:30, Andres Freund wrote: > Tom's patch imo doesn't really introduce anything really new - we already deal > with EOF that way in other places. And it's how the standard APIs deal with > the issue. I'd not design it this way on a green field, but given the current > state

Re: backtrace_on_internal_error

2023-12-20 Thread Andres Freund
Hi, On 2023-12-20 10:08:42 +0100, Jelte Fennema-Nio wrote: > On Sun, 10 Dec 2023 at 00:14, Tom Lane wrote: > > I'm not actually sure that the fe-secure.c part of v3-0002 is > > necessary, because it's guarding plain recv(2) which really shouldn't > > return -1 without setting errno. Still, it's

Re: backtrace_on_internal_error

2023-12-20 Thread Jelte Fennema-Nio
On Tue, 19 Dec 2023 at 17:12, Robert Haas wrote: > On Fri, Dec 8, 2023 at 1:34 PM Andres Freund wrote: > > Oh, very much agreed. But I suspect we won't quickly do the same for > > out-of-core extensions... > > I feel like this is a problem that will sort itself out just fine. The > rules about

Re: backtrace_on_internal_error

2023-12-20 Thread Jelte Fennema-Nio
On Sun, 10 Dec 2023 at 00:14, Tom Lane wrote: > I'm not actually sure that the fe-secure.c part of v3-0002 is > necessary, because it's guarding plain recv(2) which really shouldn't > return -1 without setting errno. Still, it's a pretty harmless > addition. v3-0002 seems have a very similar

Re: backtrace_on_internal_error

2023-12-19 Thread Robert Haas
On Tue, Dec 19, 2023 at 11:29 AM Tom Lane wrote: > IMO, we aren't really going to get a massive payoff from this with > the current backtrace output; it's just not detailed enough. It's > better than nothing certainly, but to really move the goalposts > we'd need something approaching gdb's "bt

Re: backtrace_on_internal_error

2023-12-19 Thread Tom Lane
Robert Haas writes: > The last change we made in this area that, at least for me, massively > improved debuggability was the change to log the current query string > when a backend crashes. That's such a huge help; I can't imagine going > back to the old way where you had basically no idea what

Re: backtrace_on_internal_error

2023-12-19 Thread Robert Haas
On Fri, Dec 8, 2023 at 1:34 PM Andres Freund wrote: > Oh, very much agreed. But I suspect we won't quickly do the same for > out-of-core extensions... I feel like this is a problem that will sort itself out just fine. The rules about using ereport() and elog() could probably be better documented

Re: backtrace_on_internal_error

2023-12-11 Thread Peter Eisentraut
On 08.12.23 19:14, Andres Freund wrote: FWIW, I did some analysis on aggregated logs on a larger number of machines, and it does look like that'd be a measurable increase in log volume. There are a few voluminous internal errors in core, but the bigger issue is extensions. They are typically

Re: backtrace_on_internal_error

2023-12-09 Thread Tom Lane
Andres Freund writes: > On 2023-12-09 12:41:30 -0500, Tom Lane wrote: >> On further reflection I realized that you're right so far as the SSL >> code path goes, because SSL_write() can involve physical reads as well >> as writes, so at least in principle it's possible that we'd see EOF >>

Re: backtrace_on_internal_error

2023-12-09 Thread Andres Freund
Hi, On 2023-12-09 12:41:30 -0500, Tom Lane wrote: > I wrote: > > Andres Freund writes: > >> I was wondering about that too. But if we do so, why not also do it for > >> writes? > > > Writes don't act that way, do they? EOF on a pipe gives you an error, > > not silently reporting that zero

Re: backtrace_on_internal_error

2023-12-09 Thread Tom Lane
I wrote: > Andres Freund writes: >> I was wondering about that too. But if we do so, why not also do it for >> writes? > Writes don't act that way, do they? EOF on a pipe gives you an error, > not silently reporting that zero bytes were written and leaving you > to retry indefinitely. On

Re: backtrace_on_internal_error

2023-12-09 Thread Andres Freund
Hi, On 2023-12-08 19:39:20 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2023-12-08 17:29:45 -0500, Tom Lane wrote: > >> Agreed. I think we want to do that after the initial handshake, > >> too, so maybe as attached. > > > I was wondering about that too. But if we do so, why not also do

Re: backtrace_on_internal_error

2023-12-08 Thread Tom Lane
Andres Freund writes: > On 2023-12-08 17:29:45 -0500, Tom Lane wrote: >> Agreed. I think we want to do that after the initial handshake, >> too, so maybe as attached. > I was wondering about that too. But if we do so, why not also do it for > writes? Writes don't act that way, do they? EOF on

Re: backtrace_on_internal_error

2023-12-08 Thread Andres Freund
Hi, On 2023-12-08 17:35:26 -0500, Tom Lane wrote: > Andres Freund writes: > > I thought it'd be nice to have a test for this, particularly because it's > > not > > clear that the behaviour is consistent across openssl versions. > > Perhaps, but ... > > > To deal with that, I changed the test

Re: backtrace_on_internal_error

2023-12-08 Thread Andres Freund
On 2023-12-08 17:29:45 -0500, Tom Lane wrote: > Andres Freund writes: > >> I think I figured it it out. Looks like we need to translate a closed > >> socket > >> (recvfrom() returning 0) to ECONNRESET or such. > > > Seems like we should just treat errno == 0 as a reason to emit the "EOF > >

Re: backtrace_on_internal_error

2023-12-08 Thread Tom Lane
Andres Freund writes: > I thought it'd be nice to have a test for this, particularly because it's not > clear that the behaviour is consistent across openssl versions. Perhaps, but ... > To deal with that, I changed the test to instead check if "not accept SSL > connection: Success" is not

Re: backtrace_on_internal_error

2023-12-08 Thread Tom Lane
Andres Freund writes: >> I think I figured it it out. Looks like we need to translate a closed socket >> (recvfrom() returning 0) to ECONNRESET or such. > Seems like we should just treat errno == 0 as a reason to emit the "EOF > detected" message? Agreed. I think we want to do that after the

Re: backtrace_on_internal_error

2023-12-08 Thread Andres Freund
Hi, On 2023-12-08 11:33:16 -0800, Andres Freund wrote: > On 2023-12-08 10:51:01 -0800, Andres Freund wrote: > > On 2023-12-08 13:46:07 -0500, Tom Lane wrote: > > > Andres Freund writes: > > > > On 2023-12-08 13:23:50 -0500, Tom Lane wrote: > > > >> Hmm, don't suppose you have a way to reproduce

Re: backtrace_on_internal_error

2023-12-08 Thread Andres Freund
Hi, On 2023-12-08 10:51:01 -0800, Andres Freund wrote: > On 2023-12-08 13:46:07 -0500, Tom Lane wrote: > > Andres Freund writes: > > > On 2023-12-08 13:23:50 -0500, Tom Lane wrote: > > >> Hmm, don't suppose you have a way to reproduce that? > > > > > After a bit of trying, yes. I put an abort()

Re: backtrace_on_internal_error

2023-12-08 Thread Andres Freund
Hi, On 2023-12-08 13:46:07 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2023-12-08 13:23:50 -0500, Tom Lane wrote: > >> Hmm, don't suppose you have a way to reproduce that? > > > After a bit of trying, yes. I put an abort() into pgtls_open_client(), > > after > > initialize_SSL().

Re: backtrace_on_internal_error

2023-12-08 Thread Tom Lane
Andres Freund writes: > On 2023-12-08 13:23:50 -0500, Tom Lane wrote: >> Hmm, don't suppose you have a way to reproduce that? > After a bit of trying, yes. I put an abort() into pgtls_open_client(), after > initialize_SSL(). Connecting does result in: > LOG: could not accept SSL connection:

Re: backtrace_on_internal_error

2023-12-08 Thread Andres Freund
Hi, On 2023-12-08 13:23:50 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2023-12-08 10:05:09 -0500, Tom Lane wrote: > >> ... there was already opinion upthread that this should be on by > >> default, which I agree with. You shouldn't be hitting cases like > >> this commonly (if so,

Re: backtrace_on_internal_error

2023-12-08 Thread Tom Lane
Andres Freund writes: > On 2023-12-08 10:05:09 -0500, Tom Lane wrote: >> ... there was already opinion upthread that this should be on by >> default, which I agree with. You shouldn't be hitting cases like >> this commonly (if so, they're bugs to fix or the errcode should be >> rethought), and

Re: backtrace_on_internal_error

2023-12-08 Thread Andres Freund
Hi, On 2023-12-08 10:05:09 -0500, Tom Lane wrote: > Peter Eisentraut writes: > > One possible question for discussion is whether the default for this > > should be off, on, or possibly something like on-in-assert-builds. > > (Personally, I'm happy to turn it on myself at run time, but everyone >

Re: backtrace_on_internal_error

2023-12-08 Thread Tom Lane
Peter Eisentraut writes: > Here is a patch to play with. Didn't read the patch yet, but ... > One possible question for discussion is whether the default for this > should be off, on, or possibly something like on-in-assert-builds. > (Personally, I'm happy to turn it on myself at run time,

Re: backtrace_on_internal_error

2023-12-08 Thread Peter Eisentraut
: 7db01fbcefbd95a7c97afa128fab59f4a09b3ff1 -- 2.43.0 From 06beed94067efaf5b93da5041048347fa22a4290 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Fri, 8 Dec 2023 14:04:53 +0100 Subject: [PATCH v1 2/2] Add GUC backtrace_on_internal_error --- doc/src/sgml/config.sgml| 27 +++ src/backend

Re: backtrace_on_internal_error

2023-12-07 Thread Tom Lane
Peter Eisentraut writes: > Something else to note: I wrote the above code to check the error code; > it doesn't check whether the original code write elog() or ereport(). > There are some internal errors that are written as ereport() now. > Others might be changed from time to time; until now

Re: backtrace_on_internal_error

2023-12-07 Thread Peter Eisentraut
race_functions(edata->funcname)) +   ((edata->funcname && + backtrace_functions && + matches_backtrace_functions(edata->funcname)) || +    (edata->sqlerrcode == ERRCODE_INTERNAL_ERROR && + backtrace_on_internal_error)))     set_back

Re: backtrace_on_internal_error

2023-12-05 Thread Tom Lane
Matthias van de Meent writes: > On Tue, 5 Dec 2023 at 19:30, Robert Haas wrote: >>> I think we should consider unconditionally emitting a backtrace when >>> an elog() is hit, instead of requiring a GUC. >> Perhaps this should be a GUC that defaults to LOG or ERROR. > I can't speak for Nathan,

Re: backtrace_on_internal_error

2023-12-05 Thread Nathan Bossart
On Tue, Dec 05, 2023 at 07:47:25PM +0100, Matthias van de Meent wrote: > On Tue, 5 Dec 2023 at 19:30, Robert Haas wrote: >> On Tue, Dec 5, 2023 at 1:28 PM Nathan Bossart >> wrote: >> > Perhaps this should be a GUC that defaults to LOG or ERROR. >> >> Why? Sorry, I should've explained why in my

Re: backtrace_on_internal_error

2023-12-05 Thread Robert Haas
On Tue, Dec 5, 2023 at 1:47 PM Matthias van de Meent wrote: > I can't speak for Nathan, but my reason would be that I'm not in the > habit to attach a debugger to my program to keep track of state > progression, but instead use elog() during patch development. I'm not > super stoked for getting

Re: backtrace_on_internal_error

2023-12-05 Thread Matthias van de Meent
On Tue, 5 Dec 2023 at 19:30, Robert Haas wrote: > > On Tue, Dec 5, 2023 at 1:28 PM Nathan Bossart > wrote: > > On Tue, Dec 05, 2023 at 01:16:22PM -0500, Robert Haas wrote: > > > I think we should consider unconditionally emitting a backtrace when > > > an elog() is hit, instead of requiring a

Re: backtrace_on_internal_error

2023-12-05 Thread Robert Haas
On Tue, Dec 5, 2023 at 1:28 PM Nathan Bossart wrote: > On Tue, Dec 05, 2023 at 01:16:22PM -0500, Robert Haas wrote: > > I think we should consider unconditionally emitting a backtrace when > > an elog() is hit, instead of requiring a GUC. Or at least any elog() > > that's not at a DEBUGn level.

Re: backtrace_on_internal_error

2023-12-05 Thread Nathan Bossart
On Tue, Dec 05, 2023 at 01:16:22PM -0500, Robert Haas wrote: > I think we should consider unconditionally emitting a backtrace when > an elog() is hit, instead of requiring a GUC. Or at least any elog() > that's not at a DEBUGn level. If the extra output annoys anybody, that > means they're

Re: backtrace_on_internal_error

2023-12-05 Thread Robert Haas
On Tue, Dec 5, 2023 at 12:40 PM Nathan Bossart wrote: > On Tue, Dec 05, 2023 at 11:55:05AM +0100, Peter Eisentraut wrote: > > Would others find this useful? > > Yes. I think I would use this pretty frequently. I think we should consider unconditionally emitting a backtrace when an elog() is

Re: backtrace_on_internal_error

2023-12-05 Thread Nathan Bossart
On Tue, Dec 05, 2023 at 11:55:05AM +0100, Peter Eisentraut wrote: > Would others find this useful? Yes. I think I would use this pretty frequently. > Any other settings or variants in this area > that should be considered while we're here? IMO it would be nice to have a way to turn on

backtrace_on_internal_error

2023-12-05 Thread Peter Eisentraut
| +(edata->sqlerrcode == ERRCODE_INTERNAL_ERROR && + backtrace_on_internal_error))) set_backtrace(edata, 2); /* where backtrace_on_internal_error would be a GUC variable. Would others find this useful? Any other settings or variants in this area that should be considered while we're here?