Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-10-07 Thread Andrew Sullivan
On Sat, Sep 18, 2004 at 06:06:05AM -0400, Jan Wieck wrote: On 9/17/2004 7:32 PM, Tom Lane wrote: over time. I'm wondering about DNS lookup results in particular. Except for one localhost, one /tmp/.s.PGSQL... and the 543x lookup during the postmaster start, all lookups are IP addresses

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-09-20 Thread Andrew Sullivan
On Fri, Sep 17, 2004 at 07:32:30PM -0400, Tom Lane wrote: involve consulting DNS? If so, try to correlate the crash probability with changes in your DNS zone contents ... No changes. The systems in question have no access to DNS. /etc/hosts only. A -- Andrew Sullivan | [EMAIL PROTECTED]

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-09-19 Thread Jan Wieck
On 9/17/2004 7:32 PM, Tom Lane wrote: Jan Wieck [EMAIL PROTECTED] writes: The problem comes and goes. So either I can cause a coredump just on the snap by running a shell script that does 100 psql -c "select version()" calls, or it is next to impossible to crash it at all. Hmm, that's really

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-09-17 Thread Jan Wieck
On 4/19/2004 1:18 PM, Jan Wieck wrote: Tom Lane wrote: Andrew Sullivan [EMAIL PROTECTED] writes: On Thu, Apr 15, 2004 at 07:52:59PM -0400, Tom Lane wrote: I can see from your trace that you are using the getaddrinfo code from libc, but where is configure finding a header that declares struct

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-09-17 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes: The problem comes and goes. So either I can cause a coredump just on the snap by running a shell script that does 100 psql -c "select version()" calls, or it is next to impossible to crash it at all. Hmm, that's really bizarre. It seems like the only
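The reproduction loop described above (100 psql calls in a row) might be sketched roughly as follows. This is not the script used in the thread: the function name count_failures is hypothetical, and it assumes a non-zero exit status from the client is a usable hint that a connection attempt went wrong. The core file and the server log remain the real evidence of a backend crash.

```python
import subprocess


def count_failures(command, runs=100):
    """Run `command` `runs` times and return (run index, exit status) for each failure.

    Hypothetical helper: `command` is an argument list such as
    ["psql", "-c", "select version()"].
    """
    failures = []
    for index in range(runs):
        # Discard output; only the exit status is of interest here.
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures.append((index, result.returncode))
    return failures
```

Assuming psql is on the PATH and connects with default settings, count_failures(["psql", "-c", "select version()"]) would run the 100 calls and list the ones that failed.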

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-06-18 Thread Zeugswetter Andreas SB SD
My only guess is that getaddrinfo in your libc has a bug somehow that is corrupting the stack (hence the improper backtrace), then crashing. It could be libc on AIX, I suppose, but it strikes me as sort of odd that nobody else ever sees this. Unless nobody else is using AIX 5.1, which

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-06-18 Thread Andrew Sullivan
On Thu, Jun 17, 2004 at 06:06:12PM -0400, Bruce Momjian wrote: When you say init directory, what do you mean? /bin? No. The place where the init scripts (which cause postgres to start) live. A -- Andrew Sullivan | [EMAIL PROTECTED] In the future this spectacle of the middle classes

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-06-18 Thread Christopher Browne
Quoth [EMAIL PROTECTED] (Bruce Momjian): Andrew Sullivan wrote: On Thu, Jun 17, 2004 at 01:12:10PM -0400, Bruce Momjian wrote: Well, the bad news is that this backtrace isn't very useful. No kidding. It's pretty frustrating. My only guess is that getaddrinfo in your libc has a bug

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-06-17 Thread Andrew Sullivan
On Mon, May 10, 2004 at 11:59:40AM -0400, Andrew Sullivan wrote: On the weekend, we ran a set of tests on the offending system to see if we could re-create it. We set up the triggering conditions just as they'd been when it happened, and alas, no segfault. So although this was pretty much

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-06-17 Thread Bruce Momjian
Andrew Sullivan wrote: On Mon, May 10, 2004 at 11:59:40AM -0400, Andrew Sullivan wrote: On the weekend, we ran a set of tests on the offending system to see if we could re-create it. We set up the triggering conditions just as they'd been when it happened, and alas, no segfault. So

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-06-17 Thread Andrew Sullivan
On Thu, Jun 17, 2004 at 01:12:10PM -0400, Bruce Momjian wrote: Well, the bad news is that this backtrace isn't very useful. No kidding. It's pretty frustrating. My only guess is that getaddrinfo in your libc has a bug somehow that is corrupting the stack (hence the improper backtrace),

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-06-17 Thread Bruce Momjian
Andrew Sullivan wrote: On Thu, Jun 17, 2004 at 01:12:10PM -0400, Bruce Momjian wrote: Well, the bad news is that this backtrace isn't very useful. No kidding. It's pretty frustrating. My only guess is that getaddrinfo in your libc has a bug somehow that is corrupting the stack

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-05-10 Thread Andrew Sullivan
On Wed, Apr 28, 2004 at 03:56:55PM -0400, Andrew Sullivan wrote: On Mon, Apr 26, 2004 at 03:19:21PM -0400, Bruce Momjian wrote: Has this been resolved? it elsewhere. I've been trying some alternative approaches to causing it today, and so far no luck. On the weekend, we ran a set of

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-04-28 Thread Andrew Sullivan
On Mon, Apr 26, 2004 at 03:19:21PM -0400, Bruce Momjian wrote: Has this been resolved? Not as far as I know. Unfortunately, the problem happened in an environment I Can't Play With, and I haven't been able to reproduce it elsewhere. I've been trying some alternative approaches to causing it

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-04-26 Thread Bruce Momjian
Has this been resolved? --- Andrew Sullivan wrote: On Mon, Apr 19, 2004 at 11:18:07AM -0400, Tom Lane wrote: What you'd need to do is determine which system headers are being #include'd by that config test, and then

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-04-19 Thread Tom Lane
Andrew Sullivan [EMAIL PROTECTED] writes: On Thu, Apr 15, 2004 at 07:52:59PM -0400, Tom Lane wrote: I can see from your trace that you are using the getaddrinfo code from libc, but where is configure finding a header that declares struct addrinfo? Hrm, I can't seem to tell. I see this in

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-04-19 Thread Alvaro Herrera
On Mon, Apr 19, 2004 at 11:18:07AM -0400, Tom Lane wrote: A shortcut is just to grep through /usr/include and its subdirectories for addrinfo. If you only find one definition, then you don't really need to worry too much. But if there's more than one you need to determine which is getting
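The header search suggested above could be sketched along these lines. It is only an illustration: the function name find_addrinfo_definitions is hypothetical, and the pattern assumes a definition looks like "struct addrinfo {" (possibly with the brace on the next line), so that mere uses of the type are not reported.

```python
import os
import re

# Matches the start of a definition ("struct addrinfo {"), not a use such as
# "struct addrinfo *res;".
ADDRINFO_DEF = re.compile(r"\bstruct\s+addrinfo\s*\{")


def find_addrinfo_definitions(include_root="/usr/include"):
    """Return the sorted paths of .h files under include_root that define struct addrinfo."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(include_root):
        for name in filenames:
            if not name.endswith(".h"):
                continue
            path = os.path.join(dirpath, name)
            try:
                # latin-1 decodes any byte sequence, so odd characters in
                # vendor headers do not abort the scan.
                with open(path, encoding="latin-1") as handle:
                    text = handle.read()
            except OSError:
                continue  # unreadable file: skip it
            if ADDRINFO_DEF.search(text):
                hits.append(path)
    return hits if not hits else sorted(hits)
```

If the returned list has more than one entry, the next step is the one described in the thread: work out which of those headers the configure test actually includes.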

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-04-19 Thread Jan Wieck
Tom Lane wrote: Andrew Sullivan [EMAIL PROTECTED] writes: On Thu, Apr 15, 2004 at 07:52:59PM -0400, Tom Lane wrote: I can see from your trace that you are using the getaddrinfo code from libc, but where is configure finding a header that declares struct addrinfo? Hrm, I can't seem to tell. I

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-04-19 Thread Andrew Sullivan
On Mon, Apr 19, 2004 at 11:18:07AM -0400, Tom Lane wrote: What you'd need to do is determine which system headers are being #include'd by that config test, and then look through them to find struct addrinfo. Well, I have this in /usr/include/netdb.h: struct addrinfo { int

[HACKERS] signal 11 on AIX: 7.4.2

2004-04-15 Thread Andrew Sullivan
We've had a backend crash with sig 11 during connection. My guess is there's something up with (maybe) the IPv6 support on AIX. I seem to recall something similar recently, but I can't find the post in the archives. Suggestions? oxrslive=# SELECT version();

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-04-15 Thread Andrew Sullivan
On Thu, Apr 15, 2004 at 01:07:33PM -0400, Andrew Sullivan wrote: We've had a backend crash with sig 11 during connection. By the way, I failed to mention, but sig 11 is segfault on AIX. A -- Andrew Sullivan | [EMAIL PROTECTED]

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-04-15 Thread Tom Lane
Andrew Sullivan [EMAIL PROTECTED] writes: We've had a backend crash with sig 11 during connection. My guess is there's something up with (maybe) the IPv6 support on AIX.
(gdb) bt
#0 0xd01d7778 in memmove () from /usr/lib/libc.a(shr.o)
#1 0xd0326e1c in getaddrinfo2 () from