Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-24 Thread Stephen Frost
* Vesa-Matti J Kari (vmk...@cc.helsinki.fi) wrote: > Many thanks to all who contributed to the fix. Great! Thanks for the report and the testing. Stephen

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-24 Thread Vesa-Matti J Kari
Hello, On Mon, 23 Sep 2013, Stephen Frost wrote: > I've now committed a fix for this issue. I cloned the 9.4devel branch and linked my authmilter and a test program (based on Heikki's earlier design) against the libpq that comes with it. After hours of pretty extensive stress testing using 2,

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-23 Thread Stephen Frost
Vesa-Matti J Kari, I've now committed a fix for this issue. If you have opportunity to, it'd be great to pull down the latest git (for whichever supported branch you'd like) and give it a try. Otherwise, the fix should be out with our next round of point releases (which I expect will be

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote: > That patch looks wrong to me. Note that the if (conn->ssl) branch resets > conn->ssl to NULL. huh, it figures one would overlook the simplest things. Of course it's not locking up now- we never remove the hooks (as my original patch was doing :).

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > Actually, I think there's a pre-existing bug there in git master. If > the SSL_set_app_data or SSL_set_fd call in pqsecure_open_client > fails for some reason, it will call close_SSL() with conn->ssl > already set, and the mutex held. close_SS

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 15:03:31 -0400, Stephen Frost wrote: > * Andres Freund (and...@2ndquadrant.com) wrote: > > It seems slightly cleaner to just move the pqsecure_destroy(); to the > > end of that function, based on a boolean. But if you think otherwise, I > > won't protest... > > Hmm, agreed; I had ori

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote: > It seems slightly cleaner to just move the pqsecure_destroy(); to the > end of that function, based on a boolean. But if you think otherwise, I > won't protest... Hmm, agreed; I had originally been concerned that the SIGPIPE madness needed to be ar

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Heikki Linnakangas
On 13.09.2013 22:03, Stephen Frost wrote: * Andres Freund (and...@2ndquadrant.com) wrote: It seems slightly cleaner to just move the pqsecure_destroy(); to the end of that function, based on a boolean. But if you think otherwise, I won't protest... Hmm, agreed; I had originally been concerned

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > Umm, with that patch, pqsecure_destroy() is never called. The "if > (conn->ssl)" test that's now at the end of the close_SSL function is > never true, because conn->ssl is set to NULL earlier. Yeah, got ahead of myself, as Andres pointed out.
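
The ordering problem described here can be looked at in isolation. What follows is a minimal model, not libpq code: `fake_conn`, `fake_destroy`, `fake_close_ssl_buggy` and `fake_close_ssl_fixed` are hypothetical names standing in for the connection struct, pqsecure_destroy() and close_SSL(). It only illustrates why a test on the pointer placed after the pointer has been cleared can never fire, and how remembering the state in a boolean first (the approach Andres suggested) avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection; `ssl` plays the role of conn->ssl. */
struct fake_conn {
    void *ssl;
};

/* Counts how often the stand-in for pqsecure_destroy() ran. */
static int destroy_calls = 0;

static void fake_destroy(void)
{
    destroy_calls++;
}

/* Buggy shape: the pointer is cleared before the final test, so the
 * cleanup at the end can never run. */
static void fake_close_ssl_buggy(struct fake_conn *conn)
{
    assert(conn != NULL);

    if (conn->ssl != NULL) {
        /* ... shut down and free the SSL object here ... */
        conn->ssl = NULL;
    }
    if (conn->ssl != NULL)      /* always false at this point */
        fake_destroy();
}

/* Fixed shape: remember whether there was an SSL object, then decide
 * at the end of the function based on that boolean. */
static void fake_close_ssl_fixed(struct fake_conn *conn)
{
    bool had_ssl;

    assert(conn != NULL);
    had_ssl = (conn->ssl != NULL);

    if (conn->ssl != NULL) {
        /* ... shut down and free the SSL object here ... */
        conn->ssl = NULL;
    }
    if (had_ssl)
        fake_destroy();
}
```

In this model, calling the buggy variant on a connection with a non-NULL `ssl` leaves `destroy_calls` at zero, while the fixed variant increments it exactly once and does nothing on a second call.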

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Heikki Linnakangas
On 13.09.2013 22:26, Heikki Linnakangas wrote: I'm afraid the "move_locks.diff" patch you posted earlier is also broken; close_SSL() is called in error scenarios from pqsecure_open_client(), while already holding the mutex. So it will deadlock with itself if the connection cannot be established.
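
The self-deadlock Heikki describes follows a general pattern: an error path calls a routine that takes a lock the caller already holds. The sketch below is a toy model under stated assumptions, not the libpq code. `toy_lock`, `toy_close` and `toy_open_failing` are hypothetical names, and the lock reports a second acquisition instead of blocking, so the hazard can be observed without hanging. With a real non-recursive mutex (for example POSIX `PTHREAD_MUTEX_NORMAL`), the same call sequence would deadlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock. It reports a second acquisition rather than
 * blocking, which is where a real non-recursive mutex would hang. */
struct toy_lock {
    bool held;
};

/* Returns true on success, false if the lock is already held. */
static bool toy_lock_acquire(struct toy_lock *l)
{
    assert(l != NULL);
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l != NULL && l->held);
    l->held = false;
}

/* Stand-in for a close routine that takes the lock itself. */
static bool toy_close(struct toy_lock *l)
{
    if (!toy_lock_acquire(l))
        return false;           /* a real mutex would deadlock here */
    /* ... teardown work ... */
    toy_lock_release(l);
    return true;
}

/* Stand-in for an open routine whose error path calls the close routine
 * while still holding the lock. Returns false when that path would
 * self-deadlock. */
static bool toy_open_failing(struct toy_lock *l)
{
    bool ok;

    if (!toy_lock_acquire(l))
        return false;
    ok = toy_close(l);          /* error path, lock still held */
    toy_lock_release(l);
    return ok;
}
```

Called on its own, `toy_close()` succeeds; called from `toy_open_failing()`, it hits the already-held lock, which corresponds to the case where the connection cannot be established.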

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 13:59:54 -0400, Stephen Frost wrote: > Unfortunately, while I can still easily get the deadlock to happen when > the hooks are reset, the hooks don't appear to ever get called when > ssl_open_connections is set to zero. You have a good point about the > additional SSL calls after the

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 14:33:25 -0400, Stephen Frost wrote: > * Stephen Frost (sfr...@snowman.net) wrote: > > * Andres Freund (and...@2ndquadrant.com) wrote: > > > Hm. close_SSL() first does pqsecure_destroy() which will unset the > > > callbacks, and the count and then goes on to do X509_free() and > > > E

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote: > On 2013-09-13 13:15:34 -0400, Stephen Frost wrote: > > Good thought. Got sucked into a meeting but once I'm out I'll try having > > the lock/unlock routines abort if they're called while ssl_open_connections > > is zero, which should not be happenin
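
The debugging idea discussed here, making the lock/unlock routines abort when they are called while ssl_open_connections is zero, can be sketched roughly as follows. This is an illustration under assumptions, not the instrumentation Stephen actually used: `open_connections`, `violation_handler`, `record_violation` and `guarded_locking_cb` are hypothetical names, the callback takes no real lock, and its parameter list merely mirrors the shape of OpenSSL's pre-1.1.0 locking callback.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counter standing in for ssl_open_connections. */
static long open_connections = 0;

/* What to do when the callback fires with no open connections.
 * Defaults to abort(), so a core dump or debugger shows the caller. */
static void (*violation_handler)(void) = abort;

/* Alternative handler that only counts violations, handy for testing. */
static int violations = 0;

static void record_violation(void)
{
    violations++;
}

/* Counts the calls that arrived while connections were open. */
static long guarded_calls = 0;

/* Shaped like a (mode, lock number, file, line) locking callback; the
 * body is illustrative only and performs no locking. */
static void guarded_locking_cb(int mode, int n, const char *file, int line)
{
    (void) mode;
    (void) n;
    (void) file;
    (void) line;

    assert(open_connections >= 0);

    if (open_connections == 0) {
        /* Called after the last connection was closed: flag it. */
        violation_handler();
        return;
    }
    guarded_calls++;
}
```

Setting `violation_handler = record_violation` turns the hard abort into a counter, which makes it possible to check how often the callback is reached after the connection count has dropped to zero.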

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
* Stephen Frost (sfr...@snowman.net) wrote: > * Andres Freund (and...@2ndquadrant.com) wrote: > > Hm. close_SSL() first does pqsecure_destroy() which will unset the > > callbacks, and the count and then goes on to do X509_free() and > > ENGINE_finish(), ENGINE_free() if either is used. > > > > It'

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 13:15:34 -0400, Stephen Frost wrote: > Andres, > > On Friday, September 13, 2013, Andres Freund wrote: > > > > It'd be interesting to replace the original callbacks with one immediately > > doing an abort() or similar to see whether they maybe are called after > > they shouldn't be and

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
Andres, On Friday, September 13, 2013, Andres Freund wrote: > > It'd be interesting to replace the original callbacks with one immediately > doing an abort() or similar to see whether they maybe are called after > they shouldn't be and from where. > Good thought. Got sucked into a meeting but once

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Andres Freund
On 2013-09-13 12:40:11 -0400, Stephen Frost wrote: > Heikki, all, > > * Stephen Frost (sfr...@snowman.net) wrote: > > Very curious. Out of time right now to look into it, but will probably > > be back at it later tonight. > > Alright, I was back at this a bit today and decided to go with a hunch

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-13 Thread Stephen Frost
Heikki, all, * Stephen Frost (sfr...@snowman.net) wrote: > Very curious. Out of time right now to look into it, but will probably > be back at it later tonight. Alright, I was back at this a bit today and decided to go with a hunch- and it looks like I might have been right to try. Leaving the

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-10 Thread Stephen Frost
Heikki, * Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > Hmm. Are you sure you're getting an SSL connection? Run it with > something like this to make sure: sslmode=require doesn't help on Unix domain connections. :) Was able to get it to lock with both 9.2.4 and master, and with both ver

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-10 Thread Heikki Linnakangas
On 10.09.2013 18:10, Stephen Frost wrote: I've run your test program against both git master and 9.2.4 on a couple of Ubuntu 13.04 boxes and all I see are tons of these: 1: DEBUG: database connection established 1: DEBUG: about to call PQfinish() 1: DEBUG: database connection established 1: DEBU

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-10 Thread Stephen Frost
Heikki, * Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > Thanks! I tested with git master. I've run your test program against both git master and 9.2.4 on a couple of Ubuntu 13.04 boxes and all I see are tons of these: 1: DEBUG: database connection established 1: DEBUG: about to call PQfi

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-09 Thread Stephen Frost
Alvaro, * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > Heikki Linnakangas wrote: > > I'll dig into that, but right now it seems like an OpenSSL or > > libcrypto bug to me. Or something in the way we use them, although I > > can't see anything obviously wrong in the libpq code at a quick > >

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-09 Thread Alvaro Herrera
Heikki Linnakangas wrote: > I'll dig into that, but right now it seems like an OpenSSL or > libcrypto bug to me. Or something in the way we use them, although I > can't see anything obviously wrong in the libpq code at a quick > glance. Can you please try with ssl_renegotiation_limit=0? [ looks

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-09 Thread Heikki Linnakangas
On 09.09.2013 18:20, Stephen Frost wrote: Vesa-Matti, Heikki, * Heikki Linnakangas (hlinnakan...@vmware.com) wrote: On 09.09.2013 15:36, Vesa-Matti J Kari wrote: If I interpret this correctly, threads #2 and #3 are waiting for the same lock but they make no progress. A-ha, the deadlock happe

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-09 Thread Heikki Linnakangas
On 09.09.2013 15:36, Vesa-Matti J Kari wrote: It looks like a deadlock situation of some kind... (gdb) thread 2 [Switching to thread 2 (Thread 0x7fe62f7fe700 (LWP 27284))] #0 0x7fe64c0b589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0 (gdb) bt #0 0x7fe64c0b589c in

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-09 Thread Stephen Frost
Vesa-Matti, Heikki, * Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > On 09.09.2013 15:36, Vesa-Matti J Kari wrote: > >If I interpret this correctly, threads #2 and #3 are waiting for the same > >lock but they make no progress. > > A-ha, the deadlock happens while doing SSL stuff. I didn't

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-09 Thread Vesa-Matti J Kari
Hello, On Mon, 9 Sep 2013, Heikki Linnakangas wrote: > I managed to set that up and got it running. Many thanks for taking the time. > But it works fine for me, does not hang. Okay. Have you tried increasing the iterations for the smtp sender scripts? And could you please specify what is your

Re: [HACKERS] Strange hanging bug in a simple milter

2013-09-09 Thread Heikki Linnakangas
On 09.09.2013 09:34, Vesa-Matti J Kari wrote: Basically all that the authmilter now does is to connect to PostgreSQL in authmilt_connect() and close the connection in authmilt_close(). Based on the authmilter debug logging it seems to me that when the hanging occurs, the authmilter never complete

[HACKERS] Strange hanging bug in a simple milter

2013-09-08 Thread Vesa-Matti J Kari
Hello PostgreSQL gurus, (I have already posted a very similar message to comp.mail.sendmail newsgroup on August 22nd, but I haven't received any responses there. I have also tried pgsql-interfa...@postgresql.org but to no avail. Solving this problem requires some Sendmail/Postfix experience becau