On Tue, Dec 29, 2015 at 7:04 PM, Shay Rojansky <r...@roji.org> wrote:
>
>> Could you describe the workload a bit more? Is this rather concurrent? Do
>> you use optimized or debug builds? How long did you wait for the
>> backends to die? Is this all over localhost, external IP but local,
>> remotely?
>>
>
> The workload is a rather diverse set of integration tests executed with
> Npgsql. There's no concurrency whatsoever - tests are executed serially.
> The backends stay alive indefinitely, until they are killed. All this is
> over localhost with TCP. I can try other scenarios if that'll help.
>

What procedure do you use to kill backends? Normally, if we kill a
backend via Task Manager using "End Process", it is treated as a backend
crash: the server restarts and all other backends get disconnected.

>> > Note that the number of backends that stay stuck after the tests is
>> > constant (always 12).
>>
>> Can you increase the number of backends used in the test? And check
>> whether it's still 12?
>>
>
> Well, I ran the testsuite twice in parallel, and got... 23 backends stuck
> at the end.
>
>> How are your clients disconnecting? Possibly without properly
>> disconnecting?
>>
>
> That's possible, definitely in some of the test cases.
>
> What I can do is try to isolate things further by playing around with the
> tests and trying to see if a more minimal repro can be done - I'll try
> doing this later today or tomorrow. If anyone has any other specific tests
> or checks I should do let me know.
>

I think we should first isolate whether the hung backends are caused by
clients not disconnecting properly, or whether some other factor is
involved as well. To check, you can kill/disconnect sessions connected
via psql in the same way as you do for the Npgsql connections and see
whether you can reproduce the same behaviour.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com