Tony Devlin <tonydev...@gmail.com> wrote:
> pid 7825] write(14,
> "\0\373\0\0\6\0\0\0\0\0\21iB\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\v\0\0\0\3^Ca\201\0\0\0\0\0\0\376\377\377\377\377\377\377\377\22\0\0\0\376\377\377\377\377\377\377\377\r\0\0\0\376\377\377\377\377\377\377\377\376\377\377\377\377\377\377\377\0\0\0\0d\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\376\377\377\377\377\377\377\377\0\0\0\0\0\0\0\0\376\377\377\377\377\377\377\377\376\377\377\377\377\377\377\377\376\377\377\377\377\377\377\377\0\0\0\0\0\0\0\0\376\377\377\377\377\377\377\377\376\377\377\377\377\377\377\377\22select
> 1 from
> dual\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0",
> 251) = 251
> [pid 7825] read(14, <unfinished ...>
> [pid 7827] +++ killed by SIGKILL +++
Any update?

It looks like your DB driver is not using or respecting any timeout at all[1]. Not having a timeout there is bad. There should be a way to set one so you can at least tell the user the DB connection dropped, or maybe have your app disconnect and retry once.

A better-looking strace would be something like:

	write(fd, ...); => success
	(poll|select|ppoll) syscall ...
	read(fd, ...); /* only if (poll|select|ppoll) was successful[2] */

This goes for configuring all connections/services for any app.

[1] or if it's relying on the SO_RCVTIMEO socket option (rare), that's set way too high. Any timeout set for any external connection should be lower than the unicorn (last-resort) timeout feature.

[2] any read() syscall after (poll|select|ppoll) should be non-blocking, because (poll|select|ppoll) may wake up spuriously.
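
For illustration, here is a rough sketch of that write/poll/read pattern in Python. It is not what your DB driver does or how it should be configured; the function name read_with_timeout and its parameters are made up for the example, and it assumes a Unix-like system where select.poll is available. The socket is put in non-blocking mode so that a spurious wakeup from poll cannot leave the read stuck.

```python
import select
import socket
import time


def read_with_timeout(sock, nbytes, timeout_s):
    """Wait up to timeout_s seconds for data, then do a non-blocking read.

    Hypothetical helper for illustration only.
    Returns the bytes read (b"" on EOF), or None if the timeout expired.
    """
    # The read must never block on its own, because poll may wake up
    # spuriously and report readiness when no data is actually there.
    sock.setblocking(False)

    poller = select.poll()
    poller.register(sock, select.POLLIN)

    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None  # timed out: caller can report or disconnect+retry
        events = poller.poll(remaining * 1000)  # poll() takes milliseconds
        if not events:
            return None  # poll itself timed out
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # Spurious wakeup: nothing to read yet, wait for the rest
            # of the time budget instead of blocking.
            continue


if __name__ == "__main__":
    # A local socket pair stands in for the app <-> DB connection.
    app_side, db_side = socket.socketpair()
    app_side.sendall(b"select 1 from dual")
    # The "DB" never answers, so this gives up after 0.1s instead of hanging.
    print(read_with_timeout(app_side, 4096, 0.1))
```

Whatever value ends up being passed as the timeout should stay below the unicorn timeout, so the app gets a chance to handle the failure before the worker is killed.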