A NOTE has been added to this issue. ====================================================================== http://www.dbmail.org/mantis/view.php?id=879 ====================================================================== Reported By: jpecar Assigned To: ====================================================================== Project: DBMail Issue ID: 879 Category: Database layer Reproducibility: always Severity: major Priority: normal Status: new target: ====================================================================== Date Submitted: 31-Jan-11 16:21 CET Last Modified: 01-Feb-11 14:11 CET ====================================================================== Summary: login hangs if DB unavailable Description: I'm setting up a HA environment of dbmail instances over replicated mysql. I want to monitor health by connecting to pop port and issuing login statements, fully expecting some kind of error msg if db has gone away. However, I see that login hangs indefinitely after pass is given, without any feedback. ======================================================================
---------------------------------------------------------------------- (0003156) paul (administrator) - 31-Jan-11 16:33 http://www.dbmail.org/mantis/view.php?id=879#c3156 ---------------------------------------------------------------------- I'm not seeing this behaviour on git-master. I'm seeing immediate network disconnects. Yes, providing an informative error would be nicer, but terminating the connection at least will tell you something is amiss. ---------------------------------------------------------------------- (0003158) jpecar (reporter) - 01-Feb-11 12:04 http://www.dbmail.org/mantis/view.php?id=879#c3158 ---------------------------------------------------------------------- Just compiled fresh code from git and it behaves the same here. Here's the strace output: .... [pid 3306] read(14, "user test2\r\n", 4095) = 12 [pid 3306] read(14, 0x7fffdd61be40, 4095) = -1 EAGAIN (Resource temporarily unavailable) [pid 3306] write(14, "+OK Password required for test2\r\n", 33) = 33 [pid 3306] epoll_ctl(8, EPOLL_CTL_MOD, 14, {EPOLLIN|EPOLLOUT, {u32=14, u64=14}}) = 0 [pid 3306] epoll_wait(8, {{EPOLLOUT, {u32=14, u64=14}}}, 32, 53841) = 1 [pid 3306] clock_gettime(CLOCK_MONOTONIC, {1616277, 212805117}) = 0 [pid 3306] epoll_ctl(8, EPOLL_CTL_MOD, 14, {EPOLLIN, {u32=14, u64=14}}) = 0 [pid 3306] epoll_wait(8, {{EPOLLIN, {u32=14, u64=14}}}, 32, 53834) = 1 [pid 3306] clock_gettime(CLOCK_MONOTONIC, {1616279, 477993117}) = 0 [pid 3306] read(14, "pass test2\r\n", 4095) = 12 [pid 3306] read(14, 0x7fffdd61be40, 4095) = -1 EAGAIN (Resource temporarily unavailable) [pid 3306] poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) [pid 3306] write(3, "\1\0\0\0\16", 5) = 5 [pid 3306] read(3, <unfinished ...> [pid 3308] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) [pid 3308] futex(0x1a57c3d8, FUTEX_WAIT_PRIVATE, 2, NULL And lsof confirms that fd3 is indeed tcp connection to mysql. Telnet session to dbmail-pop3d returns nothing after pass is given, just sits there. One thing I need to verify - I have dbmail compiled against libmysqlclient15 from mysql 5.0, while I have mysq 5.1 on server with client version 16. ---------------------------------------------------------------------- (0003159) jpecar (reporter) - 01-Feb-11 13:19 http://www.dbmail.org/mantis/view.php?id=879#c3159 ---------------------------------------------------------------------- I've recompiled libzdb to link against mysqlclient 16, but I'm still observing the same behaviour: .... [pid 3306] 13:02:55 read(4, "user test2\r\n", 4095) = 12 [pid 3306] 13:02:55 read(4, 0x7fffdd61be40, 4095) = -1 EAGAIN (Resource temporarily unavailable) [pid 3306] 13:02:55 write(4, "+OK Password required for test2\r\n", 33) = 33 [pid 3306] 13:02:55 epoll_ctl(8, EPOLL_CTL_MOD, 4, {EPOLLIN|EPOLLOUT, {u32=4, u64=4}}) = 0 [pid 3306] 13:02:55 epoll_wait(8, {{EPOLLOUT, {u32=4, u64=4}}}, 32, 43147) = 1 [pid 3306] 13:02:55 clock_gettime(CLOCK_MONOTONIC, {1620237, 726346117}) = 0 [pid 3306] 13:02:55 epoll_ctl(8, EPOLL_CTL_MOD, 4, {EPOLLIN, {u32=4, u64=4}}) = 0 [pid 3306] 13:02:55 epoll_wait(8, {{EPOLLIN, {u32=4, u64=4}}}, 32, 43140) = 1 [pid 3306] 13:02:57 clock_gettime(CLOCK_MONOTONIC, {1620239, 367304117}) = 0 [pid 3306] 13:02:57 read(4, "pass test2\r\n", 4095) = 12 [pid 3306] 13:02:57 read(4, 0x7fffdd61be40, 4095) = -1 EAGAIN (Resource temporarily unavailable) [pid 3306] 13:02:57 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) [pid 3306] 13:02:57 write(3, "\1\0\0\0\16", 5) = 5 [pid 3306] 13:02:57 read(3, <unfinished ...> [pid 3308] 13:03:19 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) [pid 3308] 13:03:19 futex(0x1a57c3d8, FUTEX_WAIT_PRIVATE, 2, NULL waiting ... and then I unblock mysql in iptables: <unfinished ...> [pid 3306] 13:12:27 <... read resumed> "\7\0\0\1\0\0\0\2\0\0\0", 16384) = 11 [pid 3306] 13:12:27 futex(0x1a57c3d8, FUTEX_WAKE_PRIVATE, 1 <unfinished ....> [pid 3308] 13:12:27 <... futex resumed> ) = 0 [pid 3306] 13:12:27 <... futex resumed> ) = 1 [pid 3308] 13:12:27 futex(0x1a57c3d8, FUTEX_WAKE_PRIVATE, 1 <unfinished ....> [pid 3306] 13:12:27 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0 <unfinished ...> [pid 3308] 13:12:27 <... futex resumed> ) = 0 [pid 3306] 13:12:27 <... poll resumed> ) = 0 (Timeout) [pid 3308] 13:12:27 clock_gettime(CLOCK_REALTIME, <unfinished ...> [pid 3306] 13:12:27 write(3, "0\0\0\0\26SELECT 1=1 FROM dbmail_usermap LIMIT 1 OFFSET 0", 52 <unfinished ...> [pid 3308] 13:12:27 <... clock_gettime resumed> {1296562347, 953367000}) = 0 [pid 3306] 13:12:27 <... write resumed> ) = 52 [pid 3308] 13:12:27 futex(0x1a57c3ac, FUTEX_WAIT_PRIVATE, 49, {59, 46633000} <unfinished ...> [pid 3306] 13:12:27 read(3, "\f\0\0\1\0\f\0\0\0\1\0\0\0\0\0\0\31\0\0\2\3def\0\0\0\0031=1\0\f?\0\1\0\0\0\10\201\0\0\0\0\5\0\0\3\376\0\0\2\0", 16384) = 54 [pid 3306] 13:12:27 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) [pid 3306] 13:12:27 write(3, "\n\0\0\0\27\f\0\0\0\1\1\0\0\0", 14) = 14 [pid 3306] 13:12:27 read(3, "\1\0\0\1\1\34\0\0\2\3def\0\0\0\0031=1\0031=1\f?\0\1\0\0\0\3\1\0\0\0\0\5\0\0\3\376\0\0b\0", 16384) = 46 [pid 3306] 13:12:27 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) [pid 3306] 13:12:27 write(3, "\5\0\0\0\31\f\0\0\0", 9) = 9 [pid 3306] 13:12:27 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) [pid 3306] 13:12:27 write(3, "\1\0\0\0\16", 5) = 5 [pid 3306] 13:12:27 read(3, "\7\0\0\1\0\0\0\2\0\0\0", 16384) = 11 [pid 3306] 13:12:27 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) [pid 3306] 13:12:27 write(3, "x\0\0\0\26SELECT login, sock_allow, sock_deny, userid FROM dbmail_usermap WHERE login in (?,'ANY') ORDER BY sock_allow, sock_deny", 124) = 124 .... and am of course greeted with "-ERR I'm leaving, you're too slow" :) Now I'd like to figure out what's going on here. Can you point me to the code I shoud start reading? Is it in dbmail or in libzdb? ---------------------------------------------------------------------- (0003160) paul (administrator) - 01-Feb-11 14:01 http://www.dbmail.org/mantis/view.php?id=879#c3160 ---------------------------------------------------------------------- Looks like imap is doing Connection_ping, but none of the other daemons is. Will fix. ---------------------------------------------------------------------- (0003161) paul (administrator) - 01-Feb-11 14:11 http://www.dbmail.org/mantis/view.php?id=879#c3161 ---------------------------------------------------------------------- I've pushed a partial fix for this. It doesn't yet try to reconnect when the database becomes reachable again. Feel free to improve this. Issue History Date Modified Username Field Change ====================================================================== 31-Jan-11 16:21 jpecar New Issue 31-Jan-11 16:33 paul Note Added: 0003156 01-Feb-11 12:04 jpecar Note Added: 0003158 01-Feb-11 13:19 jpecar Note Added: 0003159 01-Feb-11 14:01 paul Note Added: 0003160 01-Feb-11 14:11 paul Note Added: 0003161 ====================================================================== _______________________________________________ Dbmail-dev mailing list [email protected] http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
