https://bugzilla.wikimedia.org/show_bug.cgi?id=50485
Web browser: ---
Bug ID: 50485
Summary: morebots (adminbot) doesn't reliably detect disconnects
Product: Tools
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: [Other]
Assignee: wikibugs-l@lists.wikimedia.org
Reporter: o...@wikimedia.org
Classification: Unclassified
Mobile Platform: ---

On Jun 29 at 02:13 UTC, adminbot logged a LocalisationUpdate. Five hours later, at 07:18 UTC, it disconnected from IRC with a server-generated ping timeout quit message. On Jul 1, Tim noticed that it was absent from the channel and checked the process state. It appeared to still be in a connected state, calling select() at regular intervals.

lsof showed:

    adminlogb 20258 adminbot 4u IPv4 8395416 0t0 TCP wikitech-static:57198->HUBBARD.CLUB.CC.CMU.EDU:afs3-fileserver (ESTABLISHED)

strace showed:

    1372650153.070033 select(5, [4], [], [], {0, 51423}) = 0 (Timeout)
    1372650153.122075 gettimeofday({1372650153, 122173}, NULL) = 0
    1372650153.122379 select(5, [4], [], [], {0, 100000}) = 0 (Timeout)
    1372650153.222975 gettimeofday({1372650153, 223084}, NULL) = 0

According to <http://poe.perl.org/?POE_Cookbook/IRC_Bot_Reconnecting>, a good disconnection detection algorithm should periodically ping the server to check that the connection is still alive. morebots does not.

The IRC library that morebots uses, irclib, does have a 'set_keepalive' method on the ServerConnection object, which causes it to ping the server at regular intervals. morebots should use it. We should also add an explicit check that a ping reply has been received in a timely fashion, and recycle the connection otherwise.
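The explicit ping-reply check proposed above could look roughly like the sketch below. It is not morebots code and does not call irclib; the class name PingWatchdog, its method names, and the 60 s / 30 s defaults are all hypothetical, chosen only to illustrate the idea. The bot's main loop would call poll() after each select() timeout, send a PING when told to, and drop and re-establish the connection when told to reconnect.

```python
import time


class PingWatchdog:
    """Hypothetical helper: decides when to ping and when to give up.

    All names and default values here are assumptions for illustration,
    not part of morebots or irclib.
    """

    def __init__(self, ping_interval=60.0, pong_timeout=30.0,
                 clock=time.monotonic):
        self.ping_interval = ping_interval  # idle seconds before we ping
        self.pong_timeout = pong_timeout    # seconds to wait for any reply
        self.clock = clock
        self.last_activity = clock()        # last time the server sent data
        self.ping_sent_at = None            # time of the outstanding ping

    def on_server_data(self):
        """Call whenever any line arrives from the server (PONG included)."""
        self.last_activity = self.clock()
        self.ping_sent_at = None

    def poll(self):
        """Call once per main-loop pass.

        Returns "ping" when a PING should be sent, "reconnect" when the
        outstanding ping went unanswered for too long, otherwise None.
        """
        now = self.clock()
        if self.ping_sent_at is not None:
            if now - self.ping_sent_at >= self.pong_timeout:
                # Reset state so the replacement connection starts fresh.
                self.ping_sent_at = None
                self.last_activity = now
                return "reconnect"
            return None
        if now - self.last_activity >= self.ping_interval:
            self.ping_sent_at = now
            return "ping"
        return None
```

In the scenario from this report, the socket stayed ESTABLISHED while select() kept timing out, so nothing would ever call on_server_data(). With this kind of check, poll() would ask for a ping after the idle interval and then ask for a reconnect once the reply deadline passed, instead of the bot idling indefinitely. Treating any incoming line as proof of life, rather than only a matching PONG, is a simplification made here.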