https://bugs.exim.org/show_bug.cgi?id=2715
a.key <[email protected]> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |[email protected] --- Comment #6 from a.key <[email protected]> --- Hi, It was my work mate who created this ticket so I'm just gonna continue here. We haven't enabled the debugging yet as this may skew the results we're getting anyways but will be our next step. What we observe is the following: We send an email over smtp to local exim using local php's swiftmailer script. We're catching exim's children with the following: while ! pgrep -P 1307 -a exim -n;do sleep 0.1 ;done ; strace -s 300 -f -p $(pgrep -P 1307 exim -n) where 1307 is the parent exim process at the time. This catches the following: ... write(4, "2021-05-14 16:14:57.594 Received from [email protected] H=localhost [127.0.0.1]:47476 I=[127.0.0.1]:25 P=esmtp S=545 RT=0.045s [email protected] T=\"Wonderful Subject\"\n", 208) = 208 close(4) = 0 munmap(0x7fb85a18e000, 4096) = 0 select(9, [8], NULL, NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout) stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3661, ...}) = 0 stat("", 0x7ffed09ad840) = -1 ENOENT (No such file or directory) close(9) = 0 open("/var/log/exim/main.log", O_WRONLY|O_APPEND|O_CLOEXEC) = 4 fstat(4, {st_mode=S_IFREG|0640, st_size=1485802, ...}) = 0 write(4, "2021-05-14 16:14:57.595 1lhZWj-0001T1-Dr <= [email protected] H=localhost [127.0.0.1]:47476 I=[127.0.0.1]:25 P=esmtp S=545 RT=0.045s [email protected] T=\"Wonderful Subject\" from <[email protected]> for [email protected]\n", 265) = 265 rt_sigprocmask(SIG_BLOCK, [ALRM], [], 8) = 0 setitimer(ITIMER_REAL, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=43, tv_usec=625500}}, NULL) = 0 rt_sigsuspend(~[ALRM RTMIN RT_1], 8 ----- ~ 30s delay ----- ) = ? ERESTARTNOHAND (To be restarted if no handler) --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- rt_sigaction(SIGALRM, {sa_handler=0x55a88a19f6a0, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fb85763f630}, NULL, 8) = 0 rt_sigreturn({mask=[ALRM]}) = -1 EINTR (Interrupted system call) rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 close(3) = 0 munmap(0x7fb85a18f000, 4096) = 0 rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[TERM], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fb857298400}, {sa_handler=SIG_IGN, sa_mask=[TERM], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fb857298400}, 8) = 0 rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fb857298400}, {sa_handler=SIG_IGN, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fb857298400}, 8) = 0 write(7, "250 OK id=1lhZWj-0001T1-Dr\r\n", 28) = 28 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb85a170b50) = 5908 rt_sigaction(SIGTERM, {sa_handler=0x55a88a1ea340, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fb85763f630}, NULL, 8) = 0 setsockopt(8, SOL_TCP, TCP_QUICKACK, [0], 4) = 0 rt_sigaction(SIGALRM, {sa_handler=0x55a88a1ea330, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fb85763f630}, NULL, 8) = 0 alarm(300) = 0 read(8, strace: Process 5908 attached <unfinished ...> I cut out the unnecessary bits but basically... After the final . (dot) is send to finish writing the body, exim does a little bit processing (AV+spamassassin) scanning and eventually get to the point where it sleeps for about 30 or more seconds. So as seen above it sends a log message and stops at: rt_sigsuspend(~[ALRM RTMIN RT_1], 8 This lasts about 30s or more and eventually continues. But the 30s (increasing with the lifetime of the exim process) delay is causing the swiftmailer and the php itself to timeout and error out. I've managed to find similar bug reports: https://www.mail-archive.com/[email protected]/msg54137.html And more recently: https://lists.exim.org/lurker/message/20210330.140234.fe889005.en.html ...both mentioning the delay at "rt_sigsuspend(~[ALRM RTMIN RT_1], 8" We do realize the debug level log would be more helpful here but please understand that this issue can only be observed after some time (days) of starting of the exim service hence debugging is difficult as we'd need to leave it running in this mode for days. Simple restart of the service fixes this issue straight away but it's not something that we'd like to deploy as a permanent fix. -- You are receiving this mail because: You are on the CC list for the bug. -- ## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim details at http://www.exim.org/ ##
