Re: [exim] Mail with thousands recipients takes exponential longer time
On 3/12/21 9:34 AM, Olaf Hopp (SCC) via Exim-users wrote: Hi, folks, thanks for your answers so far Just a few more input: There is no rate limiting, tar pitting or so . I created a special "accept" ACL just at the beginning of my acl_smtp_rcpt I created a special router at the beginning of my routers. I commented out most of my redirect routers. So most of the ACL and routing stuff in my config is skipped. Blame on me :( I had a typo in the sender adress in my acl_smtp_rcpt So if I use "accept senders = foo@example.com" correctly as first line in acl_smtp_rcpt my test runs like hell under 2 Minutes. So the delay must come "somewhere" from below within my receipient acls. So fixed for that special sender. Regards and a have nice weekend, Olaf -- Karlsruher Institut für Technologie (KIT) Steinbuch Centre for Computing (SCC) Dipl.-Geophys. Olaf Hopp Zirkel 2 Gebäude 20.21, Raum 316 76131 Karlsruhe Telefon: +49 721 608-48009 E-Mail: olaf.h...@kit.edu Web: www.scc.kit.edu Sitz der Körperschaft: Kaiserstraße 12, 76131 Karlsruhe KIT - Die Forschungsuniversität in der Helmholtz-Gemeinschaft smime.p7s Description: S/MIME Cryptographic Signature -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
On 12/03/2021 09:45, Cyborg via Exim-users wrote: Am 12.03.21 um 10:21 schrieb Jeremy Harris via Exim-users: I have a static linear timeframe from one recipient to another appearing in the log file, once the actual delivery from thunderbird ended. strace says, it's due to my SQL select i use. it would go faster if the SQL result would be cached, instead of repeating itself. Could you show us the ACL line? it was a simple one like this: "select 1 from eximconfig where '${quote_mysql:${domain:$item}}' REGEXP \Nconcat('^',domain,'$')\N and name='' and value='1' LIMIT 1 You didn't show the part where $item was derived (assuming that's what the ACL line really says). Your obfuscation, not giving the entire ACL line, hinders. That said, I would expect that query to be cachable. Your strace showed repeated actual SQL lookups... could you run a test with debug enabled and show the output? Just a couple of recipients would be enough, and you could mail me direct rather than the mailinglist if you prefer not to expose info to the whole world. -- Cheers, Jeremy -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
Am 12.03.21 um 10:21 schrieb Jeremy Harris via Exim-users: I have a static linear timeframe from one recipient to another appearing in the log file, once the actual delivery from thunderbird ended. strace says, it's due to my SQL select i use. it would go faster if the SQL result would be cached, instead of repeating itself. Could you show us the ACL line? it was a simple one like this: "select 1 from eximconfig where '${quote_mysql:${domain:$item}}' REGEXP \Nconcat('^',domain,'$')\N and name='' and value='1' LIMIT 1 As all those 3000 recipients where to the same domain ( test1...test3000@domainname ) it was defacto the same SQL query. I do not expect a sql cache in exim, as it would be a very very complex issue and is not really the bottleneck here. BTW: Test done on Name : exim Version : 4.94 Release : 1.fc32 Architecture: x86_64 best regards, Marius -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
On 12/03/2021 08:44, Cyborg via Exim-users wrote: Am 12.03.21 um 08:27 schrieb Andrew C Aitchison via Exim-users: Any ideas what's happening here ? The increasing delays remind me of the delays for failed logins. Is there some sort of authentication, or similar pam or SMB access, to check the existence of each recipient ? i made a test mail with 3000 recipients in CC. Exim is increasing it's CPU Usage the more recipients are used until 100% CPU Core usage is reached: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1378404 root 20 0 66084 55300 10632 R 98,7 1,1 8:59.99 /usr/sbin/exim -Mc 1lKd2y-005m5p-3V I have a static linear timeframe from one recipient to another appearing in the log file, once the actual delivery from thunderbird ended. strace says, it's due to my SQL select i use. it would go faster if the SQL result would be cached, instead of repeating itself. Could you show us the ACL line? -- Cheers, Jeremy -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
Am 12.03.21 um 08:27 schrieb Andrew C Aitchison via Exim-users: Any ideas what's happening here ? The increasing delays remind me of the delays for failed logins. Is there some sort of authentication, or similar pam or SMB access, to check the existence of each recipient ? i made a test mail with 3000 recipients in CC. Exim is increasing it's CPU Usage the more recipients are used until 100% CPU Core usage is reached: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1378404 root 20 0 66084 55300 10632 R 98,7 1,1 8:59.99 /usr/sbin/exim -Mc 1lKd2y-005m5p-3V I have a static linear timeframe from one recipient to another appearing in the log file, once the actual delivery from thunderbird ended. strace says, it's due to my SQL select i use. it would go faster if the SQL result would be cached, instead of repeating itself. Conclusion: a) the ( recipient ) listfunction in exim needs a redo, to improve performance. This is a very basic list optimization problem, you learn in data structures at the university. b) the setup in question has an additional problem, increasing the time exponential . because the core is reaching 100%, it's at it's max.. at that point, things slow down even more. c) time to make exim multi-threaded best regards, Marius -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
Hi, folks, thanks for your answers so far Just a few more input: There is no rate limiting, tar pitting or so . I created a special "accept" ACL just at the beginning of my acl_smtp_rcpt I created a special router at the beginning of my routers. I commented out most of my redirect routers. So most of the ACL and routing stuff in my config is skipped. In the strace log I see the handling of the remaining stuff of my exim config. But this handling is done after the observed delay between the RCPT line and the socket line RAM is fine, one of my four cores is at 100% during my tests, it's a Debian 10 buster OS with an exim 4.94 from the Debian 11-backport I will try your proposals and keep you informed. Regards, Olaf On 3/11/21 9:59 PM, Olaf Hopp (SCC) via Exim-users wrote: Dear collegues, my administration writes one(!) email with more than 3000 receipients in the envelope. (Please, no discussion about the sense of this. Period.) Receiving that mail takes about 2 two hours. Delivering that mail is done very quick, as usual. I started debugging this and I recognized that after every "RCPT TO" it took a sucessive increasing time until the "250 Accepted" was answered. At the beginning this was a 10th of a second. After 1000 recipients there was a delay of ~one second after the 2000th recipients this delay increased smoothly to 2 secends ofter each rcpt-to I threw away all my bells and whistles in the exim config, but no success. So I started "strace" to see whats happening: At the beginning I see 18:35:36.048265 read(7, "RCPT TO:\r\n", 8191) = 45 18:35:36.063187 alarm(0) = 300 18:35:36.063233 rt_sigaction(SIGALRM, {sa_handler=0x55bc6f392180, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f8eff4d9730}, NULL, 8) = 0 18:35:36.140936 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 9 see the nearly zero difference between the RCPT line and the socket line After ~1000 recipients I have: 18:40:51.670874 read(7, "RCPT TO:\r\n", 8191) = 43 18:40:51.685450 alarm(0) = 300 18:40:51.685538 rt_sigaction(SIGALRM, {sa_handler=0x55bc6f392180, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f8eff4d9730}, NULL, 8) = 0 18:40:52.277971 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 9 so half a second between the rcpt line and the socket line After ~2000 recipients: 19:06:28.229652 read(7, "RCPT TO:\r\n", 8191) = 29 19:06:28.244041 alarm(0) = 300 19:06:28.244253 rt_sigaction(SIGALRM, {sa_handler=0x55bc6f392180, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f8eff4d9730}, NULL, 8) = 0 19:06:29.776217 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 9 about 1.5 seconds between rcpt and the socket line There are no dropped lines between the RCPT line until the socket line in the strace output Any ideas what's happening here ? (testing was done with "swaks" talking directly to the exim) Regards, Olaf -- Karlsruher Institut für Technologie (KIT) Steinbuch Centre for Computing (SCC) Dipl.-Geophys. Olaf Hopp Zirkel 2 Gebäude 20.21, Raum 316 76131 Karlsruhe Telefon: +49 721 608-48009 E-Mail: olaf.h...@kit.edu Web: www.scc.kit.edu Sitz der Körperschaft: Kaiserstraße 12, 76131 Karlsruhe KIT - Die Forschungsuniversität in der Helmholtz-Gemeinschaft smime.p7s Description: S/MIME Cryptographic Signature -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
On Thu, 11 Mar 2021, Olaf Hopp (SCC) via Exim-users wrote: my administration writes one(!) email with more than 3000 receipients in the envelope. (Please, no discussion about the sense of this. Period.) Receiving that mail takes about 2 two hours. Delivering that mail is done very quick, as usual. I started debugging this and I recognized that after every "RCPT TO" it took a sucessive increasing time until the "250 Accepted" was answered. At the beginning this was a 10th of a second. After 1000 recipients there was a delay of ~one second after the 2000th recipients this delay increased smoothly to 2 secends ofter each rcpt-to Any ideas what's happening here ? The increasing delays remind me of the delays for failed logins. Is there some sort of authentication, or similar pam or SMB access, to check the existence of each recipient ? -- Andrew C. Aitchison Kendal, UK and...@aitchison.me.uk -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
Hello. On Thu, Mar 11, 2021 at 09:59:43PM +0100, Olaf Hopp (SCC) via Exim-users wrote: > After ~2000 recipients: >19:06:28.229652 read(7, "RCPT TO:\r\n", 8191) = 29 >19:06:28.244041 alarm(0)= 300 >19:06:28.244253 rt_sigaction(SIGALRM, {sa_handler=0x55bc6f392180, > sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f8eff4d9730}, NULL, 8) = 0 >19:06:29.776217 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, > IPPROTO_IP) = 9 > about 1.5 seconds between rcpt and the socket line > > There are no dropped lines between the RCPT line until the socket line in the > strace output > > Any ideas what's happening here ? There may be some cpu-intensive processing inside RCPT ACL. Try "ltrace -S -T -tt ..." to display library calls. But running exim with debug options would be more informative. -- Eugene Berdnikov -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
On 11/03/2021 20:59, Olaf Hopp (SCC) via Exim-users wrote: I started debugging this and I recognized that after every "RCPT TO" it took a sucessive increasing time until the "250 Accepted" was answered. At the beginning this was a 10th of a second. After 1000 recipients there was a delay of ~one second after the 2000th recipients this delay increased smoothly to 2 secends ofter each rcpt-to After ~2000 recipients: 19:06:28.229652 read(7, "RCPT TO:\r\n", 8191) = 29 19:06:28.244041 alarm(0) = 300 19:06:28.244253 rt_sigaction(SIGALRM, {sa_handler=0x55bc6f392180, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f8eff4d9730}, NULL, 8) = 0 19:06:29.776217 socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 9 about 1.5 seconds between rcpt and the socket line In a test here I'm seeing under a mmillisecond between RCPT commands processed by the server, even after 2000 of them: 21:51:59.681 12433 end of inline ACL: ACCEPT 21:51:59.681 12433 SMTP>> 250 Accepted 21:51:59.681 12433 SMTP<< rcpt to: 21:51:59.681 12433 processing "accept" (/home/jgh/git/exim/test/test-config 29) 21:51:59.681 12433 accept: condition test succeeded in inline ACL 21:51:59.681 12433 end of inline ACL: ACCEPT 21:51:59.681 12433 SMTP>> 250 Accepted 21:51:59.681 12433 SMTP<< data 21:51:59.681 12433 SMTP>> 354 Enter message, ending with "." on a line by itself 21:51:59.681 12433 search_tidyup called 21:51:59.681 12433 DMARC: no dmarc_tld_file 21:51:59.681 12433 >>Headers received: 21:51:59.682 12433 This was running the server with -d+all and with a log_selector = +millisec. Are you able to do that for a test? The other useful item you might be able to infer from your strace is: what was that socket used for? -- Cheers, Jeremy -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/
Re: [exim] Mail with thousands recipients takes exponential longer time
On 11 Mar 2021, at 15:59, Olaf Hopp (SCC) via Exim-users wrote: Dear collegues, my administration writes one(!) email with more than 3000 receipients in the envelope. (Please, no discussion about the sense of this. Period.) Receiving that mail takes about 2 two hours. Delivering that mail is done very quick, as usual. I started debugging this and I recognized that after every "RCPT TO" it took a sucessive increasing time until the "250 Accepted" was answered. At the beginning this was a 10th of a second. After 1000 recipients there was a delay of ~one second after the 2000th recipients this delay increased smoothly to 2 secends ofter each rcpt-to I threw away all my bells and whistles in the exim config, but no success. [...] Any ideas what's happening here ? Check your config for ACL lines setting "ratelimit" and/or "delay". See examples at https://www.exim.org/exim-html-current/doc/html/spec_html/ch-access_control_lists.html -- Bill Cole b...@scconsult.com or billc...@apache.org (AKA @grumpybozo and many *@billmail.scconsult.com addresses) Not Currently Available For Hire -- ## List details at https://lists.exim.org/mailman/listinfo/exim-users ## Exim details at http://www.exim.org/ ## Please use the Wiki with this list - http://wiki.exim.org/