Re: Seeing "check: exceeded time limit in ..." and need to resolve it
> On Nov 16, 2021, at 8:03 PM, Henrik K wrote: > > On Tue, Nov 16, 2021 at 01:08:16PM -0700, Philip Prindeville wrote: >> >> Or http.sh points to an NS that's offline... > > Your resolver shoukd time out _way_ sooner than some minutes. > >> Can the async lookup be back-ported? > > No, and there will be no new 3.4 releases. > Yeah, I still need to figure that out... When I run "dig -t any http.sh" it times out after a few seconds. But SpamAssassin is doing something very different. Not sure why. In any case, the workaround seems to be: uri_block_exclude __L_BLOCK_ISP ... http.sh shlom.in Where not resolving these last two domains makes the timeouts go away. Note that the pathology is the same in both cases: philipp@macbook3 ~ % dig @8.8.8.8 -tns shlom.in. ; <<>> DiG 9.10.6 <<>> @8.8.8.8 -tns shlom.in. ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38665 ;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 512 ;; QUESTION SECTION: ;shlom.in. IN NS ;; ANSWER SECTION: shlom.in. 300 IN NS ns1gmz.name.com. shlom.in. 300 IN NS ns2jrt.name.com. shlom.in. 300 IN NS ns3qtx.name.com. shlom.in. 300 IN NS ns4blx.name.com. ;; Query time: 84 msec ;; SERVER: 8.8.8.8#53(8.8.8.8) ;; WHEN: Sun Dec 26 15:25:44 MST 2021 ;; MSG SIZE rcvd: 129 philipp@macbook3 ~ % philipp@macbook3 ~ % dig @8.8.8.8 -tns http.sh. ; <<>> DiG 9.10.6 <<>> @8.8.8.8 -tns http.sh. ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10013 ;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 512 ;; QUESTION SECTION: ;http.sh. IN NS ;; ANSWER SECTION: http.sh.60 IN CNAME park.io. park.io.14797 IN NS ns-1348.awsdns-40.org. park.io.14797 IN NS ns-1624.awsdns-11.co.uk. park.io.14797 IN NS ns-441.awsdns-55.com. park.io.14797 IN NS ns-672.awsdns-20.net. ;; Query time: 245 msec ;; SERVER: 8.8.8.8#53(8.8.8.8) ;; WHEN: Sun Dec 26 15:25:03 MST 2021 ;; MSG SIZE rcvd: 197 philipp@macbook3 ~ % Seems a little broken that the NS records aren't accompanied by 'A' glue records, but that's not catastrophic... normally a 2nd query would be done. Should the resolver code in SpamAssassin be more robust when it comes to such failures? philipp@macbook3 ~ % dig -ta ns-1348.awsdns-40.org. ; <<>> DiG 9.10.6 <<>> -ta ns-1348.awsdns-40.org. ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37011 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 1232 ;; QUESTION SECTION: ;ns-1348.awsdns-40.org. IN A ;; ANSWER SECTION: ns-1348.awsdns-40.org. 78740 IN A 205.251.197.68 ;; Query time: 51 msec ;; SERVER: 192.168.1.1#53(192.168.1.1) ;; WHEN: Sun Dec 26 15:27:16 MST 2021 ;; MSG SIZE rcvd: 66 philipp@macbook3 ~ % dig @205.251.197.68 -ta http.sh ; <<>> DiG 9.10.6 <<>> @205.251.197.68 -ta http.sh ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 28411 ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0 ;; WARNING: recursion requested but not available ;; QUESTION SECTION: ;http.sh. IN A ;; Query time: 67 msec ;; SERVER: 205.251.197.68#53(205.251.197.68) ;; WHEN: Sun Dec 26 15:27:32 MST 2021 ;; MSG SIZE rcvd: 25 philipp@macbook3 ~ % I'm not exactly sure what's falling down or why. Is there anyone with more BIND-fu than me that's willing to venture a guess? -Philip
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Tue, Nov 16, 2021 at 01:08:16PM -0700, Philip Prindeville wrote: > > Or http.sh points to an NS that's offline... Your resolver shoukd time out _way_ sooner than some minutes. > Can the async lookup be back-ported? No, and there will be no new 3.4 releases.
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
> On Nov 16, 2021, at 3:30 AM, Martin Gregorie wrote: > > On Mon, 2021-11-15 at 17:12 -0700, Philip Prindeville wrote: >> >> >>> On Nov 15, 2021, at 5:06 PM, Greg Troxel wrote: >>> >>> >>> Philip Prindeville writes: >>> Ah, the rule _eval_tests_type11_pri0_set1() took 4:20. Why can't I even find the rule? >>> > try "locate txrep" > > On my Fedora system 'locate' says that TxRep is a plugin, enabled by > installing: /usr/share/spamassassin/60_txrep.cf > > and, using "locate" again, that the plugin's manpage is > /usr/share/man/man3/Mail::SpamAssassin::Plugin::TxRep.3pm.gz > > So, "man 3 Mail::SpamAssassin::Plugin::TxRep" describes the TxRep plugin > and 'locate' says it is installed as > /usr/share/perl5/vendor_perl/Mail/SpamAssassin/Plugin/TxRep.pm > > Of course, other Linux distros may put it somewhere else, so use > 'locate' and, if it doesn't find 'txrep', run 'sudo updatedb' and try > again. > > Not trying to teach you to suck eggs, but, incredible as it may sound, > there are still some people who don't know about the 'locate' command. > > Martin > I know how to find the code modules. I wasn't sure where the database was kept. And I invoke SA from within Mimedefang, not via procmail on individual delivery, so only global rules get run. Nothing is per-user. -Philip
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
> On Nov 15, 2021, at 11:12 PM, Henrik K wrote: > > On Mon, Nov 15, 2021 at 04:25:55PM -0700, Philip Prindeville wrote: >> >> >>> On Nov 12, 2021, at 10:35 PM, Henrik K wrote: >>> >>> On Fri, Nov 12, 2021 at 07:49:00PM -0800, John Hardin wrote: What would be helpful here would be logging of when a rule *starts* evaluation. Normally that would be painful, but for tracking a runaway it would be useful. Perhaps I can code up something to capture that and log it on a timeout... >>> >>> It already exists >>> >>> spamassassin -D all,rules-all < msg >>> >> >> >> Ah, that was useful. >> >> Seeing: >> >> ... >> Nov 15 16:09:40.479 [54834] dbg: check: uri_block_cidr list >> 54.69.70.160-54.69.70.160 65.181.64.0-65.181.127.255 97.74.42.81-97.74.42.81 >> 98.124.199.1-98.124.199.1 167.114.67.88-167.114.67.88 >> 173.45.101.58-173.45.101.58 185.53.128.0-185.53.131.255 >> 192.31.186.4-192.31.186.4 192.99.0.0-192.99.255.255 >> 194.213.125.57-194.213.125.57 >> Nov 15 16:11:50.610 [54834] dbg: check: uri_local_bl http.sh addrs >> ... >> Nov 15 16:16:00.876 [54834] dbg: async: timing: 5.193 . >> dns:A:100.136.134.134.psbl.surriel.com >> Nov 15 16:16:00.876 [54834] dbg: async: timing: 5.199 . >> dns:A:100.136.134.134.dnsbl.sorbs.net >> Nov 15 16:16:00.876 [54834] dbg: async: timing: 385.726 X A:http.sh >> Nov 15 16:16:00.876 [54834] dbg: async: timing: 385.726 X NS:http.sh >> ... >> >> >> Why would resolving http.sh take this long? And can we bring down the >> timeout? >> >> Hard to imagine DNS requests taking more than a couple of seconds. >> >> -Philip > > SA 3.4 has old version of URILocalBL without async lookups, probably that's > why. Not sure why it would take minutes though, unless your system DNS is > configured strangely and/or is not caching.. > Or http.sh points to an NS that's offline... Can the async lookup be back-ported?
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
Replies... some duplication of conversation on "mimedefang". > On Nov 15, 2021, at 10:34 PM, Bill Cole > wrote: > > On 2021-11-15 at 18:08:20 UTC-0500 (Mon, 15 Nov 2021 16:08:20 -0700) > Philip Prindeville > is rumored to have said: > >>> On Nov 12, 2021, at 8:49 PM, John Hardin wrote: >>> >>> On Fri, 12 Nov 2021, Philip Prindeville wrote: >>> I got the message, saved it to a flat file, and ran "spamassassin -t -D rules < netdev.eml" and saw: ... Nov 12 11:45:38.048 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH_DOC ==> got hit (1) ... Nov 12 11:45:38.063 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH ==> got hit (1) Nov 12 11:49:58.565 [36367] info: check: exceeded time limit in Mail::SpamAssassin::Plugin::Check::_eval_tests_type11_pri0_set1, skipping further tests ... Am I correct that __ANY_TEXT_ATTACH alone took 4:30s? >>> >>> "ran ... got hit" is past tense. And it needs to complete the rule to know >>> whether it got a hit. >>> >>> 11:45:38.048 -> 11:45:38.063 = less than 20 msec. >>> >>> The next rule, whatever that was, is the one that timed out after 4m20s. >> >> >> Ah, the rule _eval_tests_type11_pri0_set1() took 4:20. >> >> Why can't I even find the rule? > > That "rule" is actually a subroutine that is assembled and named at runtime > in M:SA:Check from set 1 (probably the only set) of the "body eval" (type 11) > rules running at priority 0. Yeah, if we can get the source file and line # that's almost as good as a function name (since there isn't one, if it's an anonymous sub block). > > Which suggests a tough bit of troubleshooting. > Could there be rules that *aren't* matching but are taking a while? >>> >>> It's timing out on a rule that's running away. The timeout triggers before >>> "hit/no hit" is known. >>> >>> What would be helpful here would be logging of when a rule *starts* >>> evaluation. Normally that would be painful, but for tracking a runaway it >>> would be useful. Perhaps I can code up something to capture that and log it >>> on a timeout... >> >> >> Whenever a rule gets started, you could save the name and start time, and >> then burp that during timeout handling, right? > > I like that idea. I have no idea how feasible it is. Me neither. I use Perl less and less everyday, and my fu is fading almost as fast as new features are coming in. -Philip
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Mon, 15 Nov 2021, Philip Prindeville wrote: On Nov 12, 2021, at 8:49 PM, John Hardin wrote: On Fri, 12 Nov 2021, Philip Prindeville wrote: I got the message, saved it to a flat file, and ran "spamassassin -t -D rules < netdev.eml" and saw: ... Nov 12 11:45:38.048 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH_DOC ==> got hit (1) ... Nov 12 11:45:38.063 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH ==> got hit (1) Nov 12 11:49:58.565 [36367] info: check: exceeded time limit in Mail::SpamAssassin::Plugin::Check::_eval_tests_type11_pri0_set1, skipping further tests ... Am I correct that __ANY_TEXT_ATTACH alone took 4:30s? "ran ... got hit" is past tense. And it needs to complete the rule to know whether it got a hit. 11:45:38.048 -> 11:45:38.063 = less than 20 msec. The next rule, whatever that was, is the one that timed out after 4m20s. Ah, the rule _eval_tests_type11_pri0_set1() took 4:20. Why can't I even find the rule? Run it with "-D rules,rules-all" and it should list each rule as it starts executing. Could there be rules that *aren't* matching but are taking a while? It's timing out on a rule that's running away. The timeout triggers before "hit/no hit" is known. What would be helpful here would be logging of when a rule *starts* evaluation. Normally that would be painful, but for tracking a runaway it would be useful. Perhaps I can code up something to capture that and log it on a timeout... Whenever a rule gets started, you could save the name and start time, and then burp that during timeout handling, right? The rule name at least. If you want to send me that message zipped up I can try it here with those changes and see if it's a base rule running away. Sent out-of-band. Doh. Forgot to zip it. I'll be happy to take a look, but running with rules-all at your end would be faster... I forgot I'd already added that. But I will still take a look at capturing the rule name for the timeout message. -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.org pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- Never forget, even for an instant, that the one and only reason anyone has for taking your gun away is to make you weaker than he is, so he can do something to you that you wouldn’t let him do if you were equipped to prevent it. This goes for burglars, muggers, and rapists, and even more so for policemen, bureaucrats, and politicians. -- Alexander Pope --- 535 days since the first private commercial manned orbital mission (SpaceX)
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
For that matter how many know about 'apropos'? And, even if they do, they may not discover 'locate' because 'apropos search' doesn't find either 'updatedb' or 'locate'. You have to enter 'apropos find' to discover that 'locate' exists, and even then you could get side tracked into trying to use the much more complex 'find' utility. Or the old-school rpm: $ rpm -ql spamassassin|grep TxRep /usr/share/man/man3/Mail::SpamAssassin::Plugin::TxRep.3pm.gz /usr/share/perl5/vendor_perl/Mail/SpamAssassin/Plugin/TxRep.pm Dave Martin
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Tue, 2021-11-16 at 08:33 -0500, Bill Cole wrote: > > Worth noting: locate & updatedb aren't always installed. > Fair comment: they're a standard part of Fedora. IIRC they are also part of the RaspberryPi OS distro, so are likely to be included in Debian and most of its clones. But: how many "intro to Linux" books mention 'updatedb' or 'locate'? They're not mentioned in 'UNIX in a Nutshell' (so maybe not in 'Linux in a Nutshell' either) or in the 'RaspberryPi User Guide', though they are in Michael Kopfler's 1999 Linux sysadmin book. You don't see them mentioned much in newsgroups either, so I'm left wondering how many Linux users who either taught themselves to use Linux or picked it up at work from a colleague would ever know about 'locate' and 'updatedb'? For that matter how many know about 'apropos'? And, even if they do, they may not discover 'locate' because 'apropos search' doesn't find either 'updatedb' or 'locate'. You have to enter 'apropos find' to discover that 'locate' exists, and even then you could get side tracked into trying to use the much more complex 'find' utility. Martin
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Tue, Nov 16, 2021 at 08:33:56AM -0500, Bill Cole wrote: > On 2021-11-16 at 05:30:25 UTC-0500 (Tue, 16 Nov 2021 10:30:25 +) > Martin Gregorie > is rumored to have said: > > > Of course, other Linux distros may put it somewhere else, so use > > 'locate' and, if it doesn't find 'txrep', run 'sudo updatedb' and try > > again. > > > > Not trying to teach you to suck eggs, but, incredible as it may sound, > > there are still some people who don't know about the 'locate' command. > > Worth noting: locate & updatedb aren't always installed. Offtopic, but there's even very little reason to install it, it just eats resources, not to mention one more point of vulnerability. If you need to search something, "find /" takes some seconds on any remotely modern system and always returns fresh results.
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On 2021-11-16 at 05:30:25 UTC-0500 (Tue, 16 Nov 2021 10:30:25 +) Martin Gregorie is rumored to have said: > Of course, other Linux distros may put it somewhere else, so use > 'locate' and, if it doesn't find 'txrep', run 'sudo updatedb' and try > again. > > Not trying to teach you to suck eggs, but, incredible as it may sound, > there are still some people who don't know about the 'locate' command. Worth noting: locate & updatedb aren't always installed. -- Bill Cole b...@scconsult.com or billc...@apache.org (AKA @grumpybozo and many *@billmail.scconsult.com addresses) Not Currently Available For Hire
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Mon, 2021-11-15 at 17:12 -0700, Philip Prindeville wrote: > > > > On Nov 15, 2021, at 5:06 PM, Greg Troxel wrote: > > > > > > Philip Prindeville writes: > > > > > Ah, the rule _eval_tests_type11_pri0_set1() took 4:20. > > > > > > Why can't I even find the rule? > > try "locate txrep" On my Fedora system 'locate' says that TxRep is a plugin, enabled by installing: /usr/share/spamassassin/60_txrep.cf and, using "locate" again, that the plugin's manpage is /usr/share/man/man3/Mail::SpamAssassin::Plugin::TxRep.3pm.gz So, "man 3 Mail::SpamAssassin::Plugin::TxRep" describes the TxRep plugin and 'locate' says it is installed as /usr/share/perl5/vendor_perl/Mail/SpamAssassin/Plugin/TxRep.pm Of course, other Linux distros may put it somewhere else, so use 'locate' and, if it doesn't find 'txrep', run 'sudo updatedb' and try again. Not trying to teach you to suck eggs, but, incredible as it may sound, there are still some people who don't know about the 'locate' command. Martin >
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Mon, Nov 15, 2021 at 04:25:55PM -0700, Philip Prindeville wrote: > > > > On Nov 12, 2021, at 10:35 PM, Henrik K wrote: > > > > On Fri, Nov 12, 2021 at 07:49:00PM -0800, John Hardin wrote: > >> > >> What would be helpful here would be logging of when a rule *starts* > >> evaluation. Normally that would be painful, but for tracking a runaway it > >> would be useful. Perhaps I can code up something to capture that and log it > >> on a timeout... > > > > It already exists > > > > spamassassin -D all,rules-all < msg > > > > > Ah, that was useful. > > Seeing: > > ... > Nov 15 16:09:40.479 [54834] dbg: check: uri_block_cidr list > 54.69.70.160-54.69.70.160 65.181.64.0-65.181.127.255 97.74.42.81-97.74.42.81 > 98.124.199.1-98.124.199.1 167.114.67.88-167.114.67.88 > 173.45.101.58-173.45.101.58 185.53.128.0-185.53.131.255 > 192.31.186.4-192.31.186.4 192.99.0.0-192.99.255.255 > 194.213.125.57-194.213.125.57 > Nov 15 16:11:50.610 [54834] dbg: check: uri_local_bl http.sh addrs > ... > Nov 15 16:16:00.876 [54834] dbg: async: timing: 5.193 . > dns:A:100.136.134.134.psbl.surriel.com > Nov 15 16:16:00.876 [54834] dbg: async: timing: 5.199 . > dns:A:100.136.134.134.dnsbl.sorbs.net > Nov 15 16:16:00.876 [54834] dbg: async: timing: 385.726 X A:http.sh > Nov 15 16:16:00.876 [54834] dbg: async: timing: 385.726 X NS:http.sh > ... > > > Why would resolving http.sh take this long? And can we bring down the > timeout? > > Hard to imagine DNS requests taking more than a couple of seconds. > > -Philip SA 3.4 has old version of URILocalBL without async lookups, probably that's why. Not sure why it would take minutes though, unless your system DNS is configured strangely and/or is not caching..
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On 2021-11-15 at 18:08:20 UTC-0500 (Mon, 15 Nov 2021 16:08:20 -0700) Philip Prindeville is rumored to have said: On Nov 12, 2021, at 8:49 PM, John Hardin wrote: On Fri, 12 Nov 2021, Philip Prindeville wrote: I got the message, saved it to a flat file, and ran "spamassassin -t -D rules < netdev.eml" and saw: ... Nov 12 11:45:38.048 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH_DOC ==> got hit (1) ... Nov 12 11:45:38.063 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH ==> got hit (1) Nov 12 11:49:58.565 [36367] info: check: exceeded time limit in Mail::SpamAssassin::Plugin::Check::_eval_tests_type11_pri0_set1, skipping further tests ... Am I correct that __ANY_TEXT_ATTACH alone took 4:30s? "ran ... got hit" is past tense. And it needs to complete the rule to know whether it got a hit. 11:45:38.048 -> 11:45:38.063 = less than 20 msec. The next rule, whatever that was, is the one that timed out after 4m20s. Ah, the rule _eval_tests_type11_pri0_set1() took 4:20. Why can't I even find the rule? That "rule" is actually a subroutine that is assembled and named at runtime in M:SA:Check from set 1 (probably the only set) of the "body eval" (type 11) rules running at priority 0. Which suggests a tough bit of troubleshooting. Could there be rules that *aren't* matching but are taking a while? It's timing out on a rule that's running away. The timeout triggers before "hit/no hit" is known. What would be helpful here would be logging of when a rule *starts* evaluation. Normally that would be painful, but for tracking a runaway it would be useful. Perhaps I can code up something to capture that and log it on a timeout... Whenever a rule gets started, you could save the name and start time, and then burp that during timeout handling, right? I like that idea. I have no idea how feasible it is. -- Bill Cole b...@scconsult.com or billc...@apache.org (AKA @grumpybozo and many *@billmail.scconsult.com addresses) Not Currently Available For Hire
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
Philip Prindeville writes: >> That looks very familiar. I was having timeouts, and saw that in the >> logs, on certain messages. I ended up nuking and rebuilding my TXREP >> database and then things were ok. >> >> That doesn't explain why we can't find the rule, which is a good >> question. > > Where is the TXREP database? For me, it's in: $HOME/.spamassassin/tx-reputation > Also, is it possible that the name is generated through some sort of > mangling, the way that C function names can be generated from macro > expansions, etc? Almost anything is possible... signature.asc Description: PGP signature
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
> On Nov 15, 2021, at 5:06 PM, Greg Troxel wrote: > > > Philip Prindeville writes: > >> Ah, the rule _eval_tests_type11_pri0_set1() took 4:20. >> >> Why can't I even find the rule? > > That looks very familiar. I was having timeouts, and saw that in the > logs, on certain messages. I ended up nuking and rebuilding my TXREP > database and then things were ok. > > That doesn't explain why we can't find the rule, which is a good > question. > Where is the TXREP database? Also, is it possible that the name is generated through some sort of mangling, the way that C function names can be generated from macro expansions, etc?
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
Philip Prindeville writes: > Ah, the rule _eval_tests_type11_pri0_set1() took 4:20. > > Why can't I even find the rule? That looks very familiar. I was having timeouts, and saw that in the logs, on certain messages. I ended up nuking and rebuilding my TXREP database and then things were ok. That doesn't explain why we can't find the rule, which is a good question. signature.asc Description: PGP signature
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
> On Nov 12, 2021, at 10:35 PM, Henrik K wrote: > > On Fri, Nov 12, 2021 at 07:49:00PM -0800, John Hardin wrote: >> >> What would be helpful here would be logging of when a rule *starts* >> evaluation. Normally that would be painful, but for tracking a runaway it >> would be useful. Perhaps I can code up something to capture that and log it >> on a timeout... > > It already exists > > spamassassin -D all,rules-all < msg > Ah, that was useful. Seeing: ... Nov 15 16:09:40.479 [54834] dbg: check: uri_block_cidr list 54.69.70.160-54.69.70.160 65.181.64.0-65.181.127.255 97.74.42.81-97.74.42.81 98.124.199.1-98.124.199.1 167.114.67.88-167.114.67.88 173.45.101.58-173.45.101.58 185.53.128.0-185.53.131.255 192.31.186.4-192.31.186.4 192.99.0.0-192.99.255.255 194.213.125.57-194.213.125.57 Nov 15 16:11:50.610 [54834] dbg: check: uri_local_bl http.sh addrs ... Nov 15 16:16:00.876 [54834] dbg: async: timing: 5.193 . dns:A:100.136.134.134.psbl.surriel.com Nov 15 16:16:00.876 [54834] dbg: async: timing: 5.199 . dns:A:100.136.134.134.dnsbl.sorbs.net Nov 15 16:16:00.876 [54834] dbg: async: timing: 385.726 X A:http.sh Nov 15 16:16:00.876 [54834] dbg: async: timing: 385.726 X NS:http.sh ... Why would resolving http.sh take this long? And can we bring down the timeout? Hard to imagine DNS requests taking more than a couple of seconds. -Philip
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
> On Nov 12, 2021, at 8:49 PM, John Hardin wrote: > > On Fri, 12 Nov 2021, Philip Prindeville wrote: > >> I got the message, saved it to a flat file, and ran "spamassassin -t -D >> rules < netdev.eml" and saw: >> >> ... >> Nov 12 11:45:38.048 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH_DOC >> ==> got hit (1) >> ... >> Nov 12 11:45:38.063 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH >> ==> got hit (1) >> Nov 12 11:49:58.565 [36367] info: check: exceeded time limit in >> Mail::SpamAssassin::Plugin::Check::_eval_tests_type11_pri0_set1, skipping >> further tests >> ... >> >> Am I correct that __ANY_TEXT_ATTACH alone took 4:30s? > > "ran ... got hit" is past tense. And it needs to complete the rule to know > whether it got a hit. > > 11:45:38.048 -> 11:45:38.063 = less than 20 msec. > > The next rule, whatever that was, is the one that timed out after 4m20s. Ah, the rule _eval_tests_type11_pri0_set1() took 4:20. Why can't I even find the rule? > >> Could there be rules that *aren't* matching but are taking a while? > > It's timing out on a rule that's running away. The timeout triggers before > "hit/no hit" is known. > > What would be helpful here would be logging of when a rule *starts* > evaluation. Normally that would be painful, but for tracking a runaway it > would be useful. Perhaps I can code up something to capture that and log it > on a timeout... Whenever a rule gets started, you could save the name and start time, and then burp that during timeout handling, right? > > If you want to send me that message zipped up I can try it here with those > changes and see if it's a base rule running away. > Sent out-of-band. Doh. Forgot to zip it.
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Sat, 13 Nov 2021, Loren Wilton wrote: What would be helpful here would be logging of when a rule *starts* evaluation. Normally that would be painful, but for tracking a runaway it would be useful. Perhaps I can code up something to capture that and log it on a timeout... Actually what sounds like it would be useful would be knowing the name of the rule that timed out. I'm presuming when the timeout occurs that there is still some indication of the current rule being processed so it can be killed. I'd think that should be enough to backtrack to the rule name. A modification to the timeout message could display the name of the rule and even how long it took to that point. That's what I was thinking when I said "capture and log". -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.org pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- Running away is the coward's way out of a war; appeasement is the coward's way into a war. -- Thorax --- 532 days since the first private commercial manned orbital mission (SpaceX)
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Sat, 13 Nov 2021, Henrik K wrote: On Fri, Nov 12, 2021 at 07:49:00PM -0800, John Hardin wrote: What would be helpful here would be logging of when a rule *starts* evaluation. Normally that would be painful, but for tracking a runaway it would be useful. Perhaps I can code up something to capture that and log it on a timeout... It already exists spamassassin -D all,rules-all < msg Ugh, yeah, I remember doing that now. -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.org pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- Running away is the coward's way out of a war; appeasement is the coward's way into a war. -- Thorax --- 532 days since the first private commercial manned orbital mission (SpaceX)
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
What would be helpful here would be logging of when a rule *starts* evaluation. Normally that would be painful, but for tracking a runaway it would be useful. Perhaps I can code up something to capture that and log it on a timeout... Actually what sounds like it would be useful would be knowing the name of the rule that timed out. I'm presuming when the timeout occurs that there is still some indication of the current rule being processed so it can be killed. I'd think that should be enough to backtrack to the rule name. A modification to the timeout message could display the name of the rule and even how long it took to that point. I guess there might be multiple rules running when the timeout occurs and not know which one really timed out, but that would still be a small number of rule names. Loren
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Fri, Nov 12, 2021 at 07:49:00PM -0800, John Hardin wrote: > > What would be helpful here would be logging of when a rule *starts* > evaluation. Normally that would be painful, but for tracking a runaway it > would be useful. Perhaps I can code up something to capture that and log it > on a timeout... It already exists spamassassin -D all,rules-all < msg
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
On Fri, 12 Nov 2021, Philip Prindeville wrote: I got the message, saved it to a flat file, and ran "spamassassin -t -D rules < netdev.eml" and saw: ... Nov 12 11:45:38.048 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH_DOC ==> got hit (1) ... Nov 12 11:45:38.063 [36367] dbg: rules: ran eval rule __ANY_TEXT_ATTACH ==> got hit (1) Nov 12 11:49:58.565 [36367] info: check: exceeded time limit in Mail::SpamAssassin::Plugin::Check::_eval_tests_type11_pri0_set1, skipping further tests ... Am I correct that __ANY_TEXT_ATTACH alone took 4:30s? "ran ... got hit" is past tense. And it needs to complete the rule to know whether it got a hit. 11:45:38.048 -> 11:45:38.063 = less than 20 msec. The next rule, whatever that was, is the one that timed out after 4m20s. Could there be rules that *aren't* matching but are taking a while? It's timing out on a rule that's running away. The timeout triggers before "hit/no hit" is known. What would be helpful here would be logging of when a rule *starts* evaluation. Normally that would be painful, but for tracking a runaway it would be useful. Perhaps I can code up something to capture that and log it on a timeout... If you want to send me that message zipped up I can try it here with those changes and see if it's a base rule running away. -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.org pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- The most glaring example of the cognitive dissonance on the left is the concept that human beings are inherently good, yet at the same time cannot be trusted with any kind of weapon, unless the magic fairy dust of government authority gets sprinkled upon them. -- Moshe Ben-David --- 531 days since the first private commercial manned orbital mission (SpaceX)