Re: Benchmarking SA
On Fri, 15 Jul 2016 12:00:59 -0400 Olivier Coutu wrote: > On 2016-07-15 10:22, Reindl Harald wrote: > > > > > > Am 15.07.2016 um 16:06 schrieb Olivier Coutu: > >> I am trying to figure out what part of SA is taking the most time > >> on certain e-mails, e.g > >> > >> time spamassassin ham-1468528393442166.eml > >> > >> [...] > >> > >> real0m34.531s > >> > >> user0m33.958s > >> > >> sys0m0.452s > >> > >> > Some DNS lookups indeed seem to take a certain amount of time, but > the first 4-second hop cannot be explained that way. Is there a way > that I could limit the number of DNS lookups being made, say to max > 50? Take a look at the parameters to rbl_timeout. However, the timings above suggest that this has nothing to do with network delays. If it were then "user"+"sys" would be a few percent of "real". IIWY I'd be more worried about why it's using so much cpu time.
Re: Benchmarking SA
Am 15.07.2016 um 18:00 schrieb Olivier Coutu: On 2016-07-15 10:22, Reindl Harald wrote: just use spamassassin -D which gives you some timing informations, most likely DNS - DNSBL/DNSWL/URIBL Some DNS lookups indeed seem to take a certain amount of time, but the first 4-second hop cannot be explained that way. Is there a way that I could limit the number of DNS lookups being made, say to max 50? you have a timing report at the end - just look closer it makes *no sense* to limit dns lookups - which ones would you skip - slower ones? what when the slowest ones are the RBL/URIBL hits which would have been the most important of the whole scan when would you decide what is fast and what is slow - they are all fired asynchron at the same time - so it's not possible _ at the end before the sa-report you have something like this Jul 15 18:08:46.165 [6210] dbg: check: tests=BAYES_99,BAYES_999,DATE_IN_PAST_96_XX,HTML_IMAGE_ONLY_32,HTML_MESSAGE,MISSING_HEADERS Jul 15 18:08:46.165 [6210] dbg: check: subtests=__ANY_TEXT_ATTACH,__ANY_TEXT_ATTACH_DOC,__BODY_TEXT_LINE,__BODY_TEXT_LINE,__BODY_TEXT_LINE,__CLICK_HERE,__CR_IN_SUBJ,__CT,__CTE,__CTYPE_HAS_BOUNDARY,__CTYPE_MULTIPART_ALT,__CTYPE_MULTIPART_ANY,__DKIM_DEPENDABLE,__DOS_HAS_ANY_URI,__DOS_LINK,__DOS_RCVD_MON,__ENV_AND_HDR_FROM_MATCH,__FB_COST,__FB_NUM_PERCNT,__FROM_FULL_NAME,__HAS_ANY_EMAIL,__HAS_ANY_URI,__HAS_DATE,__HAS_FROM,__HAS_MESSAGE_ID,__HAS_MSGID,__HAS_RCVD,__HAS_SUBJECT,__HAS_URI,__HUSH_HUSH,__JM_REACTOR_DATE,__LCL__ENV_AND_HDR_FROM_MATCH,__LOCAL_PP_NONPPURL,__MIME_HTML,__MIME_VERSION,__MISSING_REF,__MISSING_REPLY,__NONEMPTY_BODY,__NOT_A_PERSON,__NOT_SPOOFED,__RATWARE_0_TZ_DATE,__SANE_MSGID,__SUBSCRIPTION_INFO,__TO_NO_ARROWS_R,__TO_NO_BRKTS_PCNT,__TVD_MIME_ATT_TP,__UNPARSEABLE_RELAY_COUNT,__UNSUB_LINK,__URI_MAILTO Jul 15 18:08:46.166 [6210] dbg: timing: total 2373 ms - init: 754 (31.8%), parse: 0.61 (0.0%), extract_message_metadata: 39 (1.6%), get_uri_detail_list: 3.7 (0.2%), tests_pri_-1000: 7 (0.3%), compile_gen: 107 (4.5%), compile_eval: 16 (0.7%), tests_pri_-950: 3.2 (0.1%), tests_pri_-900: 6 (0.3%), tests_pri_-500: 28 (1.2%), dkim_load_modules: 15 (0.6%), check_dkim_signature: 0.32 (0.0%), tests_pri_-400: 17 (0.7%), check_bayes: 13 (0.6%), b_tokenize: 7 (0.3%), b_tok_get_all: 2.5 (0.1%), b_comp_prob: 2.4 (0.1%), b_tok_touch_all: 0.13 (0.0%), b_finish: 0.81 (0.0%), tests_pri_0: 1173 (49.4%), check_spf: 21 (0.9%), check_dkim_adsp: 87 (3.7%), check_razor2: 552 (23.2%), check_pyzor: 139 (5.9%), tests_pri_500: 341 (14.4%), poll_dns_idle: 296 (12.5%) signature.asc Description: OpenPGP digital signature
Re: Benchmarking SA
On 2016-07-15 10:22, Reindl Harald wrote: Am 15.07.2016 um 16:06 schrieb Olivier Coutu: I am trying to figure out what part of SA is taking the most time on certain e-mails, e.g time spamassassin ham-1468528393442166.eml [...] real0m34.531s user0m33.958s sys0m0.452s I have installed HitFreqsRuleTiming and the timing result file has specific rules that takes long, but the total is about 21 seconds: awk '{print $3}' timing.log | awk '{s+=$1}END{print s}' 21.4295 I was wondering if there was a way to find out what the extra 13 seconds were used for. Also, would there be a better benchmarking strategy? I have attempted to look at the debug, but the timing seems to skip multiple seconds at random intervals that are not indicative of what rule ran just use spamassassin -D which gives you some timing informations, most likely DNS - DNSBL/DNSWL/URIBL I have looked at -D: spamassassin -D 2>&1 < ham-1468528393442166.eml I am still getting multi-second hops in the debug outputs: [...] jui 15 11:47:28.527 [29298] dbg: rules: Z_HAMMY_SPF (pri -60) requires __ENV_FROM_4SQUARE (pri 0): fixed jui 15 11:47:34.402 [29298] dbg: rules: Z_HAMMY_SPF (pri -60) requires __ENV_FROM_DIGG (pri 0): fixed [...] jui 15 11:47:36.723 [29298] dbg: dns: dns reply 5194 is OK, 1 answer records jui 15 11:47:46.829 [29298] dbg: async: calling callback on key A:ns1.dnsmadeeasy.com [...] jui 15 11:47:46.853 [29298] dbg: async: calling callback on key DNSBL:2.126.80.208:zen.spamhaus.org, rule URIBL_SBL jui 15 11:52:07.569 [29298] dbg: uridnsbl: complete_dnsbl_lookup URIBL_SBL DNSBL:2.126.80.208:zen.spamhaus.org [...] jui 15 11:52:07.594 [29298] dbg: rules: flush_evalstr (run_generic_tests) compiling 469 chars of Mail::SpamAssassin::Plugin::Check::_uri_tests_neg950_1 jui 15 11:52:19.274 [29298] dbg: rules: run_generic_tests - compiling eval code: uri, priority -950 [...] Some DNS lookups indeed seem to take a certain amount of time, but the first 4-second hop cannot be explained that way. Is there a way that I could limit the number of DNS lookups being made, say to max 50?
Re: Benchmarking SA
Am 15.07.2016 um 16:06 schrieb Olivier Coutu: I am trying to figure out what part of SA is taking the most time on certain e-mails, e.g time spamassassin ham-1468528393442166.eml [...] real0m34.531s user0m33.958s sys0m0.452s I have installed HitFreqsRuleTiming and the timing result file has specific rules that takes long, but the total is about 21 seconds: awk '{print $3}' timing.log | awk '{s+=$1}END{print s}' 21.4295 I was wondering if there was a way to find out what the extra 13 seconds were used for. Also, would there be a better benchmarking strategy? I have attempted to look at the debug, but the timing seems to skip multiple seconds at random intervals that are not indicative of what rule ran just use spamassassin -D which gives you some timing informations, most likely DNS - DNSBL/DNSWL/URIBL signature.asc Description: OpenPGP digital signature
Benchmarking SA
I am trying to figure out what part of SA is taking the most time on certain e-mails, e.g time spamassassin ham-1468528393442166.eml [...] real0m34.531s user0m33.958s sys0m0.452s I have installed HitFreqsRuleTiming and the timing result file has specific rules that takes long, but the total is about 21 seconds: awk '{print $3}' timing.log | awk '{s+=$1}END{print s}' 21.4295 I was wondering if there was a way to find out what the extra 13 seconds were used for. Also, would there be a better benchmarking strategy? I have attempted to look at the debug, but the timing seems to skip multiple seconds at random intervals that are not indicative of what rule ran. -- Olivier Coutu Assistance technique Technical Support T : 514-527-3232 x 2 n...@zerospam.ca