[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Bug 7987 depends on bug 7735, which changed state. Bug 7735 Summary: Meta rules need to handle missing/unrun dependencies https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Bug 7987 depends on bug 7735, which changed state. Bug 7735 Summary: Meta rules need to handle missing/unrun dependencies https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Bug 7987 depends on bug 7735, which changed state. Bug 7735 Summary: Meta rules need to handle missing/unrun dependencies https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Bug 7987 depends on bug 7735, which changed state. Bug 7735 Summary: Meta rules need to handle missing/unrun dependencies https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Bug 7987 depends on bug 7735, which changed state. Bug 7735 Summary: Meta rules need to handle missing/unrun dependencies https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Bug 7987 depends on bug 7735, which changed state. Bug 7735 Summary: Meta rules need to handle missing/unrun dependencies https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Henrik Krohns changed: What|Removed |Added Depends on||7735 --- Comment #30 from Henrik Krohns --- - Fix eval functions returning unintended "undef" Committed revision 1901403. Referenced Bugs: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7735 [Bug 7735] Meta rules need to handle missing/unrun dependencies -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Henrik Krohns changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED|RESOLVED --- Comment #29 from Henrik Krohns --- No objections to the combo of "return undef" for eval and $pms->rule_ready() usage have been seen. Any bugs or theoretical performance improvements can be discussed here or in a new bug, but the current state "works as intended" and should not hold up 4.0.0, so I'm closing this. -- You are receiving this mail because: You are the assignee for the bug.
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
Am 2022-05-19 17:01, schrieb Henrik K: On Thu, May 19, 2022 at 04:11:53PM +0200, Michael Storz wrote: Am 2022-05-19 11:53, schrieb bugzilla-dae...@spamassassin.apache.org: > https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 > > --- Comment #28 from Henrik Krohns --- > A quick cleanup to tidy things: > > - Use rule_ready() everywhere instead of direct tests_already_hit modify Is this really necessary? It introduces thousands of additional subroutine calls. Not sure why you even ask. Why do functions exist? Do you now prefer to duplicate same code blocks everywhere? The algorithm you have now implemented looks too complicated and does not convince me. Maybe for once you could post an actual tested and benchmarked patch, instead of some random code snippets and comments? :-) There's only so much time I can spare myself. Ok, I will see what I can do. However, it will take until next week, as I am busy with something else at the moment. Michael
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
On Thu, May 19, 2022 at 04:11:53PM +0200, Michael Storz wrote: > Am 2022-05-19 11:53, schrieb bugzilla-dae...@spamassassin.apache.org: > > https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 > > > > --- Comment #28 from Henrik Krohns --- > > A quick cleanup to tidy things: > > > > - Use rule_ready() everywhere instead of direct tests_already_hit modify > > Is this really necessary? It introduces thousands of additional subroutine > calls. Not sure why you even ask. Why do functions exist? Do you now prefer to duplicate same code blocks everywhere? > The algorithm you have now implemented looks too complicated and does not > convince me. Maybe for once you could post an actual tested and benchmarked patch, instead of some random code snippets and comments? :-) There's only so much time I can spare myself.
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
Am 2022-05-19 11:53, schrieb bugzilla-dae...@spamassassin.apache.org: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #28 from Henrik Krohns --- A quick cleanup to tidy things: - Use rule_ready() everywhere instead of direct tests_already_hit modify Is this really necessary? It introduces thousands of additional subroutine calls. - Simple tracking of meta dependency hits, run do_meta_tests only when needed - Do not run do_meta_tests on last priority, as finish_meta_tests will run anyway Committed revision 1901060. It already reduces unnecessary do_meta_tests calls a lot. Before: do_meta_tests -1000 ready 2 do_meta_tests -950 ready 0 do_meta_tests -900 ready 0 do_meta_tests -100 ready 5 do_meta_tests -90 ready 0 do_meta_tests 0 ready 1234 do_meta_tests 500 ready 0 After: do_meta_tests -950 ready 2 do_meta_tests -100 ready 5 do_meta_tests 0 ready 1234 Though in the grand total runtimes, it makes very little difference. It's hard to optimize things, as most metas will end up ready at priority 0 anyway. Will look if there's further to improve, but I'm quite sceptical about the --$conf->{meta_dep_count} stuff, as it's really hard to guarantee that someone doesn't call rule_ready() multiple times, obviously then the counts will end up wrong and metas could be run prematurely. It's possible to track readiness perfectly with a hash from which you delete dependencies as they occur. I already tried something like that, but it gets so complex that it actually increases runtimes. No, this is absolutely not a problem. After going through the array with the dependencies the array should be deleted. Since it is only used once at the first time the rule is declared finished either by a call to got_hit or rule_ready, it can be deleted. Any further call of got_hit or rule_ready makes no change of the counters of the meta rules. The algorithm you have now implemented looks too complicated and does not convince me. Try my algorithm and do it even for the last priority. Then look which rules must be handled by finish_meta_tests and if your algorithm is needed for a better evaluation. Michael
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #28 from Henrik Krohns --- A quick cleanup to tidy things: - Use rule_ready() everywhere instead of direct tests_already_hit modify - Simple tracking of meta dependency hits, run do_meta_tests only when needed - Do not run do_meta_tests on last priority, as finish_meta_tests will run anyway Committed revision 1901060. It already reduces unnecessary do_meta_tests calls a lot. Before: do_meta_tests -1000 ready 2 do_meta_tests -950 ready 0 do_meta_tests -900 ready 0 do_meta_tests -100 ready 5 do_meta_tests -90 ready 0 do_meta_tests 0 ready 1234 do_meta_tests 500 ready 0 After: do_meta_tests -950 ready 2 do_meta_tests -100 ready 5 do_meta_tests 0 ready 1234 Though in the grand total runtimes, it makes very little difference. It's hard to optimize things, as most metas will end up ready at priority 0 anyway. Will look if there's further to improve, but I'm quite sceptical about the --$conf->{meta_dep_count} stuff, as it's really hard to guarantee that someone doesn't call rule_ready() multiple times, obviously then the counts will end up wrong and metas could be run prematurely. It's possible to track readiness perfectly with a hash from which you delete dependencies as they occur. I already tried something like that, but it gets so complex that it actually increases runtimes. -- You are receiving this mail because: You are the assignee for the bug.
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
Am 2022-05-17 17:09, schrieb bugzilla-dae...@spamassassin.apache.org: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #27 from Michael Storz --- (In reply to Henrik Krohns from comment #24) (In reply to Michael Storz from comment #23) > Great work! Are you now ready to switch from the brute force algorithm to a > deterministic algorithm? I am not sure which of the two algorithms would be > faster. That needs to be tested. It should not matter if I'm ready or not. If you have something, then share it for everyone. :-) Ok, let's see if the change leads to better performance or not. The idea is really simple, no big surprise, no fancy algorithm :-) Each meta rule needs a counter with the number of dependent rules. For each rule, an array is created with all meta rules that depend on that rule. When a rule has finished, got_hit/got_mis/rule_ready, it decrements the counter of each dependent meta rule. If the counter is 0, then the meta rule is moved from meta_pending to meta_ready queue. The meta_ready queue is then processed by do_meta_tests. Alternative: Instead of moving the meta rule, it is executed immediately. The do_meta_tests subroutine then becomes redundant (what to do with the reuse part then, I have no idea). In the end, the leftover meta-rules must then be handled with finish_meta_tests. A few code fragments for clarification: sub compile_meta_rules foreach my $name (keys %{$conf->{tests}}) { ... $conf->{meta_dep_count}->{$name} = scalar @{$rule_deps{$name}}; foreach my $rulename (@{$rule_deps{$name}}) { push @{$conf->{rule_dependencies}->{$rulename}}, $name; } } --- got_hit/got_miss/rule_ready foreach my $meta_rule (@{$conf->{rule_dependencies}->{$rulename}) { move_meta_p2r($meta_rule) unless --$conf->{meta_dep_count}->{$meta_rule}; } foreach my $meta_rule (@{$conf->{rule_dependencies}->{$rulename}) { run_meta_rule($meta_rule) unless --$conf->{meta_dep_count}->{$meta_rule}; } run_meta_rule like do_meta_tests + delete $pms->{meta_pending}->{$meta_rule} sub do_meta_tests { my ($self, $pms, $priority) = @_; ... return if $self->{am_compiling}; # nothing to compile here my $mr = $pms->{meta_ready}; my $mt = $pms->{conf}->{meta_tests}; my $h = $pms->{tests_already_hit}; while (my $rulename = shift @$mr) { # Metasubs look like ($_[1]->{$rulename}||($_[2]->{$rulename}?1:0)) ... my $result = $mt->{$rulename}->($pms, $h, {}); if ($result) { dbg("rules: ran meta rule $rulename ==> got hit ($result)"); $pms->got_hit($rulename, '', ruletype => 'meta', value => $result); } else { dbg("rules-all: ran meta rule $rulename, no hit") if $would_log_rules_all; $pms->got_miss($rulename); # mark meta done } } } Performance: The standard ruleset of SpamAssassin includes about 3,000 rules. One third of these are meta rules. The performance of the evaluation of the meta rules therefore matters. My code fragments with move_meta_p2r and run_meta_rule would therefore degrade performance if these were actually implemented as subroutine calls, since about 1,000 additional subroutine calls would then be made. The immediate execution of meta rules by run_meta_rule makes little sense, since we do not short-circuit the boolean expressions of the meta rules. Short-circuit would only bring something, if then in consequence large parts of the rules would not be evaluated any more. Therefore only move_meta_p2r comes into question. move_meta_p2r can be implemented as the two statements add ready + delete pending. In do_meta_tests we then have the set of ready meta rules. Instead of processing them individually, which would mean executing 1,000 subroutine calls, we can now generate subroutines that execute many meta rules at once, analogous to the procedure for the header tests. This could reduce the number of calls to a handful and would boost performance. Michael
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #27 from Michael Storz --- (In reply to Henrik Krohns from comment #24) > (In reply to Michael Storz from comment #23) > > Great work! Are you now ready to switch from the brute force algorithm to a > > deterministic algorithm? I am not sure which of the two algorithms would be > > faster. That needs to be tested. > > It should not matter if I'm ready or not. If you have something, then share > it for everyone. :-) Ok, let's see if the change leads to better performance or not. The idea is really simple, no big surprise, no fancy algorithm :-) Each meta rule needs a counter with the number of dependent rules. For each rule, an array is created with all meta rules that depend on that rule. When a rule has finished, got_hit/got_mis/rule_ready, it decrements the counter of each dependent meta rule. If the counter is 0, then the meta rule is moved from meta_pending to meta_ready queue. The meta_ready queue is then processed by do_meta_tests. Alternative: Instead of moving the meta rule, it is executed immediately. The do_meta_tests subroutine then becomes redundant (what to do with the reuse part then, I have no idea). In the end, the leftover meta-rules must then be handled with finish_meta_tests. A few code fragments for clarification: sub compile_meta_rules foreach my $name (keys %{$conf->{tests}}) { ... $conf->{meta_dep_count}->{$name} = scalar @{$rule_deps{$name}}; foreach my $rulename (@{$rule_deps{$name}}) { push @{$conf->{rule_dependencies}->{$rulename}}, $name; } } --- got_hit/got_miss/rule_ready foreach my $meta_rule (@{$conf->{rule_dependencies}->{$rulename}) { move_meta_p2r($meta_rule) unless --$conf->{meta_dep_count}->{$meta_rule}; } foreach my $meta_rule (@{$conf->{rule_dependencies}->{$rulename}) { run_meta_rule($meta_rule) unless --$conf->{meta_dep_count}->{$meta_rule}; } run_meta_rule like do_meta_tests + delete $pms->{meta_pending}->{$meta_rule} sub do_meta_tests { my ($self, $pms, $priority) = @_; ... return if $self->{am_compiling}; # nothing to compile here my $mr = $pms->{meta_ready}; my $mt = $pms->{conf}->{meta_tests}; my $h = $pms->{tests_already_hit}; while (my $rulename = shift @$mr) { # Metasubs look like ($_[1]->{$rulename}||($_[2]->{$rulename}?1:0)) ... my $result = $mt->{$rulename}->($pms, $h, {}); if ($result) { dbg("rules: ran meta rule $rulename ==> got hit ($result)"); $pms->got_hit($rulename, '', ruletype => 'meta', value => $result); } else { dbg("rules-all: ran meta rule $rulename, no hit") if $would_log_rules_all; $pms->got_miss($rulename); # mark meta done } } } -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #26 from Henrik Krohns --- (In reply to Henrik Krohns from comment #25) > - Bring back async pending check in do_meta_tests My brain is starting to be exhausted. It's not really needed. Committed revision 1900984. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #25 from Henrik Krohns --- Looking at my old plugin and popular things like SH.pm, it's clear the latest change will heavily break backwards compatibility. So here's a small fix that mostly mitigates things: - Use rule_ready() in run_eval_tests to allow async even for "return 0" - Bring back async pending check in do_meta_tests Committed revision 1900974. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #24 from Henrik Krohns --- (In reply to Michael Storz from comment #23) > Great work! Are you now ready to switch from the brute force algorithm to a > deterministic algorithm? I am not sure which of the two algorithms would be > faster. That needs to be tested. It should not matter if I'm ready or not. If you have something, then share it for everyone. :-) -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #23 from Michael Storz --- Great work! Are you now ready to switch from the brute force algorithm to a deterministic algorithm? I am not sure which of the two algorithms would be faster. That needs to be tested. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #22 from Henrik Krohns --- (In reply to Henrik Krohns from comment #10) > Created attachment 5784 [details] > Eval functions returning undef for async > > As discussed on list, here's a patch to change all plugins async eval > functions to return undef. > > Care needs to be taken to check all bgsend* return values, so there is no > unnecessary async marking. Committed revision 1900961. Need to clean ups some docs later. There's also some suspicious unrun rules/metas popping up randomly, still investigating if there's something to fix. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #21 from Henrik Krohns --- (In reply to Michael Storz from comment #20) > > I would therefore implement nothing for the support of the decision logic > for the time being. Then I'll just make a decision and go with rule_ready() in it's current state. You'll see it in use for the whole 4.0 lifetime, which can be long. All fine to me. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #20 from Michael Storz --- (In reply to Henrik Krohns from comment #18) > (In reply to Michael Storz from comment #17) > > If that > > could happen the logic of the eval rule is too complex and should be > > changed. > > A DNSBL can do this: > - It parses for example 4 IP addresses from Received > - For each of the IP it starts a separate bgsend_and_start_lookup, expecting > 4 separate callbacks > > Obviously if got_hit() is called, we know rule is ready regardless of any > pending lookups. > > What things should the callback do if there is no hit? > > Current logic: > > sub callback { > rule_ready(thisrule) > foreach (answer) { > if (match) got_hit(thisrule) > } > } > > Using got_miss() would probably be: > > sub callback { > foreach (answer) { > if (match) got_hit(thisrule) > } > if (!tests_already_hit(thisrule) && !get_async_pending_rules(thisrule)) > got_miss(thisrule) > } > > I guess we could brind back the deprecated Dns/is_rule_complete function, > which can do the checks. So would this be the most elegant flow? > > sub callback { > foreach (answer) { > if (match) got_hit(thisrule) > } > if (is_rule_complete(thisrule)) got_miss(thisrule) > } short executive answer for Henrik: Since the decision logic of how to process the different asynchronous calls is completely contained in the plugin and therefore cannot be detected from the outside, I don't know how a support system could look like. For this we probably need first several examples from which we could take an idea for the realization. long version: If we take the standard case of one rule one result, then we can represent the decision logic for your example of a DNSBL with 4 lookups l1 to l4 like this: hit = l1 || l2 || l3 || l4 miss = !hit = !(l1 || l2 || l3 || l4) = !l1 && !l2 && !l3 && !l4 With short-circuit evaluation we can evaluate a hit already with the result of one lookup hit. But for a miss we need the evaluation of all lookups. The logic could be the other way around in another case: hit = l1 && l2 && l3 && l4 Now we would need all lookup results for the evaluation of a hit and only one for a miss. In principle, the logic can take any form of a boolean expression: hit = (l1 && l2) || (l3 && l4) Here, too, there are combinations where we can only decide whether it is a hit or a miss after evaluating all lookups. I.e. logically we always need three levels - eval function - asynchronous lookups - decision function i.e. the callbacks of the lookups must always call a decision function, which keeps track of the status of all lookups and makes a decision about hit or miss based on this data. Besides the standard case, there are also the cases with one rule many results. In the extreme case, there could also be four hit/miss results associated with the four lookups, i.e. the lookups are completely independent of each other. Which logic is implemented in the plugin cannot be determined from the outside. At the moment I don't know how to support the implementation of the logic. For this we need first of all examples from which we can then perhaps derive something. I would therefore implement nothing for the support of the decision logic for the time being. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #19 from Henrik Krohns --- (In reply to Henrik Krohns from comment #18) > > I guess we could brind back the deprecated Dns/is_rule_complete function, > which can do the checks. So would this be the most elegant flow? > > sub callback { > foreach (answer) { > if (match) got_hit(thisrule) > } > if (is_rule_complete(thisrule)) got_miss(thisrule) > } Even still, that is confusing naming to me, since is_rule_complete() or got_miss() must ignore any previous got_hit(). You can't be sure that all developers make 100% correct logic in plugins. In that sense rule_ready() atleast doesn't convey anything about rule hitting or missing. Rule missing should be assumed as a default. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #18 from Henrik Krohns --- (In reply to Michael Storz from comment #17) > If that > could happen the logic of the eval rule is too complex and should be changed. A DNSBL can do this: - It parses for example 4 IP addresses from Received - For each of the IP it starts a separate bgsend_and_start_lookup, expecting 4 separate callbacks Obviously if got_hit() is called, we know rule is ready regardless of any pending lookups. What things should the callback do if there is no hit? Current logic: sub callback { rule_ready(thisrule) foreach (answer) { if (match) got_hit(thisrule) } } Using got_miss() would probably be: sub callback { foreach (answer) { if (match) got_hit(thisrule) } if (!tests_already_hit(thisrule) && !get_async_pending_rules(thisrule)) got_miss(thisrule) } I guess we could brind back the deprecated Dns/is_rule_complete function, which can do the checks. So would this be the most elegant flow? sub callback { foreach (answer) { if (match) got_hit(thisrule) } if (is_rule_complete(thisrule)) got_miss(thisrule) } -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #17 from Michael Storz --- (In reply to Henrik Krohns from comment #15) > (In reply to Henrik Krohns from comment #12) > > Of course we can vote if got_miss() etc would be better name for > > rule_ready(). And documentation needs some more polishing. > > What I don't like about got_miss(), is that it doesn't necessarily mark a > "miss". It's possible to call got_hit() after it, especially in > multiple-lookups-for-one-rule scenario, where it would be complex for the > plugin itself to track all lookups. Much simpler to call rule_ready on every > callback. Though xurrent logic is not perfect either.. if the last lookup > timeouts, it ends up as unrun rule. I think it is the responsibility of the plugin to decide if a rule is a hit or a miss. It would be really bad, if it called got_miss first, which means I'm done and it is a miss, and then later to decide to call got_hit. If that could happen the logic of the eval rule is too complex and should be changed. However, if you think rule_ready is a tool which could support the decision making of the rule, then it is ok. I haven't thought much about that scenario. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #16 from Henrik Krohns --- (In reply to Michael Storz from comment #14) > I remember someone renamed get_pending_lookups to get_async_pending_rules, > which is the name I would have chosen too ;-) Well that function never even existed outside trunk.. but I forgot alias for it just in case.. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #15 from Henrik Krohns --- (In reply to Henrik Krohns from comment #12) > Of course we can vote if got_miss() etc would be better name for > rule_ready(). And documentation needs some more polishing. What I don't like about got_miss(), is that it doesn't necessarily mark a "miss". It's possible to call got_hit() after it, especially in multiple-lookups-for-one-rule scenario, where it would be complex for the plugin itself to track all lookups. Much simpler to call rule_ready on every callback. Though xurrent logic is not perfect either.. if the last lookup timeouts, it ends up as unrun rule. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #14 from Michael Storz --- I remember someone renamed get_pending_lookups to get_async_pending_rules, which is the name I would have chosen too ;-) -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #13 from Henrik Krohns --- What comes to: "a better name for tests_already_hit would be tests_already_finished or tests_finished" .. I would say renaming ancient stuff for cosmetic purposes only is always a bit sus. But it is more or less internal variable, atleast Google doesn't find any third part plugins using it .. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #12 from Henrik Krohns --- Of course we can vote if got_miss() etc would be better name for rule_ready(). And documentation needs some more polishing. -- You are receiving this mail because: You are the assignee for the bug.
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
Am 2022-05-15 21:19, schrieb Henrik K: On Sun, May 15, 2022 at 06:58:27PM +0200, Michael Storz wrote: THIS IS WRONG. LOL. You could have just summarized it in one or two lines without all the "error" hyperbole, and I would have understood it much faster. Summary: - Modify eval functions to return undef if they are async, after that rule_pending() and {tests_pending} are redundant and can be removed, tests_already_hit is enough to check status, as rule_ready() does not mark it until last DNS lookup is finished. Quite elegant solution. Will post patch to Bug 7987 soon for verifying. Ok, ok, I will try to be more brief in the future ;-) Michael
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Henrik Krohns changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED|REOPENED --- Comment #11 from Henrik Krohns --- Reopening for review -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Henrik Krohns changed: What|Removed |Added Attachment #5779|0 |1 is obsolete|| --- Comment #10 from Henrik Krohns --- Created attachment 5784 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5784=edit Eval functions returning undef for async As discussed on list, here's a patch to change all plugins async eval functions to return undef. Care needs to be taken to check all bgsend* return values, so there is no unnecessary async marking. -- You are receiving this mail because: You are the assignee for the bug.
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
On Sun, May 15, 2022 at 06:58:27PM +0200, Michael Storz wrote: > >THIS IS WRONG. > LOL. You could have just summarized it in one or two lines without all the "error" hyperbole, and I would have understood it much faster. Summary: - Modify eval functions to return undef if they are async, after that rule_pending() and {tests_pending} are redundant and can be removed, tests_already_hit is enough to check status, as rule_ready() does not mark it until last DNS lookup is finished. Quite elegant solution. Will post patch to Bug 7987 soon for verifying.
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
Am 2022-05-14 20:01, schrieb Henrik K: On Sat, May 14, 2022 at 07:15:45PM +0200, Michael Storz wrote: Despite all these changes, however, I still see room for improvement. At the moment I'm not very happy about how the do_meta_tests subroutine is implemented. It seems to work now, but the query for the dependency of the meta rules looks too complicated to me: foreach my $r (@{$md->{$rulename}|[]}) { next RULE unless exists $h->{$r} && !$tp->{$r} && !$pl{$r}; } Instead of three dependencies, there should really only be one query at this point for "is the rule ready or not". If we knew exactly the state of a rule at each point in time, then it would also be possible to move from a brute force algorithm to a deterministic algorithm, where a rule that becomes ready automatically sets all meta rules to ready if that rule was the last dependency analogous to the algorithm for tags. do_meta_tests would then simply process a queue of ready meta rules. If a new meta rule is ready it will be added to the end of the queue. If the queue is empty, all rules of a priority class are processed. I heartily agree with everything. The problem is that I don't have any formal programming training or knowledge of fancy algorithms or deterministic stuff. Frankly I'm not even interested in most of that stuff, I just script and try to make things do stuff correctly with some common sense and logic, would be happy if someone showed how to make it simpler. Glad I was able to make it work with maybe 10% performance loss, but it's balanced back by dumb metas not needing to wait for async stuff for no reason etc. I'm glad you agree with my analysis. Then let's have one last try. I think I found the logical error in the evaluation that makes everything so complicated. The error already exists before 3.4.6. Let's recap: At the beginning of the evaluation, ALL rules are in the pending state by definition. When rules are run, they change from state pending to state finished. How is this state change accomplished? - The normal case is the indirect change for the synchronous standard rules. After they are run, they return either 0, which means it was a miss, or a positive integer, which means it was a hit. - Then there is the case of synchronous rules with explicit state change. These rules call got_hit for a hit and should call got_miss if it was a miss. got_miss is similar to rule_ready, it is just the statement $self->{tests_already_hit}->{$rule} ||= 0; BTW, a better name for tests_already_hit would be tests_already_finished or tests_finished. - And then we have the asynchronous rules, which must be divided into two parts: the synchronous part and the asynchronous part. The synchronous part is the call of the eval function, which in turn submits the asynchronous part to a queue function. At the end, the synchronous part is supposed to return a result. And here comes the error: it returns the value 0, which means that the rule is finished and the result is a miss. THIS IS WRONG. The rule is not yet finished and the result is not yet known. Therefore, it must NOT return 0, it must return undef. This means that it is still pending (no state change) and the asynchronous part will decide if it is a hit or a miss. Now, let's test a correction: - First, add a debug statement to the foreach loop of sub do_meta_tests to see the values of the three conditions. The idea is to find out how many and which rules depend not only on $h->{$deprule}, but also on $tp->{$deprule} and $pl{$deprule}. If there are none left, we can eliminate the two conditions. The states of the three conditions are result $h $tp $pl OK 0 0 0 OK 0 1 0 OK 0 0 1 OK 0 1 1 OK 1 0 0 NO 1 1 0 NO 1 0 1 NO 1 1 1 NO is the state we do not want to have: there is a hit/the rule is finished, but $tp and/or $pl are true which means the rule has not yet finished. # Meta is not ready if some dependency has not run yet if (exists $h->{$deprule} && ($tp->{$deprule} || $pl{$deprule})) { dbg("rules: NO: \$h exists, \tp=" . ($tp->{$deprule} ? 'true' : 'false') . "\$pl=" . ($pl->{$deprule} ? 'true' : 'false')); } else { dbg("rules: OK: \$h" . (exists $h->{$deprule} ? '' : ' not') . " exists, \tp=" . ($tp->{$deprule} ? 'true' : 'false') . "\$pl=" . ($pl->{$deprule} ? 'true' : 'false')); } foreach my $deprule (@{$md->{$rulename}||[]}) { if (!exists $h->{$deprule} || $tp->{$deprule} || $pl{$deprule}) { next RULE; } } Then run a test and see how many NO we get. - Second, change sub run_eval_tests in # Make sure rule is marked ready for meta rules using $hitsptr $evalstr .= ' if ($scoresptr->{q{'.$rulename.'}}) { $hitsptr->{q{'.$rulename.'}} ||= 0; $rulename = q#'.$rulename.'#; delete $hitsptr->{q{'.$rulename.'}} ||= 0; change
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
On Sat, May 14, 2022 at 07:15:45PM +0200, Michael Storz wrote: > > Despite all these changes, however, I still see room for improvement. At the > moment I'm not very happy about how the do_meta_tests subroutine is > implemented. It seems to work now, but the query for the dependency of the > meta rules looks too complicated to me: > > foreach my $r (@{$md->{$rulename}|[]}) { > next RULE unless exists $h->{$r} && !$tp->{$r} && !$pl{$r}; > } > > Instead of three dependencies, there should really only be one query at this > point for "is the rule ready or not". If we knew exactly the state of a rule > at each point in time, then it would also be possible to move from a brute > force algorithm to a deterministic algorithm, where a rule that becomes > ready automatically sets all meta rules to ready if that rule was the last > dependency analogous to the algorithm for tags. do_meta_tests would then > simply process a queue of ready meta rules. If a new meta rule is ready it > will be added to the end of the queue. If the queue is empty, all rules of a > priority class are processed. I heartily agree with everything. The problem is that I don't have any formal programming training or knowledge of fancy algorithms or deterministic stuff. Frankly I'm not even interested in most of that stuff, I just script and try to make things do stuff correctly with some common sense and logic, would be happy if someone showed how to make it simpler. Glad I was able to make it work with maybe 10% performance loss, but it's balanced back by dumb metas not needing to wait for async stuff for no reason etc.
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
Am 2022-05-13 08:37, schrieb bugzilla-dae...@spamassassin.apache.org: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Henrik Krohns changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #9 from Henrik Krohns --- Feel free to discuss here or on dev-list if something is still unclear. First of all, I would like to thank Henrik for all the work he has done on SpamAssassin. The further I get with my code review, the more I see all the places where he has made minor or major changes that have made the code faster, more readable and thus more maintainable. Despite all these changes, however, I still see room for improvement. At the moment I'm not very happy about how the do_meta_tests subroutine is implemented. It seems to work now, but the query for the dependency of the meta rules looks too complicated to me: foreach my $r (@{$md->{$rulename}|[]}) { next RULE unless exists $h->{$r} && !$tp->{$r} && !$pl{$r}; } Instead of three dependencies, there should really only be one query at this point for "is the rule ready or not". If we knew exactly the state of a rule at each point in time, then it would also be possible to move from a brute force algorithm to a deterministic algorithm, where a rule that becomes ready automatically sets all meta rules to ready if that rule was the last dependency analogous to the algorithm for tags. do_meta_tests would then simply process a queue of ready meta rules. If a new meta rule is ready it will be added to the end of the queue. If the queue is empty, all rules of a priority class are processed. The next possible step would be the implementation of short-circuit behavior of && and || analogous to Perl itself. E.g. the rule meta BITCOIN_SPAM_10 __BITCOIN_ID && ( HTML_IMAGE_ONLY_04 || HTML_IMAGE_ONLY_08 ) would immediately evaluate to false if __BITCOIN_ID is false. HTML_IMAGE_ONLY_04 and HTML_IMAGE_ONLY_08 would then no longer need to be evaluated. Recently the question was asked why Check.pm is a plugin if it is not optional. Check.pm is a plugin so that you can implement more than one check plugin. Currently SpamAssassin still assumes that filtering happens postqueue, where you can respond to the different wishes of individual users. If you use SpamAssassin in a prequeue filtering, then only the decision rejection or acceptance is possible for ALL recipients. A consideration of different recipient wishes is not possible. Due to the possibility to switch between admin and user rules per evaluation of an email, one accepts that the evaluation of the rules must be rebuilt for each email. What we need instead is the possibility to build the rules once at the start of the daemon and then use it to evaluate any number of emails. You can go one step further and first run a SpamAssassin instance with a lightweight ruleset in prequeue mode to be able to decide as quickly as possible whether to reject an email and then run another instance with a heavyweight ruleset that makes a more precise distinction between marking the email as spam or ham. The second instance can then also evaluate user rules. There should be a feedback loop between the instances so that the first instance can use the results of the second instance, e.g. to quickly reject emails via a local blocklist using the HashBL.pm plugin. Currently we use a feedback loop between the SpamAssassin instance in prequeue mode and Postfix to (temporarily) reject emails for 24 hours. In any case, SpamAssassin should arrive in 2022 and offer a highly optimized version of the analysis for MTAs that process millions of emails per day. These are my general remarks about the evaluation of rules. An email with some minor cosmetic changes will follow. Michael
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Henrik Krohns changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #9 from Henrik Krohns --- Feel free to discuss here or on dev-list if something is still unclear. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #8 from Henrik Krohns --- Should be a bit better now. - fix body rules considered unrun when using sa-compile - fix check_rbl_sub rules considered unrun and other DNSEval cleanups - improve rule_pending/rule_ready/got_hit() logic - rename $pms->get_pending_lookups to get_async_pending_rules - other minor async cleanups - test and documentation improvements Committed revision 1900849. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Henrik Krohns changed: What|Removed |Added Target Milestone|Undefined |4.0.0 Severity|minor |blocker --- Comment #7 from Henrik Krohns --- Setting as blocker for 4.0.0. There's some remaining bugs, especially the way check_rbl_sub is handled, which makes some rules considered unrun occasionally. Already made some progress, but I really need a small break and a breather, will work on in the coming days.. -- You are receiving this mail because: You are the assignee for the bug.
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
Am 2022-05-08 18:13, schrieb Henrik K: On Sun, May 08, 2022 at 05:33:24PM +0200, Michael Storz wrote: Am 2022-05-08 06:43, schrieb bugzilla-dae...@spamassassin.apache.org: > https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 > > --- Comment #4 from Henrik Krohns --- > I mostly agree with everything, great to have extra eyeballs. Can you > comment > if my previous list comment and Revision 1900667 additions cleared any > things > up for you and changes anything? > Unfortunately not, I'm still struggling to understand the various aspects of processing. However, I understand, that a call of rule_ready for a sync rule makes sense. It does not need a call to rule_pending in this case. Example for a problem: a rule_pending must be followed by a rule_ready or a got_hit. But how a got_hit should work instead of a rule_ready is a mystery to me. In sub do_meta_tests the query whether a rule dependency exists is defined as: I noticed that got_hit doesn't do everything that it can. But I also found many other cases that need fixing and better testing. Spent 8 hours today trying different things and planning how to do things, trying to make bgsend automatically mark things ready is hard, as it's so complex in itself.. this is going to take few days.. Wow, maybe it would help when you try to explain what is going wrong? I have made the experience that I have often already found the solution when I tried to describe a problem to someone else, even though they didn't have much of an idea about the problem. The only other problem I found with bgsend_and_start_lookup is the handling of end cases. When abort_remaining_lookups is called, it triggers a callback for the lookups that are still waiting. They in turn can register lookups with bgsend_and_start_lookup again. Therefore, bgsend_and_start_lookup must check whether it is in the aborting state, and instead of queuing the lookup, it should immediately abort the lookup in the same way as abort_remaining_lookups. However, it should be prepared to break a loop if the plugin misbehaves and schedules lookup after lookup. Michael
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
On Sun, May 08, 2022 at 05:33:24PM +0200, Michael Storz wrote: > Am 2022-05-08 06:43, schrieb bugzilla-dae...@spamassassin.apache.org: > > https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 > > > > --- Comment #4 from Henrik Krohns --- > > I mostly agree with everything, great to have extra eyeballs. Can you > > comment > > if my previous list comment and Revision 1900667 additions cleared any > > things > > up for you and changes anything? > > > > Unfortunately not, I'm still struggling to understand the various aspects of > processing. However, I understand, that a call of rule_ready for a sync rule > makes sense. It does not need a call to rule_pending in this case. > > Example for a problem: a rule_pending must be followed by a rule_ready or a > got_hit. But how a got_hit should work instead of a rule_ready is a mystery > to me. In sub do_meta_tests the query whether a rule dependency exists is > defined as: I noticed that got_hit doesn't do everything that it can. But I also found many other cases that need fixing and better testing. Spent 8 hours today trying different things and planning how to do things, trying to make bgsend automatically mark things ready is hard, as it's so complex in itself.. this is going to take few days..
Re: [Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
Am 2022-05-08 06:43, schrieb bugzilla-dae...@spamassassin.apache.org: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #4 from Henrik Krohns --- I mostly agree with everything, great to have extra eyeballs. Can you comment if my previous list comment and Revision 1900667 additions cleared any things up for you and changes anything? Unfortunately not, I'm still struggling to understand the various aspects of processing. However, I understand, that a call of rule_ready for a sync rule makes sense. It does not need a call to rule_pending in this case. Example for a problem: a rule_pending must be followed by a rule_ready or a got_hit. But how a got_hit should work instead of a rule_ready is a mystery to me. In sub do_meta_tests the query whether a rule dependency exists is defined as: next RULE unless exists $h->{$deprule} && !$tp->{$deprule} && !$pl{$deprule}; (condition rearranged by me to make it easier to understand). If we have an asynchronous rule $deprule other than a DNS lookup, a call to rule_pending makes an entry in $tp. If the rule is true, the call to got_hit makes an entry in $h, but it does not delete the entry in $tp unlike rule_ready. Thus, the $deprule dependency is still present and the rule cannot be evaluated. Or did I miss something? Regards, Michael PS: Unfortunately, I do not have a running 4.0 system. Therefore I cannot test it myself.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #6 from Henrik Krohns --- Added mention of bgsend to rule_pending() docs, Revision 1900680 And then I found some bugs in the logic.. For example when URIDNSBL is launching and finishing a rule fast, and only after that eval check_uridnsbl is called, it marks the rule again as rule_pending(), and the rule will end up considered unrun.. duh.. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Kevin A. McGrail changed: What|Removed |Added CC||kmcgr...@apache.org --- Comment #5 from Kevin A. McGrail --- (In reply to Henrik Krohns from comment #4) > I > even think SA originally went too far to "pluginize" every little detail, > with it creating it's own problems.. there's just way too many public hooks > and tidbits to understand.. Very fair comment. The idea was good though because it gave us different standards for code quality: Core code > Plugin Enabled by Default > Plugin not Enabled by Default And it gave the ability to create proprietary plugins which was good because an important part of the ASF is the pro-business ASLv2 with no copyleft licensing or common clause concepts. Plus, I think sometimes when you are writing code, you think, this will only get used for a year. Not 20 and we're getting close to that. And there are definitely things I've wanted to fix for eons. I remember after Vixie created RBLs and we implemented them, I made a statement like DNS is a brilliant hack but we need to replace using DNS in a year, lol. DNS was distributed and resilient but you've seen all the blocking code in trying to handle lookups on DNS and timeouts. Spa.ghet.ti. And don't get me started on Net::DNS. Anyway, just thought it might be good to have some of the thought process from the olden days. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #4 from Henrik Krohns --- I mostly agree with everything, great to have extra eyeballs. Can you comment if my previous list comment and Revision 1900667 additions cleared any things up for you and changes anything? As a developer I can have hard time figuring out what to document about things, so they are understandable for any third party plugin developers. I even think SA originally went too far to "pluginize" every little detail, with it creating it's own problems.. there's just way too many public hooks and tidbits to understand.. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 --- Comment #3 from Michael Storz --- Well, it took me more than a day of carefully reviewing the call diagram and implementation of bgsend_and_start_lookup to come to this conclusion. My first thought was that maybe some tests are not working properly and therefore overlooked that some calls to rule_pending are missing. Only afterwards did I realize that it was the other way around. We both know how complicated all the rule processing is. Therefore, from my point of view, it is extremely important to remove everything superfluous so that we have a chance to understand how the different algorithms work. This has nothing to do with cosmetic details, but is essential. Next, I'm going to look at asynchronous tag processing. After your second change to FromNameSpoof.pm, I'm pretty sure there's a general processing error here. But I'll have to check that out in more detail. The use of the rule_pending and rule_ready routines you introduced should be clearly described and also implemented accordingly. From the current description I understand that the routines are to be used in principle with all asynchronous eval functions, so that the evaluation of the meta routines works. I am of the opinion that this description is wrong and does not correspond to your implementation. If the processing of the asynchronous calls, i.e. the queueing, polling and dequeueing is done centrally, then the handling must be done automatically. The end handling should then also be done centrally. This applies to both the central mechanism via bgsend_and_start_lookup and via action_depends_on_tags. For bgsend_and_start_lookup, I'm pretty sure that's how you implemented it. And that's a big step forward. rule_pending/rule_ready should only become necessary if the processing is decentralized and the periodic queries are each triggered via check_tick. This is the case with DCC.pm, Pyzor.pm and Razor2.pm. Here it also makes sense to have the end handling done by the respective plugin and triggered by check_cleanup. And also here, I think, you have implemented it exactly like that. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Henrik Krohns changed: What|Removed |Added CC||apa...@hege.li --- Comment #2 from Henrik Krohns --- I'll look into this again, but note that I spent considerable effort and testing to make sure things worked when I implemented these meta changes. Not everything is always as it seems at first glance and reading the sparse documentation I made might not reveal all the intricacies. So great care should be taken to remove something, especially if it just for "cosmetic reasons", if they really are redundant calls, they shouldn't have ill effects.. -- You are receiving this mail because: You are the assignee for the bug.
[Bug 7987] DNSEval.pm,HashBL.pm,URILocalBL.pm: unnecessary use of rule_pending and rule_ready
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7987 Michael Storz changed: What|Removed |Added CC||sa-...@lrz.de --- Comment #1 from Michael Storz --- Created attachment 5779 --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5779=edit deletion of rule_pending, rule_ready -- You are receiving this mail because: You are the assignee for the bug.