On Wed, 5 Jun 2019, Kevin A. McGrail wrote:

Good point, Henrik & John.

OK, I've left the output alone except for the calls from dbg so it
shouldn't break anything in the public interface.

Thoughts on this version?

Looks much safer, but I still wonder whether the repetitive hits in the normal output have any value. After all, masscheck only cares *whether* the rule hit, not *how many times* in a given message...

But I haven't performed an analysis of everything else that consumes that output.

Regards,
KAM

On 6/4/2019 1:51 PM, John Hardin wrote:
On Tue, 4 Jun 2019, Kevin A. McGrail wrote:

Yes, I was thinking about that and wanting to fix uritests as well for the
template.  Thanks for the feedback.  I will take another pass at it.

Just do the deduplication without modifying the output format.

If we want to log the hit counts, then make another function that does
what you did and use it for logging.
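
The split suggested above can be sketched roughly as follows. This is only an illustration: dedup_names and names_with_counts are hypothetical helper names, not functions in PerMsgStatus.pm, and they take a plain list instead of reading $self->{subtest_names_hit}. The first keeps the existing comma-delimited format; the second is meant for logging only.

```perl
use strict;
use warnings;

# Hypothetical helper: unique rule names, sorted, comma-delimited.
# The format is unchanged, so callers that split on ',' keep working.
sub dedup_names {
  my (@names) = @_;
  my %seen;
  my @unique = grep { !$seen{$_}++ } @names;
  return join(',', sort @unique);
}

# Hypothetical helper for logging only: appends "(N)" to any name
# that hit more than once.
sub names_with_counts {
  my (@names) = @_;
  my %count;
  $count{$_}++ for @names;
  my @out;
  for my $name (sort keys %count) {
    push @out, $count{$name} > 1 ? "$name($count{$name})" : $name;
  }
  return join(',', @out);
}

my @hits = ('__LOWER_E', '__E_LIKE_LETTER', '__LOWER_E', '__LOWER_E');
print dedup_names(@hits), "\n";        # plain names for the public interface
print names_with_counts(@hits), "\n";  # annotated names for dbg() only
```

Keeping the two outputs in separate functions means the counts never reach code that parses the plain list.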


On Tue, Jun 4, 2019, 03:23 Henrik K <[email protected]> wrote:


If you want to modify debug output, you have to modify only the dbg() output
itself.  You can't modify internal functions that have specific output
formats and start adding random strings to them.  At least these places
depend on the comma-delimited rules:

./masses/mass-check:    push @tests, split(/,/, $status->get_names_of_subtests_hit());
./t/rule_tests.t:    my %rules_hit = map { $_ => 1 } split(/,/,$msg->get_names_of_tests_hit()), split(/,/,$msg->get_names_of_subtests_hit());
./t.rules/run:  my $testsline = $status->get_names_of_tests_hit().",".$status->get_names_of_subtests_hit();
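
To make the dependency concrete, here is a small sketch of what happens when a caller splits the annotated string on commas. The two strings are made-up stand-ins for the plain and the patched return values, not captured output.

```perl
use strict;
use warnings;

# Made-up stand-ins for the two output formats (assumed values).
my $plain   = '__E_LIKE_LETTER,__LOWER_E';
my $patched = '__E_LIKE_LETTER(2),__LOWER_E(3) (Total Subtest Hits: 5 / Deduplicated Total Hits: 2)';

# Same idiom as the callers listed above: split on ',' and treat
# every piece as a rule name.
my %plain_hit   = map { $_ => 1 } split(/,/, $plain);
my %patched_hit = map { $_ => 1 } split(/,/, $patched);

my $plain_ok   = exists $plain_hit{'__LOWER_E'}   ? 1 : 0;
my $patched_ok = exists $patched_hit{'__LOWER_E'} ? 1 : 0;

# With the annotated string the bare rule name is no longer a key,
# because each piece now carries a "(N)" suffix or the trailing summary.
print "plain: $plain_ok, patched: $patched_ok\n";
```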




On Tue, Jun 04, 2019 at 01:56:26AM -0400, Kevin A. McGrail wrote:
Morning All,

After a few thoughts on limits, it appears that any duplicate subtest
hits are best combined for debug output.

Any thoughts on the attached?  It looks like it will help me with rule
development while supporting rules with valid but large maxhits like
__LOWER_E

Regards,
KAM

On 5/31/2019 10:30 AM, Bill Cole wrote:
On 30 May 2019, at 20:35, Kevin A. McGrail wrote:

I was curious if anyone noticed the debug output for subtests has gotten
insane:

It got a little discussion on users@ when I created those rules.

[...]

72_active.cf:    body            __LOWER_E       /e/
72_active.cf:    tflags          __LOWER_E       multiple maxhits=230

72_active.cf:    body            __E_LIKE_LETTER /<lcase_e>/
72_active.cf:    tflags          __E_LIKE_LETTER multiple maxhits=320

Assuming those maxhits are correct,

They are. In fact they were carefully tuned to catch the targeted
extortion spam.

maybe we need something in the debug
output that says __E_LIKE_LETTER (number of hits if more than 1).

That would be a useful enhancement even without my flagrant log
vandalism.
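
For readers wondering why the debug line grows so long, the following is a simplified model and not SpamAssassin's actual rule engine. It assumes the rule name is recorded once per match, up to the maxhits cap. record_hits is a hypothetical function written only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical model: record the rule name once per regex match,
# stopping as soon as the cap is reached.
sub record_hits {
  my ($name, $re, $text, $maxhits) = @_;
  my @names_hit;
  while ($text =~ /$re/g) {
    push @names_hit, $name;
    last if @names_hit >= $maxhits;
  }
  return @names_hit;
}

# An assumed message body containing far more than 230 lowercase e's.
my $body = 'e' x 500;
my @hits = record_hits('__LOWER_E', qr/e/, $body, 230);

# Every entry is the same name, so an unmodified join(',', ...) would
# repeat it once per recorded hit.
print scalar(@hits), " entries for one rule\n";
```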


--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


Index: lib/Mail/SpamAssassin/PerMsgStatus.pm
===================================================================
--- lib/Mail/SpamAssassin/PerMsgStatus.pm       (revision 1860582)
+++ lib/Mail/SpamAssassin/PerMsgStatus.pm       (working copy)
@@ -769,7 +769,38 @@
 sub get_names_of_subtests_hit {
   my ($self) = @_;

-  return join(',', sort @{$self->{subtest_names_hit}});
+  #return join(',', sort @{$self->{subtest_names_hit}});
+
+  #This routine prints only one instance of a subrule hit with a count of how many times it hit if greater than 1
+  my (%subtest_names_hit, $i, $key, @keys, @sorted, $string, $rule, $total_hits, $deduplicated_hits);
+
+  $total_hits = scalar(@{$self->{subtest_names_hit}});
+
+  for ($i=0; $i < $total_hits; $i++) {
+    $rule = ${$self->{subtest_names_hit}}[$i];
+    $subtest_names_hit{$rule}++;
+  }
+
+  foreach $key (keys %subtest_names_hit) {
+    push (@keys, $key);
+  }
+  @sorted = sort @keys;
+
+  $deduplicated_hits = scalar(@sorted);
+
+  for ($i=0; $i < $deduplicated_hits; $i++) {
+    $string .= $sorted[$i];
+    if ($subtest_names_hit{$sorted[$i]} > 1) {
+      $string .= "($subtest_names_hit{$sorted[$i]})"
+    }
+    $string .= ",";
+  }
+
+  $string =~ s/,$//;
+
+  $string .= " (Total Subtest Hits: $total_hits / Deduplicated Total Hits: $deduplicated_hits)";
+
+  return $string;
 }


###########################################################################

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [email protected]    FALaholic #11174     pgpk -a [email protected]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Teach a man to fish, and he'll eat for life.
  Give him someone else's fish, and he'll vote for you.
-----------------------------------------------------------------------
 Tomorrow: the 75th anniversary of D-Day
