https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7822

--- Comment #3 from Michael Grant <mgr...@grant.org> ---
While debugging this, I also see that HashBL.pm is only reporting the first
hit.  

HashBL.pm looks through a message and extracts all the email addresses in
the message headers and body, but the report shows only one:

        *  0.1 HASHBL_EMAIL Message contains email address found on
        *      the EBL
        *      [user[at]example.com]

Here's what's going on.  On lines 646,647:
 646│  $ent = $pms->{async}->bgsend_and_start_lookup($lookup, $type, undef, $ent,
 647│    sub { my ($ent, $pkt) = @_; $self->_finish_query($pms, $ent, $pkt); },

HashBL uses this bgsend_and_start_lookup function, which seems to be DEPRECATED.
It then calls _finish_query(), which calls got_hit(), and I think that may be
what stops it from collecting any further evidence.

One could argue that if a message contains at least one address on the hashbl,
why continue?  However, we have already made the queries for each address (or
gotten them from the cache).  If you don't put them in the headers, you obscure
valuable information for downstream spam reporting about why this message
was flagged as spam.  PLEASE REPORT ALL ADDRESSES that get hits in the hashbl.

Here's a larger fix for that, but I don't know if I did this in the best,
cleanest way. Updating this module so it no longer uses the deprecated
bgsend_and_start_lookup() function, and rewriting whatever replaces
_finish_query(), is probably the better way to fix this, but for now this at
least shows me all the addresses that get hits in a comma-separated list.
I did not know how to get the addresses to come out on separate lines, and
adding a space after the comma (as in ", ") seems to force a separate line in a
strange way.
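
The idea behind the patch (count the outstanding lookups, collect every hit,
and only report once the last lookup has finished) can be sketched on its own,
outside SpamAssassin. This is a minimal illustration, not the plugin's code:
new_state(), finish_lookup() and the sample addresses are hypothetical names
made up for this example, and the "listed" flag stands in for a DNS answer that
matched.

```perl
use strict;
use warnings;

# Hypothetical per-message state: how many lookups are still pending and
# which addresses have been found on the list so far.
sub new_state {
    my ($expected) = @_;
    return { outstanding => $expected, hits => [] };
}

# Called once per completed lookup. $listed is true when the blocklist
# answered for $address. Returns the joined report string once the last
# lookup has finished, and undef before that.
sub finish_lookup {
    my ($state, $address, $listed) = @_;
    if ($listed) {
        # Mask the address the same way the plugin's report does.
        (my $masked = $address) =~ s/\@/[at]/g;
        push @{ $state->{hits} }, $masked;
    }
    $state->{outstanding}--;
    return if $state->{outstanding} > 0;
    return join(',', @{ $state->{hits} });
}

# Three simulated lookups: the first and third address are listed.
my $state = new_state(3);
my @reports;
for my $lookup (['a@example.com', 1], ['b@example.com', 0], ['c@example.com', 1]) {
    my $report = finish_lookup($state, @$lookup);
    push @reports, $report if defined $report;
}
print "$reports[0]\n";   # a[at]example.com,c[at]example.com
```

The state is passed in explicitly here so that each message would start with a
fresh counter and an empty hit list. A caller would also want to check that the
returned string is non-empty before logging a hit, since the last lookup can
finish with nothing listed.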

*** /usr/share/perl5/Mail/SpamAssassin/Plugin/HashBL.pm.orig    2019-11-10 23:09:44.000000000 -0500
--- /usr/share/perl5/Mail/SpamAssassin/Plugin/HashBL.pm 2020-06-07 06:53:06.459383712 -0400
***************
*** 150,155 ****
--- 150,157 ----
    $self->register_eval_rule("check_hashbl_uris");
    $self->register_eval_rule("check_hashbl_bodyre");
    $self->set_config($mailsa->{conf});
+   $self->{outstanding_queries}=0;
+   $self->{hits}="";

    return $self;
  }
***************
*** 341,347 ****
          }
        }
      }
!     my $body = join('', $pms->get_decoded_stripped_body_text_array());
      if ($opts =~ /\bnouri\b/) {
        # strip urls with possible emails inside
        $body =~ s#<?https?://\S{0,255}(?:\@|%40)\S{0,255}# #gi;
--- 343,349 ----
          }
        }
      }
!     my $body = join('', @{$pms->get_decoded_stripped_body_text_array()});
      if ($opts =~ /\bnouri\b/) {
        # strip urls with possible emails inside
        $body =~ s#<?https?://\S{0,255}(?:\@|%40)\S{0,255}# #gi;
***************
*** 643,648 ****
--- 645,651 ----
      value => $value,
      subtest => $subtest,
    };
+   $self->{outstanding_queries}++;
    $ent = $pms->{async}->bgsend_and_start_lookup($lookup, $type, undef, $ent,
      sub { my ($ent, $pkt) = @_; $self->_finish_query($pms, $ent, $pkt); },
      master_deadline => $pms->{master_deadline}
***************
*** 665,676 ****
      if ($rr->address =~ $dnsmatch) {
        dbg("$ent->{rulename}: $ent->{zone} hit '$ent->{value}'");
        $ent->{value} =~ s/\@/[at]/g;
!       $pms->test_log($ent->{value});
        $pms->got_hit($ent->{rulename}, '', ruletype => 'eval');
        $pms->register_async_rule_finish($ent->{rulename});
-       return;
-     }
    }
  }

  # Version features
--- 668,684 ----
      if ($rr->address =~ $dnsmatch) {
        dbg("$ent->{rulename}: $ent->{zone} hit '$ent->{value}'");
        $ent->{value} =~ s/\@/[at]/g;
!       if ($self->{hits}) { $self->{hits} .= ","; }
!       $self->{hits} .= $ent->{value};
!     }
!   }
!   $self->{outstanding_queries}--;
!   if ($self->{outstanding_queries}==0) {
!       $pms->test_log($self->{hits});
        $pms->got_hit($ent->{rulename}, '', ruletype => 'eval');
        $pms->register_async_rule_finish($ent->{rulename});
    }
+   return;
  }

  # Version features
