https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8388

            Bug ID: 8388
           Summary: Capturing tags (4.0.0+) do not work in spamd mode -
                    rules compiled once with warm-up values, never
                    recompiled per message
           Product: Spamassassin
           Version: 4.0.2
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: spamc/spamd
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: Undefined

The CAPTURING TAGS feature (regex named capture groups, introduced in 4.0) is
non-functional when running in spamd mode. Rules that use %{TAGNAME}
substitution are matched against stale tag values, so they do not work
correctly on real emails.

During spamd startup, a warm-up scan is performed using a synthetic message
(with identity compiling.spamassassin.taint.org).

During this scan, capture_rules_replace() in Plugin/Check.pm is called for each
rule that uses capture tag substitution. At this point the tags have valid
values derived from the warm-up message.

The generated code is compiled via eval() into Perl subroutines. The %{TAGNAME}
placeholders are substituted with the warm-up values and the resulting regex is
compiled into the subroutine.

For all subsequent real email messages, these pre-compiled subroutines are
reused directly. capture_rules_replace() is never called again, and there is
no mechanism to recompile the rules with tag values from the actual message
being scanned.

As a result, rules that depend on capturing tags either never match (if the
warm-up values differ from the real message values) or produce false positives
(if they accidentally match).
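
The failure mode can be illustrated with a minimal sketch. This is a Python
analogue written for illustration only, not SpamAssassin's Perl code; the
template, the SENDER tag name, and the helper names substitute() and
compile_once() are all hypothetical.

```python
import re

# Hypothetical rule template with a %{TAGNAME} placeholder.
TEMPLATE = r"^%{SENDER}$"


def substitute(template, tags):
    """Replace each %{NAME} placeholder with the (escaped) tag value."""
    return re.sub(
        r"%\{(\w+)\}",
        lambda m: re.escape(tags[m.group(1)]),
        template,
    )


def compile_once(template, tags):
    """Bake the current tag values into a compiled matcher.

    This mirrors the reported behaviour: the values present at compile
    time are frozen into the returned function.
    """
    pattern = re.compile(substitute(template, tags))
    return lambda text: bool(pattern.search(text))


# Warm-up scan: the rule is compiled with the synthetic message's tag value.
rule = compile_once(TEMPLATE, {"SENDER": "warmup"})

# Real message: its SENDER tag is "alice", but the compiled rule still
# holds "warmup", so the expected match is missed ...
missed = rule("alice")
# ... while text that happens to equal the warm-up value matches.
false_positive = rule("warmup")
```

In this sketch the real message's tag value never reaches the compiled
matcher, which corresponds to the two outcomes described above.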

Confirmed via custom debug logging in Plugin/Check.pm and PerMsgStatus.pm:
- During warm-up: capture_rules_replace() is triggered for all capture tag
rules, tags are substituted with warm-up values.
- During real message processing: tags are correctly computed and available,
but capture_rules_replace() is never called and pre-compiled subroutines with
stale values are used.

The feature works correctly when using the command-line spamassassin tool,
where rules are compiled and executed linearly for each message. The issue is
specific to spamd mode, where rules are compiled once at startup and reused for
all subsequent messages.

A possible fix: recompile the subroutines generated by capture_rules_replace()
per message, using the actual tag values for that message, and preserve the
original regex template containing the %{TAGNAME} placeholders for this
purpose.
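
One way this could look, again as a Python analogue rather than a patch to
Plugin/Check.pm: keep the uncompiled template and build the regex from the
current message's tags at match time. The cache keyed on tag values is an
added assumption to limit recompilation cost, and the names compile_for()
and run_rule() are hypothetical.

```python
import re
from functools import lru_cache

PLACEHOLDER = re.compile(r"%\{(\w+)\}")


@lru_cache(maxsize=256)
def compile_for(template, tag_items):
    """Compile the template with one specific set of tag values.

    tag_items is a tuple of (name, value) pairs so it can be a cache key.
    """
    tags = dict(tag_items)
    source = PLACEHOLDER.sub(
        lambda m: re.escape(tags[m.group(1)]),
        template,
    )
    return re.compile(source)


def run_rule(template, tags, text):
    """Evaluate a template rule against text using this message's tags."""
    names = set(PLACEHOLDER.findall(template))
    # Assumption in this sketch: a rule whose tag is not set for the
    # message cannot match.
    if any(name not in tags for name in names):
        return False
    key = tuple(sorted((name, tags[name]) for name in names))
    return bool(compile_for(template, key).search(text))


# Each message supplies its own tag values; the template is never mutated.
hit = run_rule(r"^%{SENDER}$", {"SENDER": "alice"}, "alice")
miss = run_rule(r"^%{SENDER}$", {"SENDER": "bob"}, "alice")
```

Because the template is preserved, a warm-up scan can still exercise the
code path without fixing its values into later scans.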
