https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7689
Bug ID: 7689
Summary: reduce lint time from quadratic to linear
Product: Spamassassin
Version: 3.4.2
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: spamassassin
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: Undefined
Created attachment 5642
--> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5642&action=edit
Patch to make lint time linear in number of rules instead of quadratic
c.f. Patch to reduce lint time in
https://mail-archives.apache.org/mod_mbox/spamassassin-dev/201902.mbox/browser
Lint checking does not re-use calculated dependencies, rendering the lint time
quadratic in the number of chained dependencies. This patch changes that time
to linear.
/_meta_deps_recurse/ now returns a hash that acts as a set of dependencies.
These dependencies are stored in /$conf->{meta_dependencies}->{$name}/
as they are calculated.
The result is not exactly the same as it was before, as I had some cases
where duplicates in /$conf->{meta_dependencies}->{$name}/ would arise.
This is no longer the case. The sets of dependencies stay identical.
I also considered changing /$conf->{meta_dependencies}->{$name}/ to a
list instead of a string, but that was not necessary to achieve a 4-fold
performance improvement and would require additional refactoring.
In practice, my lint time with a large number of dependencies goes from 20
seconds to 5 seconds on an unloaded machine with a 2.1 GHZ E5-2450 CPU. Memory
usage is mostly unaffected.
I have done tests with the standard ruleset. Since the number of chained
dependancies is low, the lint time is mostly unaffected, from 1.30s to 1.08s.
--
You are receiving this mail because:
You are the assignee for the bug.