One example:
All of my users have set their "optimal" spam thresholds to some number between 0.0 and 10.0.
If the SA developers correctly shift around test scores, add new and/or improved algorithms, etc., and I want to take advantage of the latest, greatest technology by upgrading to the latest version of SA, then without such a mechanism all of my users' spam threshold settings (which they had previously spent a lot of hopeful time setting) will be totally off the mark, and all of a sudden they are likely to miss all kinds of legitimate email messages! I.e., kill me!
Joe, I think you're misunderstanding a few things..
Neither SpamAssassin nor the SpamAssassin developers have any criteria for the "average" email score. It's NOT a consideration at all. It's not even possible for it to BE a consideration because of how the scores are assigned. Average score doesn't even make SENSE.
There's only one consideration in the score tuning: correct placement of spam and ham into bins.
SA's scores are assigned by a genetic algorithm that evolves out the best scores for all the rules as one gigantic simultaneous equation. It tunes this equation to get the most email correctly placed into the spam and ham bins given the default threshold of 5.0, and treats false positives as 100 times worse than false negatives.
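To make that concrete, here's a rough Python sketch of the kind of cost the GA is trying to minimize. This is NOT SA's actual code. The rule names, the tiny corpus, and the function names are all made up for illustration, and I'm assuming a message is flagged when its total is at or above the threshold. The only things taken from the description above are the 5.0 threshold and the 100:1 weighting of FPs against FNs.

```python
THRESHOLD = 5.0   # default threshold described above
FP_WEIGHT = 100   # one false positive counts as much as 100 false negatives


def total_score(hits, scores):
    """Sum the scores of every rule that hit on one message."""
    return sum(scores[rule] for rule in hits)


def weighted_cost(corpus, scores, threshold=THRESHOLD, fp_weight=FP_WEIGHT):
    """Return (cost, false_positives, false_negatives) for one candidate scoreset.

    corpus is a list of (set_of_rule_names, is_spam) pairs.
    Assumption: a message is flagged as spam when its total >= threshold.
    """
    fp = fn = 0
    for hits, is_spam in corpus:
        flagged = total_score(hits, scores) >= threshold
        if flagged and not is_spam:
            fp += 1   # ham landed in the spam bin
        elif not flagged and is_spam:
            fn += 1   # spam landed in the ham bin
    return fp_weight * fp + fn, fp, fn


# Hypothetical hand-labelled corpus: (rules that hit, is it spam?)
corpus = [
    ({"RULE_A", "RULE_B"}, True),
    ({"RULE_A", "RULE_B", "RULE_C"}, True),
    ({"RULE_C"}, False),
    ({"RULE_B"}, True),
    (set(), False),
]

# Two hypothetical candidate scoresets a search might compare.
cautious = {"RULE_A": 3.0, "RULE_B": 2.5, "RULE_C": 1.0}
aggressive = {"RULE_A": 3.0, "RULE_B": 5.0, "RULE_C": 5.0}

print(weighted_cost(corpus, cautious))
print(weighted_cost(corpus, aggressive))
```

The cautious set misses one spam, while the aggressive set catches every spam but flags one ham, so under the 100:1 weighting the cautious set wins easily. Note that nothing in the cost looks at the average score of anything.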
To this logical end, SA should be constantly/automatically shifting this midpoint back to 5.0 anyway.
Not really.. SA will in general average much higher, due to absurdly-high-scoring spams. SA's GA has no criteria that would limit the number of spams that score 30.5. It has criteria to limit the FPs, and this tends to keep the spam rule scores down, but if the GA starts seeing that with its scoreset half of the spam scores over 30, it's not going to start reducing the scores of rules just to try to "fix" the problem.
SA isn't about the "average"; it's about the accuracy.
By trying to fix an average you'd be creating FP and FN problems (serious ones!), with no realistic gains other than cross-version consistency in how much effect an additional point of score has on your hit-rates.
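Here's a toy illustration of that in Python. The scores are invented, and the "fix" shown (scaling every score so the overall mean lands on 5.0) is just one hypothetical way someone might try to pin the average. It isn't anything SA does.

```python
THRESHOLD = 5.0


def count_errors(spam_scores, ham_scores, threshold=THRESHOLD):
    """Return (false_positives, false_negatives) at the given threshold."""
    fn = sum(1 for s in spam_scores if s < threshold)   # spam that slips through
    fp = sum(1 for s in ham_scores if s >= threshold)   # ham that gets flagged
    return fp, fn


# Invented message scores: a couple of absurdly-high-scoring spams drag the mean up.
spam = [6.0, 8.0, 30.0, 40.0]
ham = [0.5, 1.0, 2.0, 4.0]

everything = spam + ham
mean = sum(everything) / len(everything)

# Hypothetical "average fix": scale all scores so the overall mean becomes 5.0.
factor = THRESHOLD / mean
scaled_spam = [s * factor for s in spam]
scaled_ham = [s * factor for s in ham]

print("before:", count_errors(spam, ham))
print("after: ", count_errors(scaled_spam, scaled_ham))
```

Before scaling, everything lands in the right bin. After scaling, the spams that scored 6 and 8 drop below 5.0 and become false negatives, purely because two 30+ spams inflated the mean.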
And let's face it.. fixing the average is futile anyway. As a version ages, the average score drifts, as yours has done. Even if SA had an average of 5, this wouldn't help you, since your average is much lower.
Now, it does look like you have serious false negative problems. THIS is your real problem, not what your "average" is.
You certainly want your average spam score to be well over 5, or you're missing a lot of spam. Perhaps you should consider Bayes training, or adding Mail::SpamcopURI or some of the more conservative SARE rulesets (check their file size and their mass-check data first!) to elevate the spam scores a bit.
