> However, there *is* overlap,
Yes, I expect overlap; otherwise SA would be perfect, with no FPs or FNs.
> and spam and ham (separately, or together) scores are *not* normally distributed.
I was thinking of, and deferring to, the Central Limit Theorem:
"The conclusion of the theorem about the sampling distribution being approximately normal in shape applies no matter what the shape of the population distribution. For large sample sizes, the sampling distribution is approximately normal even if the population distribution is highly skewed or U-shaped." "The CLT can be proved theoretically using advanced mathematical arguments."
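As a rough illustration of the quoted statement, here is a minimal Python sketch. It assumes a made-up, highly skewed population (exponential with mean 1.0), not real SA scores, and the sample size of 100 and the helper names are arbitrary choices for the example. It compares the skewness of individual values with the skewness of the means of many samples. Note that the theorem speaks about the distribution of sample means, not about the individual values themselves.

```python
import random
import statistics


def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def sample_means(rng, sample_size, num_samples):
    """Draw num_samples samples of sample_size values and return each sample's mean."""
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


# Assumption: an exponential population stands in for "highly skewed" data.
rng = random.Random(0)  # fixed seed so the run is repeatable
raw_values = [rng.expovariate(1.0) for _ in range(20000)]
means = sample_means(rng, sample_size=100, num_samples=2000)

print("skewness of individual values:", round(skewness(raw_values), 2))
print("skewness of sample means:     ", round(skewness(means), 2))
```

The individual values should stay strongly skewed, while the sample means should come out much closer to symmetric, which is all the theorem promises here.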
> They don't have to be to calculate the mean of the means, but, in doing so, you're going to have a great deal of false positives.
I would expect to get no more FPs than I originally got.
