Theo Van Dinter wrote:
On Tue, Mar 07, 2006 at 08:44:59PM -0500, Gabriel M. Wachman wrote:
The perceptron (form of neural net used in SA 3.0.0 and higher) is used by the
developers to generate the scores prior to release. 99.9% of end-users do not
ever use the perceptron.

By "do not use" do you mean that it is completely ignored during
classification, or that only the fixed pre-trained neural net is used?

The output of the perceptron is a set of scores (weights), which are used
during classification.  As Matt said, users tend not to generate their own
scores, and so don't run the perceptron; they just use the output from when
it was run pre-release.
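
To make the "fixed scores at classification time" point concrete, here is a minimal sketch of what using pre-generated scores could look like. The rule names and score values are hypothetical and are not SpamAssassin's actual rules or shipped scores. The 5.0 threshold is assumed to correspond to the default required_score setting; treat it as an assumption too. Nothing in this sketch learns or updates anything: it only sums the scores of the rules that fired.

```python
# Hypothetical rule names and scores, for illustration only; these are not
# SpamAssassin's actual rules or shipped score values.
SCORES = {
    "SUBJECT_ALL_CAPS": 1.5,
    "CONTAINS_PILL_AD": 3.0,
    "HTML_ONLY": 1.0,
    "FROM_KNOWN_CONTACT": -2.0,
}

THRESHOLD = 5.0  # assumed spam threshold


def classify(rules_hit):
    """Sum the fixed scores of the rules that fired and compare to the threshold.

    No training happens here: the scores were produced ahead of time.
    """
    total = sum(SCORES[name] for name in rules_hit)
    return total, total >= THRESHOLD


if __name__ == "__main__":
    for hits in (["SUBJECT_ALL_CAPS", "CONTAINS_PILL_AD"],
                 ["SUBJECT_ALL_CAPS", "CONTAINS_PILL_AD", "HTML_ONLY"]):
        total, is_spam = classify(hits)
        print(hits, total, "spam" if is_spam else "ham")
```

The point of the sketch is only that a table of per-rule scores plus a threshold is enough to classify; the perceptron itself is not needed at that stage.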
OK, I think I see where the confusion is: is it a perceptron or a neural net? For anyone who doesn't know, a perceptron is a single-element neural net, if one wanted to call it that, but really it's just a linear classifier. There are two reasons why it seems highly unlikely that SpamAssassin was trained with a neural net.

1) Back-propagation is an algorithm used on multi-layer neural nets, so it does not really make sense in the context of training a perceptron (there's nothing to back-propagate to).

2) You can't save "scores" from a multi-layer neural net as "if feature X is 1, add Y to the score." Neural nets compute complex functions that aren't simple conjunctions of features (and if they are simple conjunctions of features, just use a perceptron).

That may be the crux of my confusion, since if there is a neural net somewhere, it needs to be running inside SpamAssassin during classification (even if it does not update itself). If it's just a perceptron, then I see how this works.
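
To illustrate the distinction, here is a minimal sketch of the textbook perceptron update rule. It is not SpamAssassin's actual score-generation code; the function names, the toy data, and the learning rate are all assumptions made for the example. It shows the two points above: each update touches the weights directly (there is no hidden layer to back-propagate through), and the learned weights can be read off as "if feature i is 1, add weights[i] to the score."

```python
def train_perceptron(examples, n_features, epochs=20, lr=1.0):
    """Textbook perceptron training on binary features.

    examples: list of (features, label) pairs, label is +1 (spam) or -1 (ham).
    Returns (weights, bias). Names and defaults are illustrative assumptions.
    """
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            activation = bias + sum(w * x for w, x in zip(weights, features))
            predicted = 1 if activation >= 0 else -1
            if predicted != label:
                # Mistake-driven update applied straight to the weights;
                # there is no hidden layer, so nothing is back-propagated.
                for i, x in enumerate(features):
                    weights[i] += lr * label * x
                bias += lr * label
    return weights, bias


def predict(weights, bias, features):
    """Classify by a weighted sum: each active feature adds its weight."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation >= 0 else -1


# Toy, linearly separable data (hypothetical): a message is spam
# exactly when feature 0 fires.
EXAMPLES = [
    ([1, 0], 1),
    ([1, 1], 1),
    ([0, 1], -1),
    ([0, 0], -1),
]

if __name__ == "__main__":
    weights, bias = train_perceptron(EXAMPLES, n_features=2)
    print("per-feature scores:", weights, "bias:", bias)
```

Once training stops, the weights are just a list of per-feature numbers that can be saved and summed later, which is not possible in the same way for a net with hidden layers.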

The motivation for this is that I'm comparing a filter a colleague wrote to various other filters (including SpamAssassin) and I want to make sure that the summary I give of SpamAssassin in my paper is accurate. Neural net vs. perceptron is a large distinction in our community, so I wouldn't want to be wrong about it.

Thanks again,
Gabriel
