> What's the real difference between corpus and inoculation in the
> --source options?
>
> I know I'm not very smart, but I couldn't understand the explanation
> below from the README file:
>
> Corpus:
> The message being presented is from a mail corpus, and should be
> trained as a new message, rather than re-trained based on a
> signature. The message's full headers and body will be analyzed and
> the correct classification will be incremented, without its
> opposite being decremented.
>
> You should use corpus only when feeding messages in from corpus, not
> for correcting errors.

Here the message only increments the token counters, as a normal
message would in TEFT mode. DSPAM trains each message only once.

> Inoculation:
> The message being presented is in pristine form, and should
> be trained as an inoculation. Inoculations are a more
> intense mode of training designed to cause DSPAM to
> train the user's metadata repeatedly on previously unknown
> tokens, in an attempt to vaccinate the user from future
> messages similar to the one being presented.
> You should use inoculation only on honeypots and the like.

Here the message is trained again and again until DSPAM gives a 100%
probability that the message is spam. The counters will be strongly
affected.

Sydney.
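
To make the difference concrete, here is a toy sketch in Python. It is
not DSPAM's code or its real algorithm: the ToyFilter class, its naive
per-token ratio, the 0.9 threshold and the round limit are all
assumptions made up for illustration. It only shows the shape of the
two behaviours described above: corpus bumps the counters once,
inoculation keeps training until the filter is confident.

```python
from collections import Counter


class ToyFilter:
    """Hypothetical token-count model; NOT DSPAM's real algorithm."""

    def __init__(self):
        self.spam = Counter()
        self.innocent = Counter()

    def spam_probability(self, tokens):
        # Naive average of per-token spam ratios (an assumption for this
        # sketch); tokens never seen before count as neutral (0.5).
        probs = []
        for tok in set(tokens):
            s, i = self.spam[tok], self.innocent[tok]
            probs.append(s / (s + i) if (s + i) else 0.5)
        return sum(probs) / len(probs) if probs else 0.5

    def train_corpus(self, tokens, is_spam):
        # Corpus-style: increment the matching class exactly once.
        # The opposite class is left untouched (nothing is decremented).
        target = self.spam if is_spam else self.innocent
        target.update(set(tokens))

    def train_inoculation(self, tokens, threshold=0.9, max_rounds=100):
        # Inoculation-style: train as spam again and again until the
        # filter is confident. The threshold stands in for the "100%"
        # mentioned above, which a plain ratio can never reach.
        rounds = 0
        while rounds < max_rounds:
            self.spam.update(set(tokens))
            rounds += 1
            if self.spam_probability(tokens) >= threshold:
                break
        return rounds


if __name__ == "__main__":
    msg = ["cheap", "pills"]

    a = ToyFilter()
    a.innocent.update({"cheap": 3, "pills": 3})  # pretend prior ham history
    a.train_corpus(msg, is_spam=True)
    print("corpus:", a.spam["cheap"], a.spam_probability(msg))

    b = ToyFilter()
    b.innocent.update({"cheap": 3, "pills": 3})
    rounds = b.train_inoculation(msg)
    print("inoculation:", rounds, b.spam["cheap"], b.spam_probability(msg))
```

With the same starting counters, the corpus run leaves the spam count
at 1, while the inoculation run pushes it far higher. That is why
inoculation is best kept for honeypots and the like.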
