On Thu, 19 Nov 2009 11:09:14 +0100, "Steve" <[email protected]> wrote:
> -------- Original Message --------
>> Date: Thu, 19 Nov 2009 10:32:34 +0100
>> From: coma <[email protected]>
>> To: [email protected]
>> Subject: [Dspam-user] Dspam Headers
>
>> Hi,
>>
> Hello Coma,
>
>
>> I have a question on the X-DSPAM-Confidence and X-DSPAM-Probability
>> calculation.
>>
>> I have searched on the net, in the archives (2004-2009) and in the source
>> code (but it's difficult for me), but I have not found a clear answer
>> that allows me to understand.
>>
>> I think the graham & burton algorithms calculate the probability and
>> confidence that the mail is spam or not spam from the frequency of
>> occurrence of each token corresponding to the words of the mail, is
>> that right?
>>
> +/- Yes. It's not the words that count but the tokens. The tokenizer
> decides what is considered a token. Example:
>   WORD:  token -> uniGram (single word)
>   CHAIN: token -> biGram (chained tokens)
>   SBPH:  token -> Sparse Binary Polynomial Hashing
>   OSB:   token -> Orthogonal Sparse biGram
>
> You can read a more detailed description of the tokenizers used in
> DSPAM here ->
> http://sourceforge.net/apps/mediawiki/dspam/index.php?title=Tokenizers
>
>
>> But I don't really understand how this calculation is made.
>>
>> I would like to know at which point a mail is considered spam, I
>> think > 0.5, no?
>>
> Normally: yes, above 0.5.
> A value of 0.5 indicates that a token is neutral. Neither Spam nor Ham.
>
>
>> For X-dspam-Factors, for example: X-dspam-Factors: 15, and + everything,
>> 0.99000, + call us, 0.99000, Judicial, 0.99000, + per day, 0.99000, + and
>> lose, 0.99000, cost + to, 0.99000, ...........
>>
>> I think it's the probability for each word corresponding to a token, but
>> what is the 15?
>>
> 15 is the number of tokens considered. Graham takes the 15 most
> significant tokens and uses them for the computation.
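
To make the "15 most significant tokens" step above a bit more concrete, here is a rough Python sketch of the combination rule from Paul Graham's "A Plan for Spam". It is only an illustration, not DSPAM's actual code: the function name graham_combine, the sample token values and the clamping to 0.01..0.99 are my assumptions, and DSPAM's own implementation may differ in details.

```python
def graham_combine(token_probs, max_tokens=15):
    """Combine per-token spam probabilities, Graham style (illustrative only).

    token_probs: dict mapping token -> spam probability in [0, 1].
    Returns (combined_probability, list of (token, probability) actually used).
    """
    # Clamp to [0.01, 0.99] so no single token can force the result to 0 or 1
    # (assumption taken from Graham's essay, not from DSPAM's source).
    clamped = {tok: min(0.99, max(0.01, p)) for tok, p in token_probs.items()}

    # "Most significant" = farthest from the neutral value 0.5.
    ranked = sorted(clamped.items(), key=lambda kv: abs(kv[1] - 0.5), reverse=True)
    top = ranked[:max_tokens]

    spam = 1.0  # product of p
    ham = 1.0   # product of (1 - p)
    for _, p in top:
        spam *= p
        ham *= 1.0 - p

    return spam / (spam + ham), top


if __name__ == "__main__":
    # Hypothetical token values, loosely modelled on the X-dspam-Factors example.
    sample = {"call us": 0.99, "per day": 0.99, "hello": 0.5, "meeting": 0.2}
    probability, used = graham_combine(sample)
    print("tokens used:", len(used))
    print("spam" if probability > 0.5 else "not spam", probability)
```

A result above 0.5 would be treated as spam, matching the threshold discussed above. Neutral tokens (0.5) rank last and do not change the result even when they are included.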
>
>
>> Thank you in advance if you can help me once again, and sorry again for
>> my strange English.
>>
> I feel guilty! I do! I was the one writing about your English and I feel
> guilty! Don't apologize for your English. Don't do that. Just write
> however you think is right. I will ask you if I don't understand your
> question/comment. Others will probably do the same. So please don't
> apologize any more for your language. I guess I would be terrible in your
> native language and you would surely still try to help me if I wrote in
> your native language (what is that anyway?). We here on the mailing
> list are a bunch of people from all over the world. It's not always easy
> but we manage it :)

You're welcome!! :-)

>
>
>> coma
>>
> Steve

_______________________________________________
Dspam-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-user
