2009/11/19 weber <[email protected]>
> On Thu, 19 Nov 2009 11:09:14 +0100, "Steve" <[email protected]> wrote:
> > -------- Original Message --------
> >> Date: Thu, 19 Nov 2009 10:32:34 +0100
> >> From: coma <[email protected]>
> >> To: [email protected]
> >> Subject: [Dspam-user] Dspam Headers
> >
> >> Hi,
> >>
> > Hello Coma,
> >
> >
> >> I have a question about how X-DSPAM-Confidence and X-DSPAM-Probability
> >> are calculated.
> >>
> >> I have searched the net, the archives (2004-2009) and the source code
> >> (which is difficult for me), but I have not found a clear answer that
> >> helps me understand.
> >>
> >> I think the Graham and Burton algorithms calculate the probability and
> >> confidence that a mail is spam or not spam from the frequency of
> >> occurrence of each token corresponding to the words of the mail. Is that
> >> right?
> >>
> > More or less, yes. It is not the words that count but the tokens. The
> > tokenizer decides what is treated as a token. Examples:
> > WORD:  token -> uniGram (single word)
> > CHAIN: token -> biGram (chained tokens)
> > SBPH:  token -> Sparse Binary Polynomial Hashing
> > OSB:   token -> Orthogonal Sparse biGram
> >
> > A more detailed description of the tokenizers used in DSPAM is here:
> > http://sourceforge.net/apps/mediawiki/dspam/index.php?title=Tokenizers
> >
> >
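To make the WORD/CHAIN distinction above concrete, here is a minimal Python
sketch. It is not DSPAM's tokenizer: the function names, the word-splitting
regex, and the assumption that CHAIN emits each single word plus every
adjacent pair joined with "+" are illustrative only.

```python
import re


def word_tokens(text):
    """WORD-style (assumed): every word is one token (unigram)."""
    return re.findall(r"[A-Za-z0-9']+", text)


def chain_tokens(text):
    """CHAIN-style (assumed): single words plus adjacent pairs (bigrams)."""
    words = word_tokens(text)
    # Pair each word with its right-hand neighbour.
    pairs = [f"{a}+{b}" for a, b in zip(words, words[1:])]
    return words + pairs


if __name__ == "__main__":
    print(word_tokens("call us per day"))
    print(chain_tokens("call us per day"))
```

With the same input, the CHAIN-style function yields more tokens than the
WORD-style one, because each neighbouring pair also becomes a token.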
> >> But I don't really understand how this calculation is made.
> >>
> >> I would like to know at which point a mail is considered spam. I
> >> think above 0.5, no?
> >>
> > Normally, yes: above 0.5.
> > A value of 0.5 indicates that a token is neutral, neither spam nor ham.
> >
> >
> >> For X-dspam-Factors, for example: X-dspam-Factors: 15, and + everything,
> >> 0.99000, + call us, 0.99000, Judicial, 0.99000, + per day, 0.99000, + and
> >> lose, 0.99000, cost + to, 0.99000, ...........
> >>
> >> I think it is the probability for each word corresponding to a token, but
> >> what is the 15?
> >>
> > 15 is the number of tokens considered. Graham takes the 15 most
> > significant tokens and uses them for the computation.
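
As a rough illustration of the answer above, the sketch below picks the 15
token probabilities furthest from the neutral 0.5 and combines them with the
formula from Paul Graham's "A Plan for Spam". The function name, the clamping
to [0.01, 0.99], and the sample values are assumptions for illustration;
DSPAM's actual code may differ.

```python
from math import prod


def graham_combine(token_probs, n=15):
    """Combine per-token spam probabilities into one message probability."""
    # Clamp so that no single token can force the result to exactly 0 or 1.
    clamped = [min(0.99, max(0.01, p)) for p in token_probs]
    # "Most significant" here means furthest away from the neutral value 0.5.
    top = sorted(clamped, key=lambda p: abs(p - 0.5), reverse=True)[:n]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)


if __name__ == "__main__":
    # Hypothetical token probabilities, not taken from a real message.
    probs = [0.99] * 6 + [0.5, 0.4, 0.6]
    score = graham_combine(probs)
    print("spam" if score > 0.5 else "not spam")
```

A message made only of neutral tokens (all 0.5) comes out at exactly 0.5,
which matches the statement that 0.5 means neither spam nor ham.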
>
>
OK, thank you again for your quick answers and your explanations, Steve. You
help me a lot =)
> >> Thank you in advance if you can help me once again, and sorry again
> >> for my strange English.
> >>
> > I feel guilty! I do! I was the one who wrote about your English, and I
> > feel guilty! Don't apologize for your English. Don't do that. Just write
> > however you think is right. I will ask you if I don't understand your
> > question/comment, and others will probably do the same. So please don't
> > apologize any more for your language. I guess I would be terrible in your
> > native language, and you would surely still try to help me if I wrote in
> > your native language (what is that, anyway?). We here on the mailing
> > list are a bunch of people from all over the world. It's not always easy,
> > but we
>
>
Thank you! But don't feel guilty. I'm French and I haven't made much
effort to learn English, so I'm happy when you try to understand me =)
> You're welcome!! :-)
>
> > manage it :)
> >
> >
> >> coma
> >>
> > Steve
>
>
> Thank you weber =)
_______________________________________________
Dspam-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-user