http://bugzilla.spamassassin.org/show_bug.cgi?id=2910

------- Additional Comments From [EMAIL PROTECTED]  2004-01-22 14:25 -------
Created an attachment (id=1723)
 --> (http://bugzilla.spamassassin.org/attachment.cgi?id=1723&action=view)
example comparison between GA-generated and perceptron-generated scores

Out of curiosity, I wrote up a simple Perl script to compare the scores in
one score file to another (see previously attached program, and example
comparison attached to this message).

Here's an example of the output:

SUBJ_REMOVE             0.001   1.887  188600.0
SUB_HELLO               0.001   1.782  178100.0
MARKET_SOLUTION         0.001   0.661   66000.0
HTML_FONTCOLOR_NAME     0.001   0.506   50500.0
UP_TO_OR_MORES          0.001   0.201   20000.0
FROM_HAS_ULINE_NUMS     0.001   0.104   10300.0
SAVE_MONEY              0.001   0.098    9700.0
RECEIVED_CACHEFLOW      0.001   0.057    5600.0
HTML_TAG_EXISTS_PARAM   0.004   0.206    5050.0
LARGE_HEX               0.001   0.042    4100.0
MSGID_THREESIXSIX       0.001   0.039    3800.0
BE_AMAZED               0.001   0.038    3700.0
HTML_COMMENT_8BITS      0.001   0.035    3400.0
DATE_MISSING            0.001   0.034    3300.0

The first column is the rule name. The second column is the score from
file1, in this case the GA scoring. The third column is the perceptron
scoring, and the last column is the percentage change from the score in
the second column to the score in the third column. The list is sorted
in descending order by the absolute value of the percentage change.
The simple difference might be more illustrative, since most of the GA
scores shown here are 0.001, so even a small perceptron score produces
a huge percentage.
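
For anyone who wants to reproduce the comparison without the attachment,
here is a rough sketch of the calculation in Python. It is not the
attached Perl script. The function names and the dict-based input are
assumptions made for illustration, and it skips parsing of the score
files entirely. Rules with a zero score in the first set are skipped,
because the percentage change is undefined for them.

```python
def percent_change(old, new):
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100.0


def compare_scores(scores_a, scores_b):
    """Compare two {rule_name: score} dicts (hypothetical input format).

    Returns (name, old, new, percent) tuples for rules present in both
    dicts, sorted in descending order by absolute percentage change.
    """
    rows = []
    for name in scores_a.keys() & scores_b.keys():
        old, new = scores_a[name], scores_b[name]
        if old == 0:
            continue  # percentage change is undefined for a zero base
        rows.append((name, old, new, percent_change(old, new)))
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows


if __name__ == "__main__":
    # Three rows taken from the output listed above.
    ga = {"SUBJ_REMOVE": 0.001, "HTML_TAG_EXISTS_PARAM": 0.004,
          "LARGE_HEX": 0.001}
    perceptron = {"SUBJ_REMOVE": 1.887, "HTML_TAG_EXISTS_PARAM": 0.206,
                  "LARGE_HEX": 0.042}
    for name, old, new, pct in compare_scores(ga, perceptron):
        print(f"{name:<24}{old:7.3f}{new:8.3f}{pct:10.1f}")
```

To sort by the simple difference instead, the sort key could be changed
to abs(row[2] - row[1]).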

Conclusion: the perceptron model and the GA model calculate wildly
different scores but achieve similar accuracy (per JM's note).

Got a question. In the perceptron score files, there seems to be
an extra column for scores that is always zero. What's the purpose
of that column? Here's an example:

score ACCEPT_CREDIT_CARDS            0 0.002



