So you are assuming that

    @t=sort {abs($b-.5)<=>abs($a-.5)} @t;

is not working correctly?

Thomas


Wim Borghs <wim.bor...@gmail.com>
26.03.2009 12:31
Please reply to: ASSP development mailing list <assp-test@lists.sourceforge.net>
To: ASSP development mailing list <assp-test@lists.sourceforge.net>
Cc:
Subject: [Assp-test] absurd misclassifications by bayesian check

We've been seeing some really absurd classifications done by the bayesian check. Mails that are obviously not spam are deemed spam by it, and vice versa. I think it started when we switched to the version 2 branch.

After having looked into it, it seems to me that the sort in sub BayesOK doesn't work correctly in about half the cases. The sort in that sub is meant to pick the most interesting bayesian scores for calculating the spam probability and bayesian confidence. So if the sort fails, those probability and confidence scores are based on random token-pairs instead of the most interesting ones.

I do know this is absurd, ridiculous, whatever... and part of me feels I must be making some mistake and I will feel shame when I discover what it is... After all, Perl is far from obscure, the ActiveState implementation is far from obscure, and assp isn't an obscure piece of software. So if there were a bug like this somewhere, it would have been found and fixed by now. But nonetheless, this is what it seems like to me now.

I added this code (blue in the original message) to sub BayesOK to check whether @t was sorted correctly and to have some logging on the issue:

    $itime=time-$stime;
    mlog($fh,"info: Bayesian-Check has taken $itime seconds") if $BayesianLog == 2;
    my $index;
    for ($index=0; $index<(@t-1); $index++) {
        if (abs($t[$index]-0.5) < abs($t[$index+1]-0.5)) {
            $this->{spamprob}=0;
            mlog($fh, "Sort-in-sub-BayesOK-incorrect at element $index!!!");
            last;
        }
    }
    if ($index == @t-1) {
        mlog($fh, "Sort-in-sub-BayesOK-correct");
    }
    return $this->{spamprob}=0 if $DoBayesian==2;

This should always log "Sort-in-sub-BayesOK-correct", I think. But it logs "incorrect" in slightly more than half the cases.
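
The check described above can also be run outside ASSP. What follows is a minimal standalone sketch, not ASSP code: it applies the same comparator to a small made-up list and then walks the result the same way the added loop does. The sub names sort_by_interest and first_unsorted_index and the sample values are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: sort probabilities so that the values furthest
# from 0.5 come first, using the same comparator as in sub BayesOK.
sub sort_by_interest {
    my @sorted = sort { abs($b - 0.5) <=> abs($a - 0.5) } @_;
    return @sorted;
}

# Hypothetical helper: return the index of the first element that is
# closer to 0.5 than its successor, or -1 if the list is in order.
sub first_unsorted_index {
    my @t = @_;
    for my $i (0 .. $#t - 1) {
        return $i if abs($t[$i] - 0.5) < abs($t[$i + 1] - 0.5);
    }
    return -1;
}

# Made-up sample scores, not taken from any real mail.
my @sample = (0.5, 0.99, 0.01, 0.7, 0.2, 0.45);
my @sorted = sort_by_interest(@sample);
my $bad    = first_unsorted_index(@sorted);

print $bad < 0
    ? "sorted correctly\n"
    : "out of order at element $bad\n";
```

If this reports a correctly sorted list on the same Perl installation, the comparator itself is probably fine, and the cause may lie in how @t is built or used inside BayesOK.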
Unfortunately we're very busy here, so I don't have a lot of time to spend on this. Some things I thought about that I can check:
- Does the version 1 branch have the same issue?
- Do other uses of sort in assp show the same issue?

I tried uninstalling and reinstalling ActivePerl-5.10.0.1004-MSWin32-x86-287188.msi to see if that fixes it, but it didn't. I also tried ActivePerl v5.8, and it shows the same issue.

------------------------------------------------------------------------------
_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test