[ http://issues.apache.org/jira/browse/JAMES-387?page=comments#action_12358307 ]
Bernd Fondermann commented on JAMES-387:
----------------------------------------

I looked at the Mailet code and found that in buildCorpus(), the instance variable "corpus" is filled with all ham and spam tokens, which appear to be Maps of (String, Integer) pairs. Afterwards, the map is iterated and all values are replaced by Doubles, but while this is running (and taking longer every time) the map can still contain a fair number of Integer-typed values.

If another thread steps into line 591 while this replacement is still in progress, the error could very well occur, because "corpus" is read there. Are new mails fed in a separate thread?

As a very simple solution, the class cast in line 591 could be changed to "Number". Maybe it would also be appropriate to refactor buildCorpus() to work on a local map until it has finished refilling it with Doubles.

Hope this analysis makes some sense and I did not completely misread this whole case... :-)

> Exception in BayesianAnalysis
> -----------------------------
>
>          Key: JAMES-387
>          URL: http://issues.apache.org/jira/browse/JAMES-387
>      Project: James
>         Type: Bug
>   Components: Matchers/Mailets (bundled)
>     Versions: 3.0
>  Environment: James from svn-trunk 2005-08-01.
>               MySQL 4.0
>     Reporter: Stefano Bagnara
>     Assignee: Vincenzo Gianferrari Pini
>     Priority: Minor
>
> Got this exception for every incoming mail:
> 02/08/05 00:39:25 INFO James.Mailet: BayesianAnalysis: Exception: java.lang.Integer
> java.lang.ClassCastException: java.lang.Integer
>       at org.apache.james.util.BayesianAnalyzer.getTokenProbabilityStrengths(BayesianAnalyzer.java:591)
>       at org.apache.james.util.BayesianAnalyzer.computeSpamProbability(BayesianAnalyzer.java:340)
>       at org.apache.james.transport.mailets.BayesianAnalysis.service(BayesianAnalysis.java:289)
>       at org.apache.james.transport.LinearProcessor.service(LinearProcessor.java:407)
>       at org.apache.james.transport.JamesSpoolManager.process(JamesSpoolManager.java:460)
>       at org.apache.james.transport.JamesSpoolManager.run(JamesSpoolManager.java:369)
>       at java.lang.Thread.run(Unknown Source)
> If I clean my spam/ham db the exceptions disappears but they start again when
> the spam/ham db become large.
> My bayesiananalysis_spam contains 200000 rows.
> The following are the spam tokens with higher "occurrences".
> +---------------------------+-------------+
> | token                     | occurrences |
> +---------------------------+-------------+
> | 3D                        |       82151 |
> | a                         |       59953 |
> | the                       |       45295 |
> | FONT                      |       42771 |
> | Content-Type              |       39058 |
> | to                        |       36626 |
> | com                       |       32902 |
> | http                      |       32886 |
> | of                        |       32504 |
> | font                      |       31803 |
> | and                       |       31577 |
> | Content-Transfer-Encoding |       31576 |
> | p                         |       29746 |
> | text                      |       29482 |
> | in                        |       29418 |
> | it                        |       28498 |
> | br                        |       28037 |
> | DIV                       |       27431 |
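
For illustration, the local-map refactoring suggested in the comment might look roughly like the sketch below. This is not the actual BayesianAnalyzer code: the class name CorpusSketch, the probability() method, the typed maps, and the simple spam/(ham+spam) ratio are all assumptions made up for the example. The point it tries to show is only that the new map is filled completely with Doubles before the "corpus" field is reassigned, so a concurrent reader should never see a half-converted map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the analyzer; names are illustrative only.
public class CorpusSketch {

    // volatile so that readers see the fully built map after the swap.
    private volatile Map<String, Double> corpus = Collections.emptyMap();

    // Build the new corpus in a local map and publish it in one assignment.
    public void buildCorpus(Map<String, Integer> ham, Map<String, Integer> spam) {
        Set<String> tokens = new HashSet<>(ham.keySet());
        tokens.addAll(spam.keySet());

        Map<String, Double> local = new HashMap<>();
        for (String token : tokens) {
            int h = ham.getOrDefault(token, 0);
            int s = spam.getOrDefault(token, 0);
            if (h + s == 0) {
                continue; // no occurrences recorded, nothing to store
            }
            // Placeholder ratio, NOT the formula used by BayesianAnalyzer.
            local.put(token, (double) s / (h + s));
        }

        // Readers keep using the old map until this single assignment.
        corpus = Collections.unmodifiableMap(local);
    }

    // Returns null for a token that is not in the corpus.
    public Double probability(String token) {
        return corpus.get(token);
    }
}
```

With this shape, a thread that reads "corpus" while buildCorpus() is running keeps working against the previous, complete map instead of one that still holds Integer values.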