https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8064
Bug ID: 8064
Summary: Sa-learn takes a very long time to learn each letter
Product: Spamassassin
Version: 3.4.6
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Learner
Assignee: dev@spamassassin.apache.org
Reporter: al...@data-netsoft.ru
Target Milestone: Undefined

Created attachment 5845
  --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5845&action=edit
Log and settings

When I train the Bayesian classifier, sa-learn spends a lot of time on each message, more than 30 seconds! I can't understand why this is happening. Here is a piece of the sa-learn log where you can see the delay:

---- begin -----
sa-learn -D --spam --no-sync --username=vmail /tmp/111.msg
...
Oct 14 16:18:14.126 [482455] dbg: uri: canonicalizing parsed uri: mailto:al...@mydomain.com
Oct 14 16:18:14.126 [482455] dbg: uri: cleaned uri: mailto:al...@mydomain.com
Oct 14 16:18:14.126 [482455] dbg: uri: added host: mydomain.com domain: mydomain.com
Oct 14 16:18:14.126 [482455] dbg: uri: canonicalizing domainkeys uri: domainkeys:mydomain.com
Oct 14 16:18:14.126 [482455] dbg: uri: cleaned uri: domainkeys:mydomain.com
Oct 14 16:18:14.126 [482455] dbg: uri: added host: mydomain.com domain: mydomain.com
Oct 14 16:18:14.358 [482455] dbg: bayes: tokenized body: 11 tokens
Oct 14 16:18:14.358 [482455] dbg: bayes: tokenized uri: 5 tokens
Oct 14 16:18:14.358 [482455] dbg: bayes: tokenized invisible: 0 tokens
Oct 14 16:18:14.360 [482455] dbg: bayes: tokenized header: 145 tokens
Oct 14 16:18:49.346 [482455] dbg: bayes: tokenized body: 11 tokens
Oct 14 16:18:49.346 [482455] dbg: bayes: tokenized uri: 5 tokens
Oct 14 16:18:49.346 [482455] dbg: bayes: tokenized invisible: 0 tokens
Oct 14 16:18:49.347 [482455] dbg: bayes: tokenized header: 145 tokens
Oct 14 16:19:25.725 [482455] dbg: bayes: seen (92892bf23689ce621c550aee0ed36d2e8264a618@sa_generated) put
Oct 14 16:19:25.725 [482455] dbg: bayes: learned '92892bf23689ce621c550aee0ed36d2e8264a618@sa_generated', atime: 1665752160
Oct 14 16:19:25.725 [482455] dbg: TxRep: learning a message
Oct 14 16:19:25.725 [482455] dbg: check: pms new, time limit in 228.393 s
Oct 14 16:19:25.725 [482455] dbg: message: using Return-Path header as EnvelopeFrom: 'al...@mydomain.com'
Oct 14 16:19:25.725 [482455] dbg: check: tagrun - tag SENDERDOMAIN is now ready, value: mydomain.com
Oct 14 16:19:25.725 [482455] dbg: check: tagrun - tag AUTHORDOMAIN is now ready, value: mydomain.com
...
...
----- end ------

Both pauses come right after a "bayes: tokenized header" line: about 35 seconds between 16:18:14.360 and 16:18:49.346, and about 36 seconds between 16:18:49.347 and 16:19:25.725. I can't understand why there is a delay at these points.

At first I suspected the AskDNS plugin, so I commented it out in the v340.pre file, but that didn't help. I also tried running SpamAssassin without MySQL; the training delay is about the same.

I have not set any unusual options. Everything was set up from a clean install. I am attaching the full sa-learn debug output, as well as all my configuration files.

Otherwise, SpamAssassin works as it should in a Postfix + Dovecot + SpamAssassin + Roundcube setup (Ubuntu 20.04).

I need to get rid of the delay because, when a user clicks the "spam" button in Roundcube, it takes a very long time until the message is processed. Users complain about such a long delay.
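
For anyone trying to narrow this down from the attached full log, the pauses can be located mechanically by comparing consecutive timestamps rather than reading the output by eye. The following is only a minimal Python sketch and is not part of SpamAssassin: the helper names `parse_line` and `largest_gaps` are made up for illustration, the line format is assumed to match the excerpt above ("Mon DD HH:MM:SS.mmm [pid] dbg: ..."), and the year is an assumption because the debug output does not print one.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the excerpt in this report:
#   "Oct 14 16:18:14.360 [482455] dbg: bayes: tokenized header: 145 tokens"
LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d\.\d{3}) \[\d+\] dbg: (.*)$")


def parse_line(line, year=2022):
    """Return (timestamp, message) for a debug line, or None if it does not match.

    The log prints no year, so `year` is an assumption supplied by the caller.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S.%f")
    return ts, m.group(2)


def largest_gaps(lines, top=3, year=2022):
    """Return the `top` largest pauses as (seconds, message_before, message_after)."""
    events = [e for e in (parse_line(l, year) for l in lines) if e is not None]
    gaps = [
        ((t1 - t0).total_seconds(), m0, m1)
        for (t0, m0), (t1, m1) in zip(events, events[1:])
    ]
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]


# A few lines copied from the excerpt in this report.
SAMPLE = """\
Oct 14 16:18:14.358 [482455] dbg: bayes: tokenized body: 11 tokens
Oct 14 16:18:14.360 [482455] dbg: bayes: tokenized header: 145 tokens
Oct 14 16:18:49.346 [482455] dbg: bayes: tokenized body: 11 tokens
Oct 14 16:18:49.347 [482455] dbg: bayes: tokenized header: 145 tokens
Oct 14 16:19:25.725 [482455] dbg: bayes: seen (92892bf23689ce621c550aee0ed36d2e8264a618@sa_generated) put
""".splitlines()

if __name__ == "__main__":
    for seconds, before, after in largest_gaps(SAMPLE, top=2):
        print(f"{seconds:8.3f} s  after: {before}")
        print(f"{'':11}  before: {after}")
```

Run against the sample lines, this reports pauses of roughly 36.4 and 35.0 seconds, each immediately after a "bayes: tokenized header" line. It only shows where the time goes in the log; it does not say what sa-learn is doing during the pause.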