Hello,

I have Bayes in SQL with separate data for each user (email address) on a test server. SA is triggered by:

    /usr/local/bin/spamc -U /var/run/spamd/spamd.socket -u $local_part@$domain
I looked at the results in the database and have some doubts.

    select * from bayes_vars;

     id | username | spam_count | ham_count | token_count
    ----+----------+------------+-----------+-------------
      1 | a@x.x    |          1 |         8 |        3937
     13 | t@x.x    |          0 |         1 |         356
     15 | i@x.x    |          0 |         1 |         360

Columns skipped: last_expire | last_atime_delta | last_expire_reduce | oldest_token_age | newest_token_age

The account with id 1 is the oldest, created a few days ago, and I trained it myself. Accounts 13 and 15 are new and have each received only one email.

Why do both accounts have a token_count of about 360, and not 1? Are these tokens inherited?

    sa-learn -ut@x.x --dump magic
    0.000          0          3          0  non-token data: bayes db version
    0.000          0          0          0  non-token data: nspam
    0.000          0          1          0  non-token data: nham
    0.000          0        356          0  non-token data: ntokens
    0.000          0 1406154984          0  non-token data: oldest atime
    0.000          0 1406154984          0  non-token data: newest atime
    0.000          0          0          0  non-token data: last journal sync atime
    0.000          0          0          0  non-token data: last expiry atime
    0.000          0          0          0  non-token data: last expire atime delta
    0.000          0          0          0  non-token data: last expire reduction count

For id 15:

    sa-learn -ui@x.x --dump magic
    0.000          0          3          0  non-token data: bayes db version
    0.000          0          0          0  non-token data: nspam
    0.000          0          1          0  non-token data: nham
    0.000          0        360          0  non-token data: ntokens
    0.000          0 1406159567          0  non-token data: oldest atime
    0.000          0 1406159567          0  non-token data: newest atime
    0.000          0          0          0  non-token data: last journal sync atime
    0.000          0          0          0  non-token data: last expiry atime
    0.000          0          0          0  non-token data: last expire atime delta
    0.000          0          0          0  non-token data: last expire reduction count

Probably I should run --sync.

Second question: does SA pay attention to the To, Cc, etc. headers of a mail? I want to do some pre-learning: collect dozens of "super" spam mails from different accounts and train all accounts with a script, in a loop:

    sa-learn --spam --username=$account /spam/dir/*

Will mail addressed to another person be a problem in the learning process?

Best Regards.
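
P.S. To make the pre-learning idea concrete, here is a rough sketch of the loop I have in mind. The accounts file (one address per line), the function name learn_spam_for_all and the SA_LEARN override are my own placeholders and not part of SA. Only the sa-learn --spam --username call is the one quoted above.

```shell
#!/bin/sh
# Sketch only: feed the same spam corpus to every account's Bayes data.
# Assumptions (hypothetical, not part of SpamAssassin):
#   - the accounts file lists one address per line
#   - SA_LEARN may be set to another command, e.g. SA_LEARN=echo for a dry run

learn_spam_for_all() {
    accounts_file=$1
    spam_dir=$2
    while IFS= read -r account; do
        # Skip blank lines and lines starting with '#'.
        case $account in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so the command cannot consume the accounts list.
        "${SA_LEARN:-sa-learn}" --spam --username="$account" "$spam_dir"/* \
            < /dev/null || return 1
    done < "$accounts_file"
}
```

It would be called as learn_spam_for_all /path/to/accounts.txt /spam/dir. Setting SA_LEARN=echo first prints the commands instead of running them, so the list of accounts can be checked before anything is written to the database.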