Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-25 Thread John Hardin
On Thu, 25 Feb 2016, RW wrote: On Thu, 25 Feb 2016 13:58:03 -0800 (PST) John Hardin wrote: On Thu, 25 Feb 2016, Steve wrote: b) Configure spamc -C report (run as any user) to initiate training of the amavis bayes database (in ~amavis/.spamassassin) ? That would probably be a code change

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-25 Thread RW
On Thu, 25 Feb 2016 13:58:03 -0800 (PST) John Hardin wrote: > On Thu, 25 Feb 2016, Steve wrote: > > b) Configure spamc -C report (run as any user) to initiate > > training of the amavis bayes database (in ~amavis/.spamassassin) ? > > That would probably be a code chang

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-25 Thread Reindl Harald
On 25.02.2016 at 22:58, John Hardin wrote: b) Configure spamc -C report (run as any user) to initiate training of the amavis bayes database (in ~amavis/.spamassassin) ? That would probably be a code change, unless you want to write a wrapper script that calls the real spamc and then sa

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-25 Thread John Hardin
't working - and autolearn became self-reinforcing as a result. I had been misinterpreting my logs (face-palm)! I now see that the training initiated by spamc (behind dovecot antispam) was trying to train the bayes database in ~/.spamassassin/bayes* - but amavis was using the bayes database i

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-25 Thread Bill Cole
yes, so I would say that autolearn is probably the cause of this behavior. Note that the bayes score doesn't contribute to the autolearning decision to avoid positive feedback, but if there are no non-Bayes spam signs and the message scores lightly negative like that one does, it can be le

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-25 Thread RW
On Thu, 25 Feb 2016 00:41:04 + Steve wrote: > On 24/02/2016 22:59, John Hardin wrote: > > How do you train your Bayes? Autolearn? General user submissions? > > Trusted user submissions? Only you, from only your personal mail? > Only my personal mailbox *really* matte

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-24 Thread Reindl Harald
positive spam) with the result of purging the whole bayes (commercial appliance using SpamAssassin as one part). After building up my own spam filter solution, keeping the whole corpus and training *only* by hand with no autolearning/autoexpire, the bayes is 100% trustworthy and can be scored as nearly po

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-24 Thread John Hardin
On Thu, 25 Feb 2016, Reindl Harald wrote: 7.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist [URIs: leslie-bib***b.org] That, too. Steve, you might consider boosting your local score for URIBL_BLACK. :) -- John Hardin KA7OHZ

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-24 Thread John Hardin
y the cause of this behavior. Note that the bayes score doesn't contribute to the autolearning decision to avoid positive feedback, but if there are no non-Bayes spam signs and the message scores lightly negative like that one does, it can be learned as ham. That would make any subsequent sim
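
The autolearn behavior described in this snippet can be illustrated with a simplified sketch. It assumes thresholds of 0.1 (ham) and 12.0 (spam), which to my understanding match the SpamAssassin defaults for `bayes_auto_learn_threshold_nonspam` and `bayes_auto_learn_threshold_spam`. Real autolearn applies additional checks and a separate score set, so treat this as an illustration only. The rule hits are hypothetical, with scores borrowed from elsewhere in the thread.

```python
# Simplified sketch of the autolearn decision; not SpamAssassin's actual code.
# Assumption: thresholds mirror the commonly documented defaults.
HAM_THRESHOLD = 0.1
SPAM_THRESHOLD = 12.0


def autolearn_decision(rule_scores):
    """Return 'ham', 'spam' or 'no' for a {rule_name: score} mapping."""
    # Bayes rules are left out of the sum so the classifier cannot
    # reinforce its own earlier verdicts.
    non_bayes = sum(score for name, score in rule_scores.items()
                    if not name.startswith("BAYES_"))
    if non_bayes < HAM_THRESHOLD:
        return "ham"
    if non_bayes > SPAM_THRESHOLD:
        return "spam"
    return "no"


# Hypothetical spam message that hits no non-Bayes spam rules.
hits = {"BAYES_00": -1.9, "HTML_MESSAGE": 0.0, "DKIM_VALID_AU": -0.1}
print(autolearn_decision(hits))
```

With these hits the non-Bayes total is slightly negative, so the message would be learned as ham, which is the failure mode the thread describes.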

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-24 Thread Reindl Harald
net] 3.0 INVESTMENT_ADVICE BODY: Message mentions investment advice 1.5 BAYES_50 BODY: Bayes spam probability is 40 to 60% [score: 0.5002] 0.0 HTML_MESSAGE BODY: HTML included in message -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signa

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-24 Thread Steve
itives all match BAYES_00 - attracting a default score of -1.9. BAYES_00 seems to be at the crux of the misclassification. Is there a way to delve into why these messages have been allocated such a low bayes score - while (to a human) appearing blatant, simple, spam on "vanilla" spam t

Re: Spamassassin Bayes... "why give that spam that score???"

2016-02-24 Thread John Hardin
The false positives all match BAYES_00 - attracting a default score of -1.9. BAYES_00 seems to be at the crux of the misclassification. Is there a way to delve into why these messages have been allocated such a low bayes score - while (to a human) appearing blatant, simple, spam on "vanilla"

Spamassassin Bayes... "why give that spam that score???"

2016-02-24 Thread Steve
cts. * The emails consist fairly plain HTML and appear not to employ any significant obfuscation. * I have tried to train spamassassin with many of these spam samples - without any effect. * The bayes database is updated. The bayes_journal (37k), bayes_seen (5.2mb) and bayes_toks (5.4mb) files all h

Re: Error when trying to re-use Bayes database from one server to another

2016-02-16 Thread Kris Deugau
John Hardin wrote: > On Fri, 12 Feb 2016, Kris Deugau wrote: > >> In general though, if you're operating at a scale where one server isn't >> enough to handle your SA load, you may want to start thinking about SQL >> for Bayes, which can be shared much more easil

[Solved] Re: Error when trying to re-use Bayes database from one server to another

2016-02-14 Thread Sebastian Arcus
estions. Just to confirm that in the end I decided not to mess too much with a working system and didn't upgrade to db48 on the older system. I went down the route of backing up and restoring the bayes database using sa-learn - which worked perfectly fine. There is still the question of
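
The backup-and-restore route mentioned in this snippet can be sketched as follows. This is a minimal illustration, assuming `sa-learn` is on the PATH and is run as the user that owns the Bayes files; the dump file name is hypothetical and the helper names are mine, not from the thread.

```python
import subprocess


def backup_command():
    # `sa-learn --backup` writes a text dump of the Bayes data to stdout.
    return ["sa-learn", "--backup"]


def restore_command(dump_path):
    # `sa-learn --restore FILE` rebuilds the local database from that dump.
    return ["sa-learn", "--restore", dump_path]


def backup_to_file(dump_path):
    """Run on the donor machine; dump_path is a hypothetical file name."""
    with open(dump_path, "w") as out:
        subprocess.run(backup_command(), stdout=out, check=True)


def restore_from_file(dump_path):
    """Run on the receiving machine after copying the dump across."""
    subprocess.run(restore_command(dump_path), check=True)
```

Because the dump is plain text, it sidesteps differences in the on-disk Berkeley DB format between the two machines.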

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Sebastian Arcus
On 13/02/16 18:58, Bill Cole wrote: On 13 Feb 2016, at 3:49, Sebastian Arcus wrote: Thank you. The donor machine has db42, db44 and db44 packages installed, Based on the question below, I'll assume the second db44 above was a typo for db48, i.e. a Berkeley DB v4.8.x package. Yes - sorry, y

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Bill Cole
On 13 Feb 2016, at 3:49, Sebastian Arcus wrote: Thank you. The donor machine has db42, db44 and db44 packages installed, Based on the question below, I'll assume the second db44 above was a typo for db48, i.e. a Berkeley DB v4.8.x package. Tangentially: that's a risky mess. It's a common pr

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Reindl Harald
ation database to the customer, for use on their own machines. Sounds perfectly reasonable to share this as a commercial service, to me that's exactly what happens, fetch the bayes over a webservice when the checksum has changed - nobody on both sides want to use redis for a million reasons

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Antony Stone
On Saturday 13 February 2016 at 16:50:56, Reindl Harald wrote: > a different company with its own infrastructure has no business to > ssh-tunneling or access *my server* in any other way directly > > DIFFERENT NETWORKS > DIFFERENT INFRASTRUCTURES > DIFFERENT OWNERS > DIFFERENT ADMINS > > NO DIR

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Reindl Harald
in your infrastructure but it won't work in the cases we have in real life where another company with independent infrastructure fetches our bayes in the context of a subscription over webservices, moves the files into a temp folder and trains its own samples before replacing the local bayes with the result

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Marc Perkel
use Redis when it comes to different servers in different networks for different clients BDB works fine and reliably, at least without autolearning and autoexpire and having the bayes-db path read-only for the running spamd with namespaces 0 60388 SPAM 0 21651 HAM 0 2510401 TOKEN

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Reindl Harald
rs in different networks for different clients BDB works fine and reliably, at least without autolearning and autoexpire and having the bayes-db path read-only for the running spamd with namespaces 0 60388 SPAM 0 21651 HAM 0 2510401 TOKEN total 73M -rw------- 1 sa-milt s

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Marc Perkel
nts BDB works fine and reliably, at least without autolearning and autoexpire and having the bayes-db path read-only for the running spamd with namespaces 0 60388 SPAM 0 21651 HAM 0 2510401 TOKEN total 73M -rw------- 1 sa-milt sa-milt 10M 2016-02-13 09:12 bayes_seen

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Sebastian Arcus
DB (Hash, version 9, native byte-order) On the receiver machine, but with bayes files created locally: #file bayes_seen bayes_seen: Berkeley DB (Hash, version 8, native byte-order) # file bayes_toks bayes_toks: Berkeley DB (Hash, version 8, native byte-order) Could the hash version account for
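
The comparison being made here (hash version 9 on the donor versus version 8 on the receiver) can be automated with a short check. The sketch below parses `file` output of the shape quoted in the snippet; the sample strings are modeled on that output, and the function name is an assumption of mine.

```python
import re

# Matches the part of `file` output quoted in the thread, e.g.
# "bayes_toks: Berkeley DB (Hash, version 8, native byte-order)".
_HASH_VERSION = re.compile(r"Berkeley DB \(Hash, version (\d+),")


def hash_version(file_output):
    """Return the Berkeley DB hash version as an int, or None if absent."""
    match = _HASH_VERSION.search(file_output)
    return int(match.group(1)) if match else None


donor = "bayes_toks: Berkeley DB (Hash, version 9, native byte-order)"
receiver = "bayes_toks: Berkeley DB (Hash, version 8, native byte-order)"

# A mismatch suggests the files were written by different Berkeley DB releases.
mismatch = hash_version(donor) != hash_version(receiver)
print(mismatch)
```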

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Reindl Harald
t least without autolearning and autoexpire and having the bayes-db path read-only for the running spamd with namespaces 0 60388 SPAM 0 21651 HAM 0 2510401 TOKEN total 73M -rw------- 1 sa-milt sa-milt 10M 2016-02-13 09:12 bayes_seen -rw------- 1 sa-milt sa-milt 81M 2

Re: Error when trying to re-use Bayes database from one server to another

2016-02-13 Thread Reindl Harald
On 13.02.2016 at 02:46, Benny Pedersen wrote: On 12. feb. 2016 20.06.52 Marc Perkel wrote: # ls -l /var/spool/spamd/bayes/ Set bayes path without bayes why do you always give wrong advice? https://wiki.apache.org/spamassassin/SiteWideBayesSetup Note that the argument to bayes_path is

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Bill Cole
) On the receiver machine, but with bayes files created locally: #file bayes_seen bayes_seen: Berkeley DB (Hash, version 8, native byte-order) # file bayes_toks bayes_toks: Berkeley DB (Hash, version 8, native byte-order) Could the hash version account for the errors I am seeing? Absolutely

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Marc Perkel
For what it's worth - just used Redis. Redis is the only thing that's worked reliably for me.

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Benny Pedersen
On 12. feb. 2016 20.06.52 Marc Perkel wrote: # ls -l /var/spool/spamd/bayes/ Set bayes path without bayes bayes: cannot open bayes databases /var/spool/spamd/bayes/bayes_* R/W: Remove bayes from local.cf Sent with AquaMail for Android http://www.aqua-mail.com

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Sebastian Arcus
On 12/02/16 21:40, Kris Deugau wrote: Sebastian Arcus wrote: On 12/02/16 20:31, Antony Stone wrote: On Friday 12 February 2016 at 17:29:23, Sebastian Arcus wrote: As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread John Hardin
On Fri, 12 Feb 2016, Kris Deugau wrote: In general though, if you're operating at a scale where one server isn't enough to handle your SA load, you may want to start thinking about SQL for Bayes, which can be shared much more easily than pushing file-based Bayes data around. Or Re

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Reindl Harald
On 12.02.2016 at 22:40, Kris Deugau wrote: In general though, if you're operating at a scale where one server isn't enough to handle your SA load, you may want to start thinking about SQL for Bayes, which can be shared much more easily than pushing file-based Bayes data around

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Reindl Harald
that case ignore my last post, which assumed it was an SELinux problem. Could the problem be down to differing versions of the bayes database manager? If so, it may be worth letting SA set up an empty Bayes database and using the backup tool to make a backup on the source system in a version-agnostic f

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Kris Deugau
Sebastian Arcus wrote: > On 12/02/16 20:31, Antony Stone wrote: >> On Friday 12 February 2016 at 17:29:23, Sebastian Arcus wrote: >> >>> As per advice from this list, I have been re-using my bayes databases on >>> several different servers running SA. On one of the

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Martin Gregorie
hich assumed it was an SELinux problem. Could the problem be down to differing versions of the bayes database manager? If so, it may be worth letting SA set up an empty Bayes database and using the backup tool to make a backup on the source system in a version-agnostic format, e.g. as a CSV file, and th

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Martin Gregorie
On Fri, 2016-02-12 at 15:49 -0500, Bowie Bailey wrote: > On 2/12/2016 3:45 PM, Sebastian Arcus wrote: > > On 12/02/16 20:31, Antony Stone wrote: > > > On Friday 12 February 2016 at 17:29:23, Sebastian Arcus wrote: > > > > > > > As per advice from t

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Sebastian Arcus
On 12/02/16 20:49, Bowie Bailey wrote: On 2/12/2016 3:45 PM, Sebastian Arcus wrote: On 12/02/16 20:31, Antony Stone wrote: On Friday 12 February 2016 at 17:29:23, Sebastian Arcus wrote: As per advice from this list, I have been re-using my bayes databases on several different servers running

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Bowie Bailey
On 2/12/2016 3:45 PM, Sebastian Arcus wrote: On 12/02/16 20:31, Antony Stone wrote: On Friday 12 February 2016 at 17:29:23, Sebastian Arcus wrote: As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Sebastian Arcus
On 12/02/16 20:31, Antony Stone wrote: On Friday 12 February 2016 at 17:29:23, Sebastian Arcus wrote: As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the database is not accepted. Are the servers all

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Reindl Harald
On 12.02.2016 at 21:31, Antony Stone wrote: On Friday 12 February 2016 at 17:29:23, Sebastian Arcus wrote: As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the database is not accepted. Are the

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Antony Stone
On Friday 12 February 2016 at 17:29:23, Sebastian Arcus wrote: > As per advice from this list, I have been re-using my bayes databases on > several different servers running SA. On one of the servers though, the > database is not accepted. Are the servers all the same distro, release an

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Sebastian Arcus
issions I would look at the directories. see previous mail - that was already verified looking closer "No such file or directory" is not a permission problem there was a hint "please re-run with -D" at least re-use bayes on different servers, even over different operating

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Reindl Harald
see previous mail - that was already verified looking closer "No such file or directory" is not a permission problem there was a hint "please re-run with -D" at least re-use bayes on different servers, even over different operating systems is no problem, or bayes is runni

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Marc Perkel
: As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the database is not accepted. I re-transferred them several times over ssh, to make sure they were not corrupted. The database files are in the correct

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Sebastian Arcus
On 12/02/16 16:59, Reindl Harald wrote: On 12.02.2016 at 17:29, Sebastian Arcus wrote: As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the database is not accepted. I re-transferred them several

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Reindl Harald
On 12.02.2016 at 17:29, Sebastian Arcus wrote: As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the database is not accepted. I re-transferred them several times over ssh, to make sure they were not

Re: Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread John Hardin
On Fri, 12 Feb 2016, Sebastian Arcus wrote: # ls -l /var/spool/spamd/bayes/ total 5912 -rw-rw-rw- 1 spamd spamd 1310720 2016-02-09 08:42 bayes_seen -rw-rw-rw- 1 spamd spamd 4739072 2016-02-09 08:43 bayes_toks When I try to learn a new message on the receiving server (where I moved the bayes

Error when trying to re-use Bayes database from one server to another

2016-02-12 Thread Sebastian Arcus
As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the database is not accepted. I re-transferred them several times over ssh, to make sure they were not corrupted. The database files are in the correct

Re: Can your bayes do this?

2016-01-24 Thread Dave Warren
On 2016-01-20 22:21, Marc Perkel wrote: Here is a list of 3494938 words and phrases used in the subject line of SPAM and never seen in the subject line of HAM http://www.junkemailfilter.com/data/subject-spam.txt I thought I'd take you up on this using a combination of my corpus, and the othe

Re: Can your bayes do this?

2016-01-21 Thread John Hardin
On Thu, 21 Jan 2016, RW wrote: On Thu, 21 Jan 2016 08:53:10 -0800 (PST) John Hardin wrote: There was an improvement in FP and FN from two tokens. The marginal improvement from three doesn't seem worth it. The improvement from 2 to 3 is more substantial than from 1 to 2 287/160 = 1.79 160/6

Re: Can your bayes do this?

2016-01-21 Thread Reindl Harald
is-training when previously as BAYES_999 or BAYES_00 classified samples change their result that's done with a dedicated SA-instance doing only bayes test and nothing else fed by "spamc" and parsing the outputs, takes around 1 hour on the current hardware

Re: Can your bayes do this?

2016-01-21 Thread RW
On Thu, 21 Jan 2016 08:53:10 -0800 (PST) John Hardin wrote: > There was an improvement in FP and FN from two tokens. The marginal > improvement from three doesn't seem worth it. The improvement from 2 to 3 is more substantial than from 1 to 2 287/160 = 1.79 160/69 = 2.3 Whether any of thi
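
The arithmetic quoted here can be checked directly. The sketch assumes the three numbers (287, 160, 69) are error counts for one-, two- and three-token phrases respectively; the snippet does not say exactly what they measure, so the dictionary keys are my reading of the discussion.

```python
# Counts as quoted in the thread; what they count is not stated in the snippet.
# Assumption: key = number of tokens per phrase.
counts = {1: 287, 2: 160, 3: 69}


def improvement(before, after):
    """Factor by which the count shrinks when going from `before` to `after`."""
    return counts[before] / counts[after]


one_to_two = improvement(1, 2)    # 287 / 160
two_to_three = improvement(2, 3)  # 160 / 69
print(round(one_to_two, 2), round(two_to_three, 1))
```

The second factor is the larger of the two, which is the point being made: going from two tokens to three reduces the count proportionally more than going from one to two.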

Re: Can your bayes do this?

2016-01-21 Thread Reindl Harald
corpus the database size is dominated by ephemeral tokens which makes the situation look worse than it is. It depends what you want. I don't care about an extra 100 MB of disk space and a few milliseconds if it gives any measurable improvement. Personally I wouldn't like to see Bayes g

Re: Can your bayes do this?

2016-01-21 Thread John Hardin
ephemeral tokens which makes the situation look worse than it is. It depends what you want. I don't care about an extra 100 MB of disk space and a few milliseconds if it gives any measurable improvement. Personally I wouldn't like to see Bayes go multi-word because it would likely end-up

Re: Can your bayes do this?

2016-01-21 Thread RW
rom corpus the database size is dominated by ephemeral tokens which makes the situation look worse than it is. It depends what you want. I don't care about an extra 100 MB of disk space and a few milliseconds if it gives any measurable improvement. Personally I wouldn't like to see

Re: Can your bayes do this?

2016-01-21 Thread Reindl Harald
On 21.01.2016 at 14:17, RW wrote: On Thu, 21 Jan 2016 13:45:08 +0100 Christian Laußat wrote: On 21.01.2016 13:19, Reindl Harald wrote: not entirely when "Currently, SA's bayes tokens are single words" from https://mail-archives.apache.org/mod_mbox/spamassassin-dev/201211.mbox

Re: Can your bayes do this?

2016-01-21 Thread RW
On Thu, 21 Jan 2016 13:45:08 +0100 Christian Laußat wrote: > On 21.01.2016 13:19, Reindl Harald wrote: > > not entirely when "Currently, SA's bayes tokens are single words" from > > https://mail-archives.apache.org/mod_mbox/spamassassin-dev/201211.mbox/%3c509d55a8.3

Re: Can your bayes do this?

2016-01-21 Thread Dianne Skoll
pams, but also in 22 spams. While 1400/1422 still makes the token useful for Bayes, his algorithm would discount it altogether because it's not "pure" ham. Regards, Dianne.
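
The contrast drawn here, between keeping counts and keeping only "pure" lists, can be made concrete. The sketch below uses a plain ratio of counts, which is a deliberate simplification and not SpamAssassin's actual token-probability computation; the function names are hypothetical.

```python
def ham_fraction(ham_count, spam_count):
    """Share of a token's occurrences that were in ham (simplified ratio)."""
    total = ham_count + spam_count
    return ham_count / total if total else None


def in_pure_ham_list(ham_count, spam_count):
    """A 'never seen in spam' list keeps a token only if spam_count is zero."""
    return ham_count > 0 and spam_count == 0


# Counts from the thread: 1400 on the ham side, 22 on the spam side.
print(round(ham_fraction(1400, 22), 3))
print(in_pure_ham_list(1400, 22))
```

The token is still a strong ham indicator by count, yet a pure-list approach drops it entirely because of the 22 spam occurrences.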

Re: Can your bayes do this?

2016-01-21 Thread Dianne Skoll
ject line of HAM [snip] And what, exactly, is your point? Bayes would handle that just fine. Tokens in your first list would score 0.00 for spam probability and tokens in your second list would score 1.00 and Bayes would be great. Regards, Dianne.

Re: Can your bayes do this?

2016-01-21 Thread Christian Laußat
On 21.01.2016 13:19, Reindl Harald wrote: not entirely when "Currently, SA's bayes tokens are single words" from https://mail-archives.apache.org/mod_mbox/spamassassin-dev/201211.mbox/%3c509d55a8.30...@gmail.com%3E is still true please review that response below and consider

Re: Can your bayes do this?

2016-01-21 Thread RW
> > > > >"ambulatory care" -> only in ham > >"aall cards" -> only in spam > > > > and > > > > "ambulatory care" occurs 16 times in ham and 0 times in spam > > "aall cards" occurs

Re: Can your bayes do this?

2016-01-21 Thread Reindl Harald
tween "ambulatory care" -> only in ham "aall cards" -> only in spam and "ambulatory care" occurs 16 times in ham and 0 times in spam "aall cards" occurs 0 times in ham and 3 times in spam is that you have discarded the count informat

Re: Can your bayes do this?

2016-01-21 Thread Antony Stone
"ambulatory care" occurs 16 times in ham and 0 times in spam > >"aall cards" occurs 0 times in ham and 3 times in spam > > is that you have discarded the count information. Plus, the "never in ham" and "never in spam" lists omit any me

Re: Can your bayes do this?

2016-01-21 Thread RW
On Wed, 20 Jan 2016 22:21:49 -0800 Marc Perkel wrote: > OK - Just to show you this isn't Bayesian - see if you can do this. > > Here is a list of 5505874 words and phrases used in the subject line > of HAM and never seen in the subject line of SPAM > > http://www.junkemailfilter.com/data/subject

Re: Can your bayes do this?

2016-01-21 Thread Reindl Harald
" and when you don't stop advertising that aggressively you are classified as a spammer too. 177 MB of subjects only? Well, not really impressive given that I easily get the same results with an 81 MB bayes-db containing the *complete* junk of 1.5 years while only selected ham (reported w

Re: Can your bayes do this?

2016-01-20 Thread Matthias Apitz
On Wednesday, January 20, 2016 at 10:21:49PM -0800, Marc Perkel wrote: > OK - Just to show you this isn't Bayesian - see if you can do this. > > Here is a list of 5505874 words and phrases used in the subject line of > HAM and never seen in the subject line of SPAM > > http://www.junk

Can your bayes do this?

2016-01-20 Thread Marc Perkel
OK - Just to show you this isn't Bayesian - see if you can do this. Here is a list of 5505874 words and phrases used in the subject line of HAM and never seen in the subject line of SPAM http://www.junkemailfilter.com/data/subject-ham.txt Here is a list of 3494938 words and phrases used in th

Re: The difference between my Evolution filter and Bayes is ...

2016-01-20 Thread Dianne Skoll
On Wed, 20 Jan 2016 12:01:59 -0800 Marc Perkel wrote: > Bayes compares the test message to what's in the Ham corpus and > what's in the Spam corpus and comes up with a number indicating it's > more like one or the other. As I mentioned earlier, your filter is

The difference between my Evolution filter and Bayes is ...

2016-01-20 Thread Marc Perkel
Bayes compares the test message to what's in the Ham corpus and what's in the Spam corpus and comes up with a number indicating it's more like one or the other. Evolution matches the Ham corpus and does not match the Spam corpus to get a ham score. Then it matches the spam corpus

Re: Is BAYES filtering working? Having doubts.

2015-12-30 Thread Bill Cole
On 30 Dec 2015, at 8:37, RW wrote: On Tue, 29 Dec 2015 20:41:31 -0500 Bill Cole wrote: On 29 Dec 2015, at 20:02, Ian Zimmerman wrote: esired result. Clearly you can do the su magic if needed. Um, no. Neither su nor sudo magically changes the permissions or ownership of files. No, but

Re: Is BAYES filtering working? Having doubts.

2015-12-30 Thread RW
On Tue, 29 Dec 2015 20:41:31 -0500 Bill Cole wrote: > On 29 Dec 2015, at 20:02, Ian Zimmerman wrote: > esired result. > > > > Clearly you can do the su magic if needed. > > Um, no. > > Neither su nor sudo magically changes the permissions or ownership of > files. No, but sudo allows sa-

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Reindl Harald
On 30.12.2015 at 03:11, Ian Zimmerman wrote: On 2015-12-29 20:41 -0500, Bill Cole wrote: Neither su nor sudo magically changes the permissions or ownership of files. If you pass filenames as arguments they must be readable by the user actually running sa-learn, which is the *unprivileged* us

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Ian Zimmerman
On 2015-12-29 20:41 -0500, Bill Cole wrote: > Neither su nor sudo magically changes the permissions or ownership of > files. If you pass filenames as arguments they must be readable by the > user actually running sa-learn, which is the *unprivileged* user > handling the system-wide BayesDB ("amavi

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Bill Cole
On 29 Dec 2015, at 20:02, Ian Zimmerman wrote: On 2015-12-29 19:44 -0500, Bill Cole wrote: On 29 Dec 2015, at 18:54, Ian Zimmerman wrote: In fact sa-learn accepts multiple named arguments on the command line, so the alternative I use is to go through the spambox N files at a time in a shel

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Ian Zimmerman
On 2015-12-29 19:44 -0500, Bill Cole wrote: > On 29 Dec 2015, at 18:54, Ian Zimmerman wrote: > > >In fact sa-learn accepts multiple named arguments on the command line, > >so the alternative I use is to go through the spambox N files at a time > >in a shell loop. (I have N=100 but obviously this

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Bill Cole
On 29 Dec 2015, at 18:54, Ian Zimmerman wrote: In fact sa-learn accepts multiple named arguments on the command line, so the alternative I use is to go through the spambox N files at a time in a shell loop. (I have N=100 but obviously this depends.) Which successfully ignores the original i
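
The batching approach quoted here, passing N files at a time to `sa-learn`, can be sketched as follows. The code only builds the command lines and does not run them; N=100 comes from the quoted message, while the function names and file names are hypothetical.

```python
def batches(paths, n=100):
    """Yield successive slices of at most n paths."""
    for start in range(0, len(paths), n):
        yield paths[start:start + n]


def learn_commands(paths, kind="spam", n=100):
    """Build one sa-learn argument list per batch of message files."""
    flag = "--spam" if kind == "spam" else "--ham"
    return [["sa-learn", flag, *chunk] for chunk in batches(paths, n)]


# Hypothetical spambox with 250 message files.
spambox = [f"msg{i:04d}.eml" for i in range(250)]
commands = learn_commands(spambox)
print(len(commands))
```

Each argument list could then be handed to `subprocess.run`, as the user that can both read the files and write the intended Bayes database.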

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Ian Zimmerman
On 2015-12-29 17:50 -0500, Bill Cole wrote: > Yes, with the advantage of using Mail::SpamAssassin::Util::secure_tmpfile() > rather > than whatever I happen to roll up in a bit of Q&D shell that I never get > around to > reviewing for edge cases... > > The main reason to do something like that i

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Bill Cole
lso possible to train via spamc. Yes. IF you run spamd and it's how your system-wide SA filtering is done already, that's arguably the best way to do ad hoc (re)training since you can be sure it's hitting the right DB and you can feed it in parallel. Personally I'd avoid

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Bill Cole
On 29 Dec 2015, at 8:28, Jude DaShiell wrote: With spamassassin, is it possible to have the filter show counts of number of messages sent to spam, number of messages sent to ham, and total number of messages processed that a user can check? Since SpamAssassin is a suite of Perl modules and an

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread RW
ust a directory of files. If you need to train an arbitrary selection of files, you could symlink them into a temporary directory. If you run spamd it's also possible to train via spamc. Personally I'd avoid the unforced use of mbox around Bayes without being sure that "From-escaping"
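
The symlink suggestion in this snippet might look like the sketch below. It stages an arbitrary selection of message files in a fresh temporary directory, which can then be passed to `sa-learn` as a single directory argument. The function name and the numbered link names are assumptions for illustration.

```python
import os
import tempfile


def stage_for_training(paths):
    """Symlink the given message files into a new private temp directory."""
    # mkdtemp creates the directory readable only by the current user.
    staging = tempfile.mkdtemp(prefix="sa-train-")
    for index, path in enumerate(paths):
        link_name = os.path.join(staging, f"{index:06d}.eml")
        os.symlink(os.path.abspath(path), link_name)
    return staging
```

The returned directory would be handed to something like `sa-learn --spam <dir>` and removed afterwards. Note that the user running `sa-learn` still needs read access to the link targets, which is the permissions point raised elsewhere in the thread.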

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Reindl Harald
ze=0 --progress --ham /sample-folder/ham/ while both folders contain single eml-files which don't need to have a leading 'From' sa-learn is able to display progress including estimated time to finish _ yours: for SAMPLE_FILE in "$SA_MILTER_HOME"

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Chalmers
er of > messages processed that a user can check? On Mon, 28 Dec 2015, Bill Cole wrote: > >> Date: Mon, 28 Dec 2015 23:42:03 >> From: Bill Cole >> Reply-To: users@spamassassin.apache.org >> To: users@spamassassin.apache.org >> Subject: Re: Is BAYES filtering working?

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Jude DaShiell
: users@spamassassin.apache.org To: users@spamassassin.apache.org Subject: Re: Is BAYES filtering working? Having doubts. On 28 Dec 2015, at 17:54, Peter L. Berghold wrote: The script that I use to pull the messages out of a spam bucket invoking sa-learn runs as root which has permissions to read

Re: Is BAYES filtering working? Having doubts.

2015-12-28 Thread Bill Cole
On 28 Dec 2015, at 17:54, Peter L. Berghold wrote: The script that I use to pull the messages out of a spam bucket invoking sa-learn runs as root which has permissions to read from anywhere. The complication is the amavis does not have permissions to read the Maildir files for trivial users

Re: Is BAYES filtering working? Having doubts.

2015-12-28 Thread Reindl Harald
at said, I have some thoughts as how to solve that well, you should never run such commands as root https://wiki.apache.org/spamassassin/SiteWideBayesSetup in the best case you configure both (training user and amavis) to use the same bayes database or you find a way to read the samples a

Re: Is BAYES filtering working? Having doubts.

2015-12-28 Thread Peter L. Berghold
I think you might be on to something here. When I run "sa-learn --dump magic" as root and as amavis they are definitely different. Here is the result as "root" again: # sa-learn --dump magic 0.000 0 3 0 non-token data: bayes db vers
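
The check described here, comparing `sa-learn --dump magic` output for two users, can be scripted. The sketch below parses output of the shape shown in the snippet (four numeric columns followed by `non-token data: <name>`) and compares the `nspam` and `nham` counters. The two sample dumps are hypothetical values, not the poster's actual output.

```python
def parse_magic(dump_text):
    """Map each 'non-token data' name to the integer in the third column."""
    values = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, name = line.split("non-token data:")
        values[name.strip()] = int(fields.split()[2])
    return values


# Hypothetical dumps for two users on the same host.
as_root = """0.000 0 3 0 non-token data: bayes db version
0.000 0 1200 0 non-token data: nspam
0.000 0 800 0 non-token data: nham"""
as_amavis = """0.000 0 3 0 non-token data: bayes db version
0.000 0 15 0 non-token data: nspam
0.000 0 40 0 non-token data: nham"""

root_db, amavis_db = parse_magic(as_root), parse_magic(as_amavis)
same_database = all(root_db[k] == amavis_db[k] for k in ("nspam", "nham"))
print(same_database)
```

Differing counters are a strong hint that training and scanning use two separate databases.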

Re: Is BAYES filtering working? Having doubts.

2015-12-28 Thread John Hardin
On Mon, 28 Dec 2015, Peter L. Berghold wrote: On Mon, Dec 28, 2015 at 11:38:17AM -0800, John Hardin wrote: * you haven't also been training ham. Bayes needs sufficient examples of both to be able to make a judgement. Oh yes, been training ham too. Good. * you're somehow m

Re: Is BAYES filtering working? Having doubts.

2015-12-28 Thread Reindl Harald
them to sa-learn and yet they still keep popping up every other fetch from my server. How do I figure out where the issue is or if the learning is even working? * what does your maillog say when you grep for BAYES * what do your headers say * did you train at least 100 spam *and* ham samples *
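The "grep for BAYES" check could look roughly like the sketch below. It assumes the mail log records rule names such as BAYES_00 or BAYES_99 somewhere on each scan line; the exact log format and path depend on the setup. The helper name bayes_hits is hypothetical, and it relies on "grep -o", which GNU and BSD grep both provide.

```shell
#!/bin/sh
# bayes_hits (hypothetical helper): reads log lines on stdin and prints how
# often each BAYES_nn rule name appears, one "RULE=count" pair per line.
bayes_hits() {
    grep -o 'BAYES_[0-9][0-9]*' |   # extract only the rule names
        sort |                      # group identical names together
        uniq -c |                   # count each group
        awk '{ print $2 "=" $1 }'   # reorder to RULE=count
}

# Example use (the log path is an assumption, adjust to the local system):
#   bayes_hits < /var/log/maillog
```

No output at all would suggest that Bayes is not being applied to scanned mail, for example because the database the scanner uses has too few trained messages.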

Re: Is BAYES filtering working? Having doubts.

2015-12-28 Thread John Hardin
them to sa-learn and yet they still keep popping up every other fetch from my server. How do I figure out where the issue is or if the learning is even working? This is a FAQ. Have you searched the mailing list archives? Common problems: * you're not training the Bayes database that SA/Amav

Re: Is BAYES filtering working? Having doubts.

2015-12-28 Thread Antony Stone
On Monday 28 December 2015 at 20:27:32, Peter L. Berghold wrote: > I've been noticing a lot of SPAM emails coming to my account > How do I figure out where the issue is or if the learning is even > working? Show us the headers of the delivered email/s? Antony. -- "Once you have a panic, thin

Is BAYES filtering working? Having doubts.

2015-12-28 Thread Peter L. Berghold
I've been noticing a lot of SPAM emails coming to my account with subject headers "Trump's Brain Secret" and similar, along with "Amazon Gift Card" and other something for nothing sorts of emails. I keep feeding them to sa-learn and yet they still keep popping up every other fetch from my serv

Re: bayes problem?

2015-12-17 Thread Reindl Harald
PAM due to BAYES_99 (99-100% SPAM), for example this mail I'm responding to now; I saved it as 'rh.mail' and ran it through: "sa-update" adjusts scores, brings rules, disables rules, but has no business changing the bayes behavior. When ham has a BAYES_99, you misclassified mails or

Re: bayes problem?

2015-12-17 Thread Matthias Apitz
SPAM), for example > > this mail I'm responding to now; I saved it as 'rh.mail' and ran it > > through: > > "sa-update" adjusts scores, brings rules, disables rules, but has no > business changing the bayes behavior. > > When ham has a BAYES_99, you mi

Re: bayes problem?

2015-12-17 Thread Reindl Harald
"sa-update" adjusts scores, brings rules, disables rules, but has no business changing the bayes behavior. When ham has a BAYES_99, you misclassified mails or did not keep a ham/spam balance in your training $ spamassassin -tD < rh.mail > rh.out 2> rh.debug The results are

bayes problem? (was: Re: feed spamassassin with a catch-all address)

2015-12-17 Thread Matthias Apitz
> rh.debug The results are here http://www.unixarea.de/SA/rh.mail http://www.unixarea.de/SA/rh.out http://www.unixarea.de/SA/rh.debug Can some kind soul help me please having a look what is now wrong with my bayes ? Thanks in advance matthias -- Matthias Apitz, ✉ g...@unixarea.de,

Re: Trying Bayes / Redis

2015-12-15 Thread Axb
On 12/15/2015 10:57 PM, Marc Perkel wrote: This Bayes Redis works GREAT. For years I've been trying to get bayes to work and now finally IT WORKS good news for once... just watch memory usage... :)

Re: Trying Bayes / Redis

2015-12-15 Thread Marc Perkel
This Bayes Redis works GREAT. For years I've been trying to get bayes to work and now finally IT WORKS -- Marc Perkel - Sales/Support supp...@junkemailfilter.com http://www.junkemailfilter.com Junk Email Filter dot com 415-992-3400

Re: Trying Bayes / Redis

2015-12-12 Thread Benny Pedersen
On December 13, 2015 1:16:17 AM Marc Perkel wrote: Because I'd have to upgrade 50 servers for consistency and if I do that I'll probably try something other than centos. okay, it just not how i would solve +1 server farms in gentoo, here i would emerge --buildpkgonly on master, and then eme
