Re: problem with spamassassin for Windows
Hi Bill, this is the result of the command you suggested:

feb 16 07:21:09.678 [21824] warn: Use of uninitialized value $_[1] in hash element at Mail/SpamAssassin/Conf/Parser.pm line 571, line 717.

On 16 February 2018 at 02:06:40, Bill Cole (sausers-20150...@billmail.scconsult.com) wrote:

On 15 Feb 2018, at 15:33, Gianluca Furnarotto wrote:
> Hi,
>
> I am trying to use Bayes with SpamAssassin, but it seems to have stopped learning, and when I run a command such as "sa-learn --dump magic", "sa-learn --sync", or any other sa-learn command, this error appears:
> "Use of uninitialized value $_[1] in hash element at Mail/SpamAssassin/Conf/Parser.pm line 571."
>
> Line 571 is this:
> " } "
> inside these lines:
> " elsif ($type == $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST) {
>     $cmd->{code} = \&set_addrlist_value;
>   }" <--- line 571

That absolutely IS NOT line 571 of Mail/SpamAssassin/Conf/Parser.pm in SA version 3.4.1. That's line 685.

The relevant lines in Mail/SpamAssassin/Conf/Parser.pm:

    568
    569 # functions supported in the "if" eval:
    570 sub cond_clause_plugin_loaded {
    571   return $_[0]->{conf}->{plugins_loaded}->{$_[1]};
    572 }
    573

My first guess on this is that your configuration has a typo. Try running 'spamassassin --lint' to check it.

The error message indicates that something is calling the subroutine 'cond_clause_plugin_loaded' in a way that gives it only one parameter where it is expecting 2, the first of which is an object reference.
Re: problem with spamassassin for Windows
On 15 Feb 2018, at 15:33, Gianluca Furnarotto wrote:
> Hi,
>
> I am trying to use Bayes with SpamAssassin, but it seems to have stopped learning, and when I run a command such as "sa-learn --dump magic", "sa-learn --sync", or any other sa-learn command, this error appears:
> "Use of uninitialized value $_[1] in hash element at Mail/SpamAssassin/Conf/Parser.pm line 571."
>
> Line 571 is this:
> " } "
> inside these lines:
> " elsif ($type == $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST) {
>     $cmd->{code} = \&set_addrlist_value;
>   }" <--- line 571

That absolutely IS NOT line 571 of Mail/SpamAssassin/Conf/Parser.pm in SA version 3.4.1. That's line 685.

The relevant lines in Mail/SpamAssassin/Conf/Parser.pm:

    568
    569 # functions supported in the "if" eval:
    570 sub cond_clause_plugin_loaded {
    571   return $_[0]->{conf}->{plugins_loaded}->{$_[1]};
    572 }
    573

My first guess on this is that your configuration has a typo. Try running 'spamassassin --lint' to check it.

The error message indicates that something is calling the subroutine 'cond_clause_plugin_loaded' in a way that gives it only one parameter where it is expecting 2, the first of which is an object reference.
Re: URIBL_BLOCKED
On 2018-02-15 (02:10 MST), Tobi wrote:
> and does your bind server use other forward servers?

Nope. It is its own thing. No forwarders. Dunno what the issue was, but it was transient AFAICT.

--
Forever was over. All the sands had fallen. The great race between entropy and energy had been run, and the favourite had been the winner after all. Perhaps he ought to sharpen the blade again? No. Not much point, really.
Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam
On Thu, 15 Feb 2018 14:32:36 -0600 (CST) sha...@shanew.net wrote:
> I haven't checked the math in the Bayes plugin, but it explicitly mentions using the "chi-square probability combiner", which is described at http://www.linuxjournal.com/print.php?sid=6467
>
> Maybe I'm misunderstanding what that article describes, but I'm pretty sure what it boils down to is that when the occurrence of a token is too small (he uses the phrase "rare words") it can lead to probabilities at the extremes (like a token that occurs only once and is in spam, so its probability is 1). The way to address these extremely low or extremely high probabilities is to use the Fisher calculation (which is described on the second page of the article).

Tokens with low counts are detuned a bit, but not as much as you might think. In a database with a 1:1 ratio you get hapax token probabilities of 0.016 and 0.987; IIRC Robinson anticipated something much closer to neutral. This is similar to the defaults in spambayes and bogofilter, and I think at least one of the three projects would have derived them from optimization. My guess is that enough tokens with low counts are very strong but short-lived indicators that it's worth putting up with the noise.
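The hapax figures above can be reproduced with Robinson's f(w) adjustment from the linked article. This is a minimal sketch, not the plugin's code: the values s = 0.030 and x = 0.538 are assumptions (I believe they match the SpamAssassin Bayes defaults, but check your own installation), and the function name is made up for illustration. In a 1:1 database a token seen once, only in spam, has raw probability 1; one seen once, only in ham, has raw probability 0.

```python
def robinson_fw(p, n, s=0.030, x=0.538):
    """Robinson's f(w): pull the raw token probability p toward the prior x.

    n is how many times the token was seen; s is the weight given to the
    prior. The defaults for s and x are assumed values, see the note above.
    """
    return (s * x + n * p) / (s + n)


hapax_spam = robinson_fw(p=1.0, n=1)  # seen once, only in spam
hapax_ham = robinson_fw(p=0.0, n=1)   # seen once, only in ham
print(f"{hapax_ham:.3f} {hapax_spam:.3f}")  # 0.016 0.987
```

For contrast, with s = 1 and x = 0.5 (hypothetical values, picked only to show a weaker setting) the same two tokens come out at 0.25 and 0.75, which is much closer to neutral.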
Re: problem with spamassassin for Windows
Thanks for your non-contribution again, Harald. On point as ever. Hope you feel better now that you have again broadcast your irrelevant thoughts. (Your opinion of Windows servers does not represent the world, nor does it help the poster with his problem.) (Did I say "on point"? Oops, it seems I can talk sh1t too.)

Helpful responses only please, team.

On 15 February 2018 22:22:01 GMT+00:00, Reindl Harald wrote:
> nobody seriously cares about windows when it comes to servers - it's that easy - period
>
> On 15.02.2018 at 23:17, Groach wrote:
>> I originally guided Gianluca to this list for help because, as a user, I know that the JAM port of SpamAssassin functions almost identically to the original software, which, as you know, operates mainly on these plug-ins. Everything you do in Linux you also do in the Windows version. I also know that, as a user, I personally don't have the problem he is reporting. This problem he has is with a plug-in. (Perl is usually not platform-specific.)
>>
>> Every time I had a problem in the past and reported it to JAM support for help, it was ALWAYS due to something (bugs) existing in base SpamAssassin and not platform-specific.
>>
>> Therefore I ask the readers to consider this report generally and ignore the platform it is run on.
>>
>> Can anyone offer more help, please?
>>
>> On 15 February 2018 21:37:05 GMT+00:00, "Kevin A. McGrail"
>>
Re: problem with spamassassin for Windows
On 2/15/2018 5:17 PM, Groach wrote:
> I originally guided Gianluca to this list for help because, as a user, I know that the JAM port of SpamAssassin functions almost identically to the original software, which, as you know, operates mainly on these plug-ins. Everything you do in Linux you also do in the Windows version. I also know that, as a user, I personally don't have the problem he is reporting. This problem he has is with a plug-in. (Perl is usually not platform-specific.)
>
> Every time I had a problem in the past and reported it to JAM support for help, it was ALWAYS due to something (bugs) existing in base SpamAssassin and not platform-specific.
>
> Therefore I ask the readers to consider this report generally and ignore the platform it is run on.
>
> Can anyone offer more help, please?

Can you provide any steps to replicate the issue? You might have to submit your Bayesian db to pastebin or similar.
Re: problem with spamassassin for Windows
I originally guided Gianluca to this list for help because, as a user, I know that the JAM port of SpamAssassin functions almost identically to the original software, which, as you know, operates mainly on these plug-ins. Everything you do in Linux you also do in the Windows version. I also know that, as a user, I personally don't have the problem he is reporting. This problem he has is with a plug-in. (Perl is usually not platform-specific.)

Every time I had a problem in the past and reported it to JAM support for help, it was ALWAYS due to something (bugs) existing in base SpamAssassin and not platform-specific.

Therefore I ask the readers to consider this report generally and ignore the platform it is run on.

Can anyone offer more help, please?

On 15 February 2018 21:37:05 GMT+00:00, "Kevin A. McGrail" wrote:
> On 2/15/2018 3:33 PM, Gianluca Furnarotto wrote:
>> I am trying to use Bayes with SpamAssassin, but it seems to have stopped learning, and when I run a command such as "sa-learn --dump magic", "sa-learn --sync", or any other sa-learn command, this error appears:
>> "Use of uninitialized value $_[1] in hash element at Mail/SpamAssassin/Conf/Parser.pm line 571."
>>
>> Line 571 is this:
>> " } "
>> inside these lines:
>> " elsif ($type == $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST) {
>>     $cmd->{code} = \&set_addrlist_value;
>>   }" <--- line 571
>>
>> I'm not a Perl programmer, so I need help to understand what is wrong.
>> Thanks.
>>
>> p.s.: this is the JAM Software SpamAssassin version for Windows
>
> You should likely ask JAM Software if they don't respond on list.
>
> Regards,
> KAM
Re: problem with spamassassin for Windows
On 2/15/2018 3:33 PM, Gianluca Furnarotto wrote:
> I am trying to use Bayes with SpamAssassin, but it seems to have stopped learning, and when I run a command such as "sa-learn --dump magic", "sa-learn --sync", or any other sa-learn command, this error appears:
> "Use of uninitialized value $_[1] in hash element at Mail/SpamAssassin/Conf/Parser.pm line 571."
>
> Line 571 is this:
> " } "
> inside these lines:
> " elsif ($type == $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST) {
>     $cmd->{code} = \&set_addrlist_value;
>   }" <--- line 571
>
> I'm not a Perl programmer, so I need help to understand what is wrong. Thanks.
>
> p.s.: this is the JAM Software SpamAssassin version for Windows

You should likely ask JAM Software if they don't respond on list.

Regards,
KAM
problem with spamassassin for Windows
Hi,

I am trying to use Bayes with SpamAssassin, but it seems to have stopped learning, and when I run a command such as "sa-learn --dump magic", "sa-learn --sync", or any other sa-learn command, this error appears:
"Use of uninitialized value $_[1] in hash element at Mail/SpamAssassin/Conf/Parser.pm line 571."

Line 571 is this:
" } "
inside these lines:
" elsif ($type == $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST) {
    $cmd->{code} = \&set_addrlist_value;
  }" <--- line 571

I'm not a Perl programmer, so I need help to understand what is wrong. Thanks.

p.s.: this is the JAM Software SpamAssassin version for Windows
Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam
On Thu, 15 Feb 2018, RW wrote:
> On Thu, 15 Feb 2018 11:56:55 -0600 (CST) sha...@shanew.net wrote:
>> So, the sample size doesn't matter when calculating the probability of a message being spam based on individual tokens, but it can matter when we bring them all together to make a final calculation.
>
> It's not a matter of how they combine; smaller counts just lead to less accurate token probabilities.
>
> I'm not saying that it doesn't matter how much you train, I'm saying that if you have enough spam and enough ham Bayes is insensitive to the ratio.

I agree that past a certain minimum threshold, the ratio doesn't matter much. But as I understand it, larger sample size makes a difference.

I haven't checked the math in the Bayes plugin, but it explicitly mentions using the "chi-square probability combiner", which is described at http://www.linuxjournal.com/print.php?sid=6467

Maybe I'm misunderstanding what that article describes, but I'm pretty sure what it boils down to is that when the occurrence of a token is too small (he uses the phrase "rare words") it can lead to probabilities at the extremes (like a token that occurs only once and is in spam, so its probability is 1). The way to address these extremely low or extremely high probabilities is to use the Fisher calculation (which is described on the second page of the article).

Maybe this is where I'm making a logical leap that I shouldn't, but I think that "non-rare words" increasingly outnumber "rare words" as the sample size of messages (and thus tokens) increases.

--
Public key #7BBC68D9 at            | Shane Williams
http://pgp.mit.edu/                | System Admin - UT CompSci
=----------------------------------+-------------------------------
All syllogisms contain three lines | sha...@shanew.net
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew
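For readers who have not opened the article, the chi-square (Fisher-style) combining it describes can be sketched roughly as below. This follows my reading of the article and of how spambayes-like filters name things; it is not taken from the SpamAssassin Bayes plugin, and the function names are made up for illustration.

```python
import math


def chi2q(x2, dof):
    """Probability that a chi-square variable with an even number of
    degrees of freedom (dof) exceeds x2."""
    assert dof % 2 == 0
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Combine per-token spam probabilities (each strictly between 0 and 1)
    into one score: near 1 is spammy, near 0 is hammy, 0.5 is undecided."""
    n = len(probs)
    spam = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    ham = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (1.0 + spam - ham) / 2.0


print(combine([0.99, 0.99, 0.99, 0.99, 0.99]))  # strongly spammy evidence
print(combine([0.9, 0.1]))                      # conflicting evidence
```

The point of combining this way is that a handful of tokens pulling in opposite directions lands near 0.5 rather than being decided by whichever token happens to be most extreme.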
Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam
On Thu, 15 Feb 2018 20:16:24 +0100 Reindl Harald wrote:
> On 15.02.2018 at 20:10, RW wrote:
>> I'm not saying that it doesn't matter how much you train, I'm saying that if you have enough spam and enough ham Bayes is insensitive to the ratio
>
> but not when the ratio differs by orders of magnitude, like the values from the OP
> not more, and not less

Based on the mathematics of "I reckon", and your database going off the rails after (by your own admission) you mistrained it.

Actually the ratio was only 4:1, which isn't all that big.
Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam
On Thu, 15 Feb 2018 11:56:55 -0600 (CST) sha...@shanew.net wrote:
> On Thu, 15 Feb 2018, RW wrote:
>> As I said, Bayes is based on frequencies.
>>
>> If a token occurs in 10% of ham and 0.5% of spam based on 10,000 hams and 10,000 spams, what do you think is likely to happen to those percentages with 10,000 hams and 1,000,000 spams?
>
> ...
> So, the sample size doesn't matter when calculating the probability of a message being spam based on individual tokens, but it can matter when we bring them all together to make a final calculation.

It's not a matter of how they combine; smaller counts just lead to less accurate token probabilities.

I'm not saying that it doesn't matter how much you train, I'm saying that if you have enough spam and enough ham Bayes is insensitive to the ratio.
Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam
On Thu, 15 Feb 2018 19:24:14 +0100 Reindl Harald wrote:
> On 15.02.2018 at 19:20, RW wrote:
>> On Thu, 15 Feb 2018 17:15:47 +0100
>> You are talking about ultra-rare tokens here; the chances of these dominating a classification are negligible
>
> it is not - in 2015 I had to purge, "in doubt", a few days of training because an unreasonable amount of ham was classified as BAYES_50, or even tagged, instead of BAYES_00, and we are talking about a Bayes database with around 100,000 samples in total, where with your logic you would not expect it to get biased within a few days - yes, those were training mistakes for sure - but when you are able to bias a Bayes database built from a few years of corpus within a few days, your examples are wrong

I have no idea what you are talking about, how it's relevant, or what you did wrong, but it doesn't trump mathematics.
From:name spoofing
We have covered this issue a few times recently on this list, but I don't think anything definitive was ever decided or recommended to detect and block this sort of spoofing:

https://pastebin.com/juXLD8vr

This appears to be a spoofed email from a compromised account posing as a known correspondent of this customer of mine. The Message-ID is suspicious since it's an inbound email to the hck12.net domain.

--
David Jones
Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam
On Thu, 15 Feb 2018 17:15:47 +0100 Reindl Harald wrote:
> On 15.02.2018 at 17:01, RW wrote:
>> On Thu, 15 Feb 2018 00:01:18 +0100 Reindl Harald wrote:
>>> On 14.02.2018 at 23:07, RW wrote:
>>>> My point is that an imbalance doesn't create a bias
>>>
>>> wrong - what you tried to say was "doesn't necessarily create a bias" - but in fact when the imbalance is too big *it does*
>>>
>>> simply thinking about how bayes works makes that clear: each word is a token with a ham/spam counter - when you have 1 million of one type and 1 of the other type, guess how that counter starts to get biased
>>
>> As I said, Bayes is based on frequencies.
>>
>> If a token occurs in 10% of ham and 0.5% of spam based on 10,000 hams and 10,000 spams, what do you think is likely to happen to those percentages with 10,000 hams and 1,000,000 spams?
>
> the 10% and 0.5% is just an unbacked assumption

It's not an assumption, it's an example.

> what if every word of the spam mail except a few relevant ones, and so nearly every token, exists in a relevant percentage of your 1.4 million ham samples, and so 90% of the tokens have a high ham counter

You are talking about ultra-rare tokens here; the chances of these dominating a classification are negligible.
Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam
On Thu, 15 Feb 2018, RW wrote:
> On Thu, 15 Feb 2018 00:01:18 +0100 Reindl Harald wrote:
>> On 14.02.2018 at 23:07, RW wrote:
>>> My point is that an imbalance doesn't create a bias
>>
>> wrong - what you tried to say was "doesn't necessarily create a bias" - but in fact when the imbalance is too big *it does*
>>
>> simply thinking about how bayes works makes that clear: each word is a token with a ham/spam counter - when you have 1 million of one type and 1 of the other type, guess how that counter starts to get biased
>
> As I said, Bayes is based on frequencies.
>
> If a token occurs in 10% of ham and 0.5% of spam based on 10,000 hams and 10,000 spams, what do you think is likely to happen to those percentages with 10,000 hams and 1,000,000 spams?

Perhaps it would help to state Bayes' formula explicitly. The probability that a message is spam given a specific token is equal to:

    (the probability of a token occurring in spam)
    times (the probability that a message is spam)
    divided by (the probability of that token occurring in all messages)

The important feature in this formula is that every value being operated on is a probability, so if a given token occurs in .5% of 10,000 spams, we would expect it to occur in .5% of 100,000 or 1,000,000. If that assumption is true, and the .5% probability doesn't change, the resulting calculated probability also doesn't change.

For actual spam detection, this is complicated by the fact that we end up with a whole stack of calculated probabilities for each token (including the probabilities that a message is non-spam given specific tokens), and we have to take all of them into account to calculate a final probability. In this process, it's not unusual that some individual calculated probabilities "matter" more than others, and one basis for how much weight a particular probability gets is how much we can trust that probability.

Here's where the 10,000 vs. 1,000,000 comes into play, because we can rely on the .5% probability out of 1,000,000 samples more than we can the .5% probability out of 10,000 samples, and both of those are better than a .5% probability out of 100 samples (that said, the difference in trust increases more between 100 samples and 10,000 samples than from 10,000 samples to 1,000,000 samples, due to diminishing returns).

So, the sample size doesn't matter when calculating the probability of a message being spam based on individual tokens, but it can matter when we bring them all together to make a final calculation.

--
Public key #7BBC68D9 at            | Shane Williams
http://pgp.mit.edu/                | System Admin - UT CompSci
=----------------------------------+-------------------------------
All syllogisms contain three lines | sha...@shanew.net
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew
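In symbols, the formula stated in words in this message reads as follows. The notation is mine, not from the thread: S stands for "the message is spam", H for "the message is ham", and T for "the message contains the token".

```latex
P(S \mid T) = \frac{P(T \mid S)\, P(S)}{P(T)},
\qquad
P(T) = P(T \mid S)\, P(S) + P(T \mid H)\, P(H)
```

The second identity is just the law of total probability; it shows that the denominator is built from the same per-class token frequencies, P(T|S) and P(T|H), that the thread is arguing about.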
Re: Train SA with e-mails 100% proven spams and next time it should be marked as spam
On Thu, 15 Feb 2018 00:01:18 +0100 Reindl Harald wrote:
> On 14.02.2018 at 23:07, RW wrote:
>> My point is that an imbalance doesn't create a bias
>
> wrong - what you tried to say was "doesn't necessarily create a bias" - but in fact when the imbalance is too big *it does*
>
> simply thinking about how bayes works makes that clear: each word is a token with a ham/spam counter - when you have 1 million of one type and 1 of the other type, guess how that counter starts to get biased

As I said, Bayes is based on frequencies.

If a token occurs in 10% of ham and 0.5% of spam based on 10,000 hams and 10,000 spams, what do you think is likely to happen to those percentages with 10,000 hams and 1,000,000 spams?
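To make the frequency argument concrete, here is a small sketch using the simple per-token estimate p = spam frequency / (spam frequency + ham frequency). That estimate is an assumption chosen for illustration, not taken from the Bayes plugin source, and the function name is made up; the counts are the ones from the example above.

```python
def token_spam_probability(spam_hits, n_spam, ham_hits, n_ham):
    """Per-token spam probability computed from frequencies (the fraction
    of each corpus containing the token), not from raw counts."""
    spam_freq = spam_hits / n_spam
    ham_freq = ham_hits / n_ham
    return spam_freq / (spam_freq + ham_freq)


# Token seen in 10% of ham and 0.5% of spam.
balanced = token_spam_probability(50, 10_000, 1_000, 10_000)
skewed = token_spam_probability(5_000, 1_000_000, 1_000, 10_000)
print(round(balanced, 4), round(skewed, 4))  # 0.0476 0.0476
```

As long as the token keeps turning up in 0.5% of spam, going from 10,000 to 1,000,000 spams multiplies the raw count by 100 but leaves the estimate where it was.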
Re: URIBL_BLOCKED
On Thu, 15 Feb 2018 16:06:40 +0100 Matus UHLAR - fantomas wrote:
>> Or if you like using your ISP's servers, most DNS server software lets you forward by default but make exceptions for specific domains.
>
> although possible, this does not make sense IMHO.

It makes a lot of sense, IMO. I'm not H like the rest of you.

> you would need to keep track of DNSBLs you need to access directly, while they can change with SA rules without your knowledge.

IMO, it makes no sense to run a mail server without having complete knowledge of which DNSBLs you use.

Regards,

Dianne.
Re: URIBL_BLOCKED
On Wed, 14 Feb 2018 14:05:54 -0800 (PST) John Hardin wrote:
> This detail always gets glossed over: set up a local NON-FORWARDING resolver.
> If you set up a local resolver and it just forwards requests to your ISP's DNS servers, you have not materially changed the problem.

On 15.02.18 09:57, Dianne Skoll wrote:
> Or if you like using your ISP's servers, most DNS server software lets you forward by default but make exceptions for specific domains.

Although possible, this does not make sense IMHO. You would need to keep track of the DNSBLs you need to access directly, while they can change with SA rules without your knowledge.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
(R)etry, (A)bort, (C)ancer
Re: URIBL_BLOCKED
On Wed, 14 Feb 2018 14:05:54 -0800 (PST) John Hardin wrote:
> This detail always gets glossed over: set up a local NON-FORWARDING resolver.
> If you set up a local resolver and it just forwards requests to your ISP's DNS servers, you have not materially changed the problem.

Or if you like using your ISP's servers, most DNS server software lets you forward by default but make exceptions for specific domains.

Regards,

Dianne.
Re: URIBL_BLOCKED
On 15 Feb 2018, at 4:10 (-0500), Tobi wrote:
> On 15.02.2018 at 02:35, @lbutlr wrote:
>> On 2018-02-14 (09:55 MST), Tobi wrote:
>>> On 14.02.2018 at 17:16, @lbutlr wrote:
>>>> I can't imagine why I'd be over limit, my mail server is tiny.
>>>
>>> it's not the mailserver that got blocked by limits, but the DNS resolver your mailserver uses!
>>
>> I use my own DNS on Bind 9.12, however the block error is not appearing today, so...
>
> and does your bind server use other forward servers? Or does it directly resolve the queries from the authoritative nameservers? It all depends on whether your resolver is in forward mode or not. If it's in forward mode, then it sounds like the IPs of those forwarders might have been limited.

On 15.02.18 09:49, Bill Cole wrote:
> Another possibility is DNS hijacking. Connection providers pitch it as a security measure, and I guess it can be for residential customers and small businesses that essentially use their connections in the same ways as home users, but it's lethal for mail systems. My provider (WOW Business) does it by default.

DNSSEC should prevent that too; however, you must obtain the root key some other way, and I have no information about DNSBLs signing their zones.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Christian Science Programming: "Let God Debug It!".
Re: URIBL_BLOCKED
On 15 Feb 2018, at 4:10 (-0500), Tobi wrote:
> On 15.02.2018 at 02:35, @lbutlr wrote:
>> On 2018-02-14 (09:55 MST), Tobi wrote:
>>> On 14.02.2018 at 17:16, @lbutlr wrote:
>>>> I can't imagine why I'd be over limit, my mail server is tiny.
>>>
>>> it's not the mailserver that got blocked by limits, but the DNS resolver your mailserver uses!
>>
>> I use my own DNS on Bind 9.12, however the block error is not appearing today, so...
>
> and does your bind server use other forward servers? Or does it directly resolve the queries from the authoritative nameservers? It all depends on whether your resolver is in forward mode or not. If it's in forward mode, then it sounds like the IPs of those forwarders might have been limited.

Another possibility is DNS hijacking. Connection providers pitch it as a security measure, and I guess it can be for residential customers and small businesses that essentially use their connections in the same ways as home users, but it's lethal for mail systems. My provider (WOW Business) does it by default.

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole
Re: Adding IPs to the check list
Pedro David Marco wrote on 2018-02-15 08:06:
> Is there any "relatively easy" way to add a new IP found in a non-standard header to the IP checks (e.g. DNSRBL)? Is a plugin the only way?

The relatively easy option is not to use it: an IP address is most useful at the MTA stage, not at the content-filter stage. That is just my own point of view.

If you want it anyway, see the config files in the default rules for adding another Received: header alias to take the IP from.

Some say more data is always better for content scanners; for IPs there is so much content that I don't like to argue with that.
Re: URIBL_BLOCKED
On 15.02.2018 at 02:35, @lbutlr wrote:
> On 2018-02-14 (09:55 MST), Tobi wrote:
>> On 14.02.2018 at 17:16, @lbutlr wrote:
>>> I can't imagine why I'd be over limit, my mail server is tiny.
>>
>> it's not the mailserver that got blocked by limits, but the DNS resolver your mailserver uses!
>
> I use my own DNS on Bind 9.12, however the block error is not appearing today, so...

and does your bind server use other forward servers? Or does it directly resolve the queries from the authoritative nameservers? It all depends on whether your resolver is in forward mode or not. If it's in forward mode, then it sounds like the IPs of those forwarders might have been limited.