Words with embedded symbols
I'm getting a lot of spam with words written like this. These are pretty horrible, and I don't like getting them every day.

A:N ;A %L" P:O ~R %N ( P &lCT U #R&E /

Is there a way to write a rule for strings of characters that would ignore non-alpha characters embedded in the string?
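One way to think about the question is as a regular expression that allows a few non-letters between each letter of a target word. The following is only a sketch in Python, not a SpamAssassin rule: the function names, the word "PICTURE", and the `max_gap` limit are all assumptions made for illustration, and look-alike substitutions (such as a lowercase "l" standing in for "I") are not handled.

```python
import re

def obfuscation_pattern(word, max_gap=3):
    """Build a regex matching `word` with up to `max_gap` non-letters
    allowed between consecutive letters (hypothetical helper)."""
    gap = r"[^A-Za-z]{0,%d}" % max_gap
    return re.compile(gap.join(re.escape(ch) for ch in word), re.IGNORECASE)

def is_obfuscated(word, text, max_gap=3):
    """Return True only if `text` contains `word` with at least one
    embedded non-letter, so the plain word on its own does not count."""
    for m in obfuscation_pattern(word, max_gap).finditer(text):
        if any(not ch.isalpha() for ch in m.group()):
            return True
    return False

if __name__ == "__main__":
    print(is_obfuscated("PICTURE", "P &ICT U #R&E"))
    print(is_obfuscated("PICTURE", "a nice picture here"))
```

The pattern itself is ordinary regular-expression syntax, so the same idea may carry over to a rule's regex, but that has not been tested against SpamAssassin here.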
Re: SA rules & matching of private addresses
Hi Mabry,

At 03:46 04-10-2012, Mabry Tyson wrote:
> The debug output shows that SA is (IMO, mis-)interpreting the "x-originating-ip" as a Received header.

The IP address from the X-Originating-IP header field, like those in the Received header fields, is used for DNSBL lookups.

Regards,
-sm
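For readers unfamiliar with DNSBL lookups, the usual IPv4 convention is to reverse the octets of the address and append the list's zone, then look that name up in DNS. The sketch below only builds the query name and performs no network access; the function name and the zone "dnsbl.example" are placeholders, not anything SpamAssassin itself defines.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Return the DNS name to query for an IPv4 address in a DNSBL zone.

    Hypothetical helper: reverses the octets and appends the zone,
    following the common DNSBL convention for IPv4.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

if __name__ == "__main__":
    # 192.0.2.1 is a documentation address; the zone is a placeholder.
    print(dnsbl_query_name("192.0.2.1", "dnsbl.example"))
```

Whichever header an address is taken from, the lookup name is built the same way, which is why an address in X-Originating-IP can trigger the same list hits as one in a Received header.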
Re: Try to run sa-learn
OK, here is my debug output:

Oct 4 12:19:15.857 [8148] dbg: logger: adding facilities: all
Oct 4 12:19:15.857 [8148] dbg: logger: logging level is DBG
Oct 4 12:19:15.858 [8148] dbg: generic: SpamAssassin version 3.3.2
Oct 4 12:19:15.858 [8148] dbg: generic: Perl 5.010001, PREFIX=/usr, DEF_RULES_DIR=/usr/share/spamassassin, LOCAL_RULES_DIR=/etc/mail/spamassassin, LOCAL_STATE_DIR=/var/lib/spamassassin
Oct 4 12:19:15.858 [8148] dbg: config: timing enabled
Oct 4 12:19:15.859 [8148] dbg: config: score set 0 chosen.
Oct 4 12:19:15.861 [8148] dbg: util: running in taint mode? yes
Oct 4 12:19:15.861 [8148] dbg: util: taint mode: deleting unsafe environment variables, resetting PATH
Oct 4 12:19:15.861 [8148] dbg: util: PATH included '/usr/local/sbin', keeping
Oct 4 12:19:15.861 [8148] dbg: util: PATH included '/usr/local/bin', keeping
Oct 4 12:19:15.861 [8148] dbg: util: PATH included '/sbin', keeping
Oct 4 12:19:15.861 [8148] dbg: util: PATH included '/bin', keeping
Oct 4 12:19:15.861 [8148] dbg: util: PATH included '/usr/sbin', keeping
Oct 4 12:19:15.861 [8148] dbg: util: PATH included '/usr/bin', keeping
Oct 4 12:19:15.862 [8148] dbg: util: PATH included '/root/bin', which is unusable, dropping: No such file or directory
Oct 4 12:19:15.862 [8148] dbg: util: final PATH set to: /usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
Oct 4 12:19:15.873 [8148] dbg: dns: is Net::DNS::Resolver available? yes
Oct 4 12:19:15.874 [8148] dbg: dns: Net::DNS version: 0.68
Oct 4 12:19:15.874 [8148] dbg: config: using "/etc/mail/spamassassin" for site rules pre files
Oct 4 12:19:15.874 [8148] dbg: config: read file /etc/mail/spamassassin/init.pre
Oct 4 12:19:15.875 [8148] dbg: config: read file /etc/mail/spamassassin/v310.pre
Oct 4 12:19:15.875 [8148] dbg: config: read file /etc/mail/spamassassin/v312.pre
Oct 4 12:19:15.875 [8148] dbg: config: read file /etc/mail/spamassassin/v320.pre
Oct 4 12:19:15.875 [8148] dbg: config: read file /etc/mail/spamassassin/v330.pre
Oct 4 12:19:15.875 [8148] dbg: config: using "/var/lib/spamassassin/3.003002" for sys rules pre files
Oct 4 12:19:15.875 [8148] dbg: config: using "/var/lib/spamassassin/3.003002" for default rules dir
Oct 4 12:19:15.875 [8148] dbg: config: read file /var/lib/spamassassin/3.003002/updates_spamassassin_org.cf
Oct 4 12:19:15.875 [8148] dbg: config: using "/etc/mail/spamassassin" for site rules dir
Oct 4 12:19:15.877 [8148] dbg: config: read file /etc/mail/spamassassin/00_FVGT_File001.cf
Oct 4 12:19:15.878 [8148] dbg: config: read file /etc/mail/spamassassin/75_ckrules.cf
Oct 4 12:19:15.878 [8148] dbg: config: read file /etc/mail/spamassassin/antidrug.cf
Oct 4 12:19:15.878 [8148] dbg: config: read file /etc/mail/spamassassin/local.cf
Oct 4 12:19:15.879 [8148] dbg: config: using "/root/.spamassassin/user_prefs" for user prefs file
Oct 4 12:19:15.880 [8148] dbg: config: read file /root/.spamassassin/user_prefs
Oct 4 12:19:15.888 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::RelayCountry from @INC
Oct 4 12:19:15.889 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
Oct 4 12:19:15.894 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
Oct 4 12:19:15.905 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
Oct 4 12:19:15.909 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::DCC from @INC
Oct 4 12:19:15.916 [8148] dbg: dcc: network tests on, registering DCC
Oct 4 12:19:15.916 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::Pyzor from @INC
Oct 4 12:19:15.919 [8148] dbg: pyzor: network tests on, attempting Pyzor
Oct 4 12:19:15.919 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::Razor2 from @INC
Oct 4 12:19:15.975 [8148] dbg: razor2: razor2 is available, version 2.84
Oct 4 12:19:15.976 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::SpamCop from @INC
Oct 4 12:19:15.989 [8148] dbg: reporter: network tests on, attempting SpamCop
Oct 4 12:19:15.989 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::AutoLearnThreshold from @INC
Oct 4 12:19:15.991 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::WhiteListSubject from @INC
Oct 4 12:19:15.992 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::MIMEHeader from @INC
Oct 4 12:19:15.994 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::ReplaceTags from @INC
Oct 4 12:19:15.995 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::DKIM from @INC
Oct 4 12:19:16.001 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::Check from @INC
Oct 4 12:19:16.009 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::HTTPSMismatch from @INC
Oct 4 12:19:16.010 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::URIDetail from @INC
Oct 4 12:19:16.012 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::BodyEval from @INC
Oct 4 12:19:16.014 [8148] dbg: plugin: loading Mail::SpamAssassin::Plugin::DNSEval from @INC
Oct 4 12:19:16.017 [8148] dbg: plugin: loading Mail::SpamAs
Re: Try to run sa-learn
On Thu, 4 Oct 2012, troxlinux wrote:
> Hi list , I try to run sa-learn on centos 6.3 but no work
>
> sa-learn --spam --showdots /dir/dir/domain.com.ni/spam/.spam/cur/
>
> Learned tokens from 0 message(s) (1 message(s) examined)
> ERROR: the Bayes learn function returned an error, please re-run with
> -D for more information at /usr/bin/sa-learn line 493.
>
> any idea ? , is a bug? , selinux is disabled

Please run it with the -D option as suggested in the output. Without debugging output we can only guess.

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
An entitlement beneficiary is a person or special interest group who didn't earn your money, but demands the right to take your money because they *want* it.
-- John McKay, _The Welfare State: No Mercy for the Middle Class_
---
Today: the 8th anniversary of SpaceshipOne winning the X-prize
Re: Try to run sa-learn
On 10/4/2012 2:06 PM, troxlinux wrote:
> Hi list , I try to run sa-learn on centos 6.3 but no work
>
> sa-learn --spam --showdots /dir/dir/domain.com.ni/spam/.spam/cur/
>
> Learned tokens from 0 message(s) (1 message(s) examined)
> ERROR: the Bayes learn function returned an error, please re-run with
> -D for more information at /usr/bin/sa-learn line 493.
>
> any idea ? , is a bug? , selinux is disabled

Well, did you do what the error message suggested (run 'sa-learn' with the -D switch)? What's the relevant output?

> my version of spamassassin
> spamassassin-3.3.2-4.el6.rfx.x86_64
>
> regards
Re: Try to run sa-learn
On 10/04, troxlinux wrote:
> Hi list , I try to run sa-learn on centos 6.3 but no work
>
> sa-learn --spam --showdots /dir/dir/domain.com.ni/spam/.spam/cur/

Try:

sa-learn --spam --showdots /dir/dir/domain.com.ni/spam/.spam/

("cur/" is inside the mailbox, not part of the path to the mailbox)

--
"Blessed are the cracked, for they shall let in the light."
http://www.ChaosReigns.com
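The path advice above can be illustrated with a small helper, assuming the mailbox is a Maildir with the standard "cur", "new" and "tmp" subdirectories. The function name is hypothetical and the code only manipulates the path string; it does not touch the filesystem or call sa-learn.

```python
from pathlib import PurePosixPath

# Standard Maildir subdirectories; anything else is left untouched.
MAILDIR_SUBDIRS = {"cur", "new", "tmp"}

def maildir_root(path):
    """Return the mailbox directory for a path that may point inside it.

    Hypothetical helper: strips a trailing cur/, new/ or tmp/ component.
    """
    p = PurePosixPath(path)
    return p.parent if p.name in MAILDIR_SUBDIRS else p

if __name__ == "__main__":
    print(maildir_root("/dir/dir/domain.com.ni/spam/.spam/cur/"))
```

A path that already names the mailbox itself is returned unchanged.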
Re: SA rules & matching of private addresses
On 10/2/12 8:30 PM, dar...@chaosreigns.com wrote:
> Run the email through "spamassassin -D received-header". That'll tell you how and if the headers got parsed. SA has certainly had bugs where it failed to parse received headers before, and IPv6 hasn't had a whole lot of use.

Thanks for pointing that out. That does illuminate the situation. The debug output shows that SA is (IMO, mis-)interpreting the "x-originating-ip" as a Received header. That's a surprise to me because it is an uninterpreted optional (header) field (RFC5322, 3.6.8) which could have any garbage in it and convey any semantics.

> There has also been a fair amount of work on IPv6 since the last release, so it's possible there was a bug, it got fixed, and you don't have the fix yet.

It turns out it isn't an IPv6 problem. I had wrongly assumed that if a host with a private address delivers to a trusted, local host, then that host would be considered trusted & local.

I removed the "x-originating-ip" header. Then, if I replace the IPv6 link local addresses by IPv4 10/8 addresses, I get the same behavior as with the IPv6 fe80::/10 addresses. If the private addresses are in the trusted & local addresses, then I get ALL_TRUSTED and not RDNS_NONE. If they are not in trusted & local, then I get RDNS_NONE and not ALL_TRUSTED. So I have to add things like 10/8 and fe80::/10 to the trusted & local networks. (When I add back in "x-originating-ip", I lose ALL_TRUSTED, which would be expected if you treat it like a Received: header.)

> On 10/02, Mabry Tyson wrote:
>> One user complained about a false positive. When I examined the mail, there appeared to be at least two rules that didn't work as I thought they should because of a Received line in which IPv6 Link Local addresses were used. It appears that a patch was previously put in that was thought to fix these kinds of things.
>>
>> The sender was apparently using AA.BB.CC.DD (a Comcast address, presumably his home address). He logged into the mail system of SRI.COM (independent of our mail system) and sent his mail from within it (which is why CCC.SRI.COM is the oldest Received line).
>
> That should result in a received header clearly indicating that the connection from comcast was authenticated, and SA should notice that and use it to skip the tests on that comcast IP. It mostly sounds like this is what's missing. SRI.com not indicating the authentication in their received header in the standard way.

You say "standard way". Can you point to a standard (e.g., an RFC) that indicates how this is indicated? Maybe it is my lack of knowledge, but I'm not aware of any "standard" way. That mail system is run by an independent business unit. I have no influence on their operation of their Exchange server. But, if they're not following a standard, I can suggest they follow a standard. But they probably won't listen to a "Well, SpamAssassin would recognize things better if you did ,"

RFC5322 (Message Format) doesn't mention "authenticate". RFC5321 (SMTP) does mention authentication (p 73) but gives no indication this should be reflected in any header nor any mechanism to do so. It mentions RFC4409 (Message Submission) as an example of a protocol that causes a message to be entered onto the Internet. However, RFC4409 gives no indication of how a conforming service should (or even might) indicate authentication. Nor does it indicate the format of a Received header that it should add such that it indicates that this protocol was used.

After reviewing these, I believe that the first host to receive this message fails to add a Received header as required (which presumably would be where some indication of authentication would be found). The oldest Received header indicates a transmission from that first host to another host. (Ref: RFC5321, 3.7.2 "When forwarding a message into or out of the Internet environment, a gateway MUST prepend a Received: line ...")

>> 4. http://spamassassin.apache.org/tests_3_3_x.html has
>> RCVD_IN_PBL = 3.6 (Spamhaus Policy black list)
>> RCVD_IN_SBL = 2.6 (Spamhaus Spam black list)
>> RCVD_IN_XBL = 0.7 (Spamhaus Botnet black list)
>> which seems backward to me. The 3.2 tests scoring seems more reasonable.
>
> Do not attempt to comprehend the depths of the mind of the re-scorer :P
>
> No seriously, it has no concept of "this rule means the email is more bad than another rule, therefore it should have a higher score". Only "This score results in a better approximation of the 1 false positive in 2,500 non-spams goal". Which often results in unexpected things. It comes up a lot. I very recently found a case where a rule that hit more non-spam than spam got a score of something like 3. Which may have been suboptimal.

There may be many ways of assigning scores to the rules to get nearly equivalent results (of correct assignments, false positives, & misses) on any particular test set
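Returning to the trusted-network observation earlier in this message, the behaviour described there (private addresses count as trusted only when their ranges are listed explicitly) can be mimicked with a short sketch. This is not SpamAssassin's implementation; the function name and the example addresses are assumptions for illustration. Note that Python's `ipaddress` module needs the long form "10.0.0.0/8" rather than "10/8".

```python
import ipaddress

def all_trusted(relay_ips, trusted_networks):
    """Return True if every relay address falls inside a listed network.

    Hypothetical helper: private ranges get no special treatment, so
    10.0.0.0/8 or fe80::/10 must be listed to be treated as trusted.
    """
    nets = [ipaddress.ip_network(n) for n in trusted_networks]
    return all(
        any(ipaddress.ip_address(ip) in net for net in nets)
        for ip in relay_ips
    )

if __name__ == "__main__":
    relays = ["192.0.2.10", "10.1.2.3", "fe80::1"]
    listed = ["192.0.2.0/24"]
    print(all_trusted(relays, listed))
    print(all_trusted(relays, listed + ["10.0.0.0/8", "fe80::/10"]))
```

With only the public range listed, the check fails because of the two private relays; adding the private ranges makes it pass, which matches the ALL_TRUSTED behaviour reported above.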