Re: Unicode considered harmful again
Original Message
On Nov 4, 2021, 07:45, Damian <spamassas...@arcsin.de> wrote:
>> Please convert all source code to ASCII. If it fails to compile, then it may
>> have a trojan hiding in Unicode clothing.
> Instructions unclear.

CVE 2021-42574
Re: Unicode considered harmful again
>> Please convert all source code to ASCII. If it fails to compile, then it may have a trojan hiding in Unicode clothing.
> Instructions unclear.
CVE 2021-42574

It remains unclear (to me). What source code should spamassassin-users convert? Attached source code in emails? How should they convert, is there a SpamAssassin-Plugin? Should they install compilers on their mail system?
Re: Unicode considered harmful again
Original Message
On Nov 4, 2021, 09:34, Damian <spamassas...@arcsin.de> wrote:
> >> Please convert all source code to ASCII. If it fails to compile,
> then it may have a trojan hiding in Unicode clothing.
>
> >Instructions unclear.
>
> CVE 2021-42574
>
> It remains unclear (to me). What source code should spamassassin-users
> convert? Attached source code in emails? How should they convert, is there a
> SpamAssassin-Plugin? Should they install compilers on their mail system?

The CVE is a call to action for the developers. As for users: if SA can safely detect an attack, it should report it.
Re: Unicode considered harmful again
On 2021-11-04 at 08:45:02 UTC-0400 (Thu, 4 Nov 2021 08:45:02 -0400) Jared Hall is rumored to have said:

> [...]
> 2) Beware of using somebody else's source code :)

That's the really significant warning...

The relevance to SA is that it uses a config system with "rules" that can be auto-updated and which are de facto source code: somebody else's source code. :)

We do not currently publish non-ASCII rules in the default ruleset channel. I don't believe that KAM ever does so. At least one 3rd-party ruleset has done so in the past, generating errors and warnings from some versions of Perl.

Through 3.x, SA does not have conscious support for non-ASCII rules and while it is possible that SA could be vulnerable to something akin to CVE-2021-42574 and CVE-2021-42694 via malicious rules, it would be a noisy and rather difficult attack.

In v4.x, Unicode support will be better. That also means it may be easier to make this sort of attack quieter in the future, as non-ASCII rules won't be definitively wrong as they are now.

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
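For context on the first CVE named above: CVE-2021-42574 concerns Unicode bidirectional control characters, which can make text display in a different order than the one a parser reads. A minimal Python sketch of how a rule file's text could be checked for those characters is shown here. The function name `find_bidi_controls` is hypothetical, and this is an illustration only, not something SpamAssassin or sa-update is known to do.

```python
# Bidirectional formatting characters (embeddings, overrides, isolates).
# These are the code points abused in the "Trojan Source" technique.
BIDI_CONTROLS = {
    "\u202a": "LRE", "\u202b": "RLE", "\u202c": "PDF",
    "\u202d": "LRO", "\u202e": "RLO",
    "\u2066": "LRI", "\u2067": "RLI", "\u2068": "FSI", "\u2069": "PDI",
}


def find_bidi_controls(text):
    """Return (line_number, column, name) for each bidi control in text.

    Line numbers and columns are 1-based; lines are split on "\n" only.
    """
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, BIDI_CONTROLS[ch]))
    return hits
```

Called on the decoded contents of a `.cf` file, an empty result means none of these nine code points are present. It says nothing about homoglyphs (the subject of CVE-2021-42694) or other non-ASCII content.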
Re: Unicode considered harmful again
On 11/4/2021 10:44 AM, Bill Cole wrote:
> On 2021-11-04 at 08:45:02 UTC-0400 (Thu, 4 Nov 2021 08:45:02 -0400) Jared Hall is rumored to have said:
>> [...]
>> 2) Beware of using somebody else's source code :)
>
> That's the really significant warning...

Agreed. Does one need to write a paper and publish a couple of CVEs for that? I thought Mitre or whoever runs CVE nowadays would triage these types of reports through a "Captain Obvious" department to sort Wants from Needs.

> We do not currently publish non-ASCII rules in the default ruleset channel. I don't believe that KAM ever does so.

KAM certainly has. I do recall seeing at least an infinity symbol as well as the Euro symbol in his rulesets last I looked. NBD, works anyway. I crank out hex when dealing with Unicode, and I have tons of that. I have a nice Unicode converter that works on strings. One of these days I'll change it to parse entire files; Heinlein's stuff for instance.

> In v4.x, Unicode support will be better. That also means it may be easier to make this sort of attack quieter in the future, as non-ASCII rules won't be definitively wrong as they are now.

I have my own thoughts/reservations about distributing Unicode rulesets. Challenging days ahead, to be sure. It'd sure be nice to get sa-compile to run entirely clean though.

Thanks,

-- Jared Hall
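The "crank out hex" approach mentioned above can be sketched roughly as follows. This is not Jared's converter, which is not shown in the thread. It is a hypothetical helper, `to_hex_escapes`, that assumes the goal is to replace each non-ASCII character with `\xHH` escapes of its UTF-8 bytes, so the rule text itself stays pure ASCII.

```python
def to_hex_escapes(s):
    """Replace every non-ASCII character in s with \\xHH escapes.

    Assumption: the target text is matched as UTF-8 bytes, so each
    character becomes one escape per UTF-8 byte. ASCII passes through.
    """
    out = []
    for ch in s:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.extend("\\x{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)
```

For example, the Euro sign (U+20AC) becomes `\xE2\x82\xAC` and the infinity symbol (U+221E) becomes `\xE2\x88\x9E`. Whether byte-wise escapes are the right form depends on how the message text is decoded before matching, which this sketch does not address.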
Re: timeouts on processing some messages, started October 24
I have captured a bad message. It seems innocuous; it's from me at a host in my domain, to me, basically

  From: g...@foo.lexort.com
  To: g...@lexort.com

and has a body "foo", no DKIM headers, just Received, Subject, Message-Id.

Processing this with my normal config results in the timeout.

I noticed lockfiles for txrep, even though I couldn't figure out that txrep was involved from '-D all', and turned off txrep in my config ("use_txrep 0" instead of 1). Then, the message processes in 2s.

When I had txrep enabled, I saw a tx-reputation.lock with a single line that was a pid of the spamd child process that was accumulating CPU time. I also had files like:

  tx-reputation.lock.bar.lexort.com.5023

where that was another pid, and this second file seemed to be accumulating lines.

I did find a stray sa-learn from October and killed it.

Running my spam learning script, which just calls sa-learn with --spam or --ham (and -L always) is turning out slow, probably from the same thing.

So it sort of smells like one of

  - something is wrong with my txrep database
  - some code is hitting O(n^k) or something
  - there is some strange locking/spinning behavior
  - something else I don't understand, as always

Does anyone have pointers to a database export/import script for txrep?
Re: Unicode considered harmful again
On 2021-11-04 09:34, Damian wrote:
> >> Please convert all source code to ASCII. If it fails to compile, then it may have a trojan hiding in Unicode clothing.
>
> >Instructions unclear.
>
> CVE 2021-42574
>
> It remains unclear (to me). What source code should spamassassin-users convert? Attached source code in emails? How should they convert, is there a SpamAssassin-Plugin? Should they install compilers on their mail system?

https://bugs.gentoo.org/807781

Not all 3rd parties have clean rules, which leads to that problem:

==
$ perl -ne 'print "$. $_" if m/[\x80-\xFF]/' /var/lib/spamassassin/3.004006/updates_spamassassin_org/50_scores.cf
526 # Validity (née ReturnPath) Certified
==

I haven't tested whether it's solved in the default rules now, but the KAM and ita channels still have it.

We are all waiting for SpamAssassin 4.x.
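The same check as the Perl one-liner above can be sketched in Python for anyone who wants to call it from a script. The names `non_ascii_lines` and `scan_rule_file` are hypothetical. Like the one-liner, it works on raw bytes and reports any line containing a byte in the range 0x80 to 0xFF.

```python
import re

# Any byte outside 7-bit ASCII, same class as the Perl m/[\x80-\xFF]/.
NON_ASCII = re.compile(rb"[\x80-\xff]")


def non_ascii_lines(data):
    """Return (line_number, line_bytes) for each line with a non-ASCII byte."""
    return [
        (lineno, line)
        for lineno, line in enumerate(data.split(b"\n"), start=1)
        if NON_ASCII.search(line)
    ]


def scan_rule_file(path):
    """Read a rule file as bytes and report its non-ASCII lines."""
    with open(path, "rb") as fh:
        return non_ascii_lines(fh.read())
```

Looping `scan_rule_file` over the `.cf` files of each installed channel directory would show which channels still ship non-ASCII lines. As in the output above, a hit may be nothing more than an accented character in a comment.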
Re: Unicode considered harmful again
> In v4.x, Unicode support will be better. That also means it may be easier to make this sort of attack quieter in the future, as non-ASCII rules won't be definitively wrong as they are now.

The question is whether non-ASCII malicious rules could do anything more damaging than simply failing to match on the obvious strings "visible" in the rule, or alternatively deliberately matching on some string that should not be matched, in some form of DoS attempt. It's hard to see how someone could inject Perl (or any other) code with screwy rules.

There was a time when Perl code was allowed in rules; that was disallowed many years ago:

uri LW_PRINTIT /(^.*$)(?{ print "URI:\n$^N\nEnd URI\n\n" })/is

That was a real handy debugging rule once, but you can't get away with that anymore.

Loren
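The "failing to match on the visible strings" case can be illustrated with a lookalike character. The Python sketch below is only an illustration with a hypothetical helper, `describe_non_ascii`. It shows that a string containing a Cyrillic letter can look like an ASCII word without being equal to it, and how the odd character can be named.

```python
import unicodedata


def describe_non_ascii(s):
    """Return (index, code_point, Unicode name) for each non-ASCII character.

    Indexes are 0-based positions in the string.
    """
    return [
        (i, "U+{:04X}".format(ord(ch)), unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(s)
        if ord(ch) > 0x7F
    ]


# The second letter is CYRILLIC SMALL LETTER A, not the Latin "a".
lookalike = "p\u0430ypal"
print(lookalike == "paypal")          # False
print(describe_non_ascii(lookalike))
```

A rule pattern written with such a character would not match the plain ASCII word, which is the quiet failure described above. It does not by itself let anyone run code.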