Re: Every spam is sacred: tagging mails because of their content or their supposed origin?
On Mon, 2003-06-16 at 19:33, Joey Hess wrote: Today I noticed those summaries were getting spamassassin scores in the 30 range. I ended up whitelisting myself, though that doesn't feel like a good idea -- now SA might mislearn spam subjects as ham, and any spammer who forges mail from me will probably get through. Aside from bypassing SA entirely for local mail, is there any better approach? You could add a special keyword into the summaries, and have its spamassassin score be -1000 or whatever.
Re: Every spam is sacred: tagging mails because of their content or their supposed origin?
On Mon, Jun 16, 2003 at 07:33:08PM -0400, Joey Hess wrote: Today I noticed those summaries were getting spamassassin scores in the 30 range. I ended up whitelisting myself, though that doesn't feel like a good idea -- now SA might mislearn spam subjects as ham, and any spammer who forges mail from me will probably get through. Aside from bypassing SA entirely for local mail, is there any better approach?

Are you using procmail? Set a rule that if the mail is sent by you, with a header stating that the mail was originated locally, do not use spamassassin.

Are you using postfix? Set postfix so that mail delivered locally uses an entry like:

    127.0.0.1:smtp ...
    external_ip:smtp ... -o content_filter=filter:
    filter unix - n n - - pipe flags=Rq user=pffilter argv=/home/pffilter/filter.sh -f ${sender} -- ${recipient}

Create a user pffilter and put in filter.sh:

    /bin/cat | /usr/bin/spamc -f | /usr/sbin/sendmail -i "$@"

Set your spamassassin to run spamd, which is always a good idea. That will separate incoming mail and outgoing (local) mail to be checked by SA. HTH

-- Jesus Climent | Unix SysAdm | Helsinki, Finland | pumuki.hispalinux.es GPG: 1024D/86946D69 BB64 2339 1CAA 7064 E429 7E18 66FC 1D7F 8694 6D69 -- Registered Linux user #66350 proudly using Debian Sid Linux 2.4.20 So much to do, so little time... --Joker (Batman)
Re: Every spam is sacred
On Mon, 16 Jun 2003 12:43:48 +1000, Russell Coker [EMAIL PROTECTED] said: On Mon, 16 Jun 2003 12:11, Theodore Ts'o wrote: false positive rate of as high as 2 per day by some estimates, do we as a body consider it acceptable if some percentage of Debian developers: 1) Don't receive a mail message from a fellow Debian developer because they unfortunately got caught by a false-positive (perhaps they got renumbered onto a bad SPAM address, or they were roaming on a wireless from a conference or during business travel) and important mail that related to Debian business gets lost? There is no excuse for this. Access to servers that are not in spam lists is well available to Debian developers. I tunnel my outgoing mail through a server in Melbourne no matter where I am, this avoids all issues of spam blocking by IP address. I offered accounts on a choice of machines to be used for such purposes for any Debian developers who have no better options, but so far no-one has taken me up on this offer. I refuse to allow spammers this victory. My machines are fully capable members of the internet, and I deliver my own email. My philosophy is that if people drop mail from me due to incompetence (since setting up machines to classify email from me as spam is indeed a misconfiguration), then I have no real desire to impose my musings on them. manoj -- I always had a repulsive need to be something more than human. David Bowie Manoj Srivastava [EMAIL PROTECTED] http://www.debian.org/%7Esrivasta/ 1024R/C7261095 print CB D9 F4 12 68 07 E4 05 CC 2D 27 12 1D F5 E8 6E 1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: Every spam is sacred
On Mon, 16 Jun 2003 15:06, Manoj Srivastava wrote: There is no excuse for this. Access to servers that are not in spam lists is well available to Debian developers. I tunnel my outgoing mail through a server in Melbourne no matter where I am, this avoids all issues of spam blocking by IP address. I offered accounts on a choice of machines to be used for such purposes for any Debian developers who have no better options, but so far no-one has taken me up on this offer. I refuse to allow spammers this victory. My machines are fully capable members of the internet, and I deliver my own email. My philosophy is that if people drop mail from me due to incompetence (since setting up machines to classify email from me as spam is indeed a misconfiguration), then I have no real desire to impose my musings on them. If your machines are fully capable then they will have permanent IP addresses and no spam ever coming from them. In which case they will never get listed in a DNSBL (*) and there is nothing to be concerned about. (*) SPEWS level 2 and other excessively aggressive DNSBLs may list you, but don't worry about that. No-one has suggested that we use such DNSBLs. We are talking about the simplest and most conservative ones that just list open relays etc. -- http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark http://www.coker.com.au/postal/ Postal SMTP/POP benchmark http://www.coker.com.au/~russell/ My home page
Re: Every spam is sacred
In article [EMAIL PROTECTED] [EMAIL PROTECTED] writes: On Sun, Jun 15, 2003 at 11:45:17AM -0500, Steve Langasek wrote: If some number of Debian developers utilizing blocking that has a false positive rate of as high as 2 per day by some estimates, do we as a body consider it acceptable if some percentage of Debian developers: Alternatively, if Debian doesn't implement spam blocking, do we consider it acceptable that: Some developers stop reading any email, since the vast majority of it is spam. Developers delete messages unread because of spammy sounding subjects. Developers spend so much time reading spam they don't have time to fix bugs and do other useful work. People advocating not filtering spam based on some false positives seem to forget that being buried under the load of spam can cause more false positives by the human forced to do the filtering. On my personal mailbox, I use some rather aggressive lists that I wouldn't recommend to Debian at this time. (relays.osirusoft.com, which includes SPEWS and SBL, and block.blars.org that I run myself and don't recommend to others.) It still gets ten times as much spam as non-spam. Without the spam filters, I'd probably wind up not reading email at all. -- Blars Blarson [EMAIL PROTECTED] http://www.blars.org/blars.html Text is a way we cheat time. -- Patrick Nielsen Hayden
Re: Every spam is sacred: tagging mails because of their content or their supposed origin?
On Sun, Jun 15, 2003 at 11:19:10PM -0400, Duncan Findlay wrote: FWIW, the next version of spamassassin (2.60) will have no forgeable negatively scoring rules. (ETA early-mid July) Just out of curiosity, how will this be accomplished? -- 2. That which causes joy or happiness.
Re: Every spam is sacred
On Sun, Jun 15, 2003 at 10:11:22PM -0400, Theodore Ts'o wrote: Given that it's been pointed out that the MTA supports per-user bouncing of mail from open relays, and that it's very possible to use LDAP to provide easy management of per-user preferences, why is there any need to continue discussing what individual developers do or don't consider acceptable for collateral damage? I don't think it's as simple as that. When I worked for VA Linux systems, we consciously decided not to use any spam-blocking systems, and live with the spam, because the chance that we might lose one e-mail from a customer due to a false-positive was considered unacceptable. If some number of Debian developers utilizing blocking that has a false positive rate of as high as 2 per day by some estimates, do we as a body consider it acceptable if some percentage of Debian developers: Frankly, we as a body cannot possibly have a real say in this particular issue. Every developer can filter their mail for themselves -- and a lot of them already do (probably a vast majority). Even people who don't filter mail in a technical manner can perfectly well ignore mails for whatever reason. It's practically impossible to not leave this in the discretion of individual developers. Besides, mails get lost all the time, for a variety of reasons most of which are out of our control. The impact is never any more than negligible, otherwise there'd be much more fuss raised about it. but the debian mail system is run by the entire Debian project, and so it is appropriate that the decision be one which is taken by the entire project about whether or not supporting a service which has such a high potential false positive rate is something the Debian project as a whole should support. Now that's just silly. People are already allowed to use a myriad of filtering methods on Debian systems, none of which are intrinsically worse or better than those that aren't available systemwide (dnsbl, spamassassin, razor, ...). 
And if the method isn't available systemwide, they can run it from their home directory. Having stuff run in a more technically sane way (like by not having everyone maintain a duplicate copy of whatever software necessary in their $HOME, or having hundreds or thousands of DNS lookups get made per mail instead of just a few) is certainly something that any sysadmin would prefer. I'm getting tired of these red herrings. I guess it serves Santiago right -- he posted the issue on -devel in a flamebait-ish way and all he (and we) got was this useless flamewar. -- 2. That which causes joy or happiness.
Re: Every spam is sacred
On Thu, Jun 12, 2003 at 11:20:12PM +0200, Marcelo E. Magallon wrote: On Thu, Jun 12, 2003 at 02:18:57AM +0200, Santiago Vila wrote: How can they say no to using some of them in /warn mode ... ? Santiago holds that more than half of the spam could be eventually avoided. I'd very much like to see hard evidence against or in favor of that assertion. Given the amount of spam that I get delivered to my account via Debian machines, I guess the reduction in bandwidth usage by master and murphy is not to be taken lightly. The bandwidth reduction will only happen if you decide to discard the mail, since the mail will always be accepted, scanned to find the IP which originated the message, the IP will be checked against the database and then the mail will be tagged. The reduction happens in the output, but the load might increase in the server. mooch -- Jesus Climent | Unix SysAdm | Helsinki, Finland | pumuki.hispalinux.es GPG: 1024D/86946D69 BB64 2339 1CAA 7064 E429 7E18 66FC 1D7F 8694 6D69 -- Registered Linux user #66350 proudly using Debian Sid Linux 2.4.20 I say you are Lord, and I should know. I've followed a few. --Arthur (Life of Brian)
Re: Proposal for using SpamAssassin in master.d.o [Was: Re: Every spam is sacred]
On Sun, Jun 15, 2003 at 03:39:00PM +0200, Jesus Climent wrote: Hi. [...] I might think I spoke BS on my proposal, since I have not heard any comments... mooch -- Jesus Climent | Unix SysAdm | Helsinki, Finland | pumuki.hispalinux.es GPG: 1024D/86946D69 BB64 2339 1CAA 7064 E429 7E18 66FC 1D7F 8694 6D69 -- Registered Linux user #66350 proudly using Debian Sid Linux 2.4.20 I never drink ... wine. --Dracula (Dracula)
Re: Every spam is sacred
On Sun, Jun 15, 2003 at 02:17:23PM +0200, Santiago Vila wrote: Read a previous message by Duncan Findlay. He said that 39.2668% of all the spam might be blocked by using the DSBL, but doing that you would block 0.0185% of ham.

I just ran a quick test on my current email folders. At the moment I have very little email stored in my Debian folders (198 messages actually). I extracted IP addresses of machines connecting to master or murphy like this:

    $ cd path/to/debian/mail
    $ find . -type f | while read f ; do
          formail -c -x Received < "$f"
      done |
      egrep 'by (murphy|master).debian.org' |
      perl -lne '/\[([0-9.]+)\]/ && print join(".", reverse(split /\./, $1))' |
      sort -n -u |
      grep -v '1\.0\.0\.127'

That outputs 103 IP addresses. Adding

    perl -pe 's/$/.list.dsbl.org/' | while read s ; do host $s ; done

to that command I get a match for 175.90.65.4.list.dsbl.org

Searching for the matching message I get:

    Subject: Someone for you.
    Message-Id: [EMAIL PROTECTED]
    X-Spam-Status: No, hits=3.5 required=5.0 tests=HTML_30_40,HTML_MESSAGE,MIME_HTML_ONLY,REMOVE_PAGE,X_LOOP,
        X_MAILING_LIST version=2.55

I'm sure I don't have to show you the email to convince you that it's spam. Looking at my spam folder, I can extract 203 unique IP addresses (311 received emails) out of which 71 are *not* listed by list.dsbl.org. I call that impressive. Feel free to come up with your own numbers using your own received email.

Now the question again: why does debian-admin and/or listmaster oppose running this in warning mode? That'd be a much more accurate statistic since post-facto I can't tell if the IPs were added after observing the spam I'm testing with now, or if they were already present at the moment of reception.

Marcelo
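The lookup in the message above works by reversing the octets of the connecting IP address and appending the DNSBL zone. As a rough sketch of just that step in POSIX shell (the helper names `dnsbl_name` and `dnsbl_listed` are made up for illustration, and the second one assumes a `host` command that exits non-zero when the name does not resolve):

```sh
#!/bin/sh
# Build the DNSBL query name for an IPv4 address: reverse the four
# octets and append the zone (list.dsbl.org in the message above).
dnsbl_name() {
    dnsbl_zone=$2
    dnsbl_old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$dnsbl_old_ifs
    printf '%s.%s.%s.%s.%s\n' "$4" "$3" "$2" "$1" "$dnsbl_zone"
}

# Succeed if the address resolves inside the zone, i.e. it is listed.
# This does a real DNS query, so it is only defined here, not called.
dnsbl_listed() {
    host "$(dnsbl_name "$1" "$2")" >/dev/null 2>&1
}

dnsbl_name 4.65.90.175 list.dsbl.org   # prints 175.90.65.4.list.dsbl.org
```

Keeping the name construction separate from the DNS query makes it possible to check the reversal offline before pointing it at a real list.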
Re: Every spam is sacred
On Mon, Jun 16, 2003 at 11:37:00AM +0200, Jesus Climent wrote: The bandwidth reduction will only happen if you decide to discard the mail, since the mail will always be accepted, scanned to find the IP which originated the message, the IP will be checked against the database and then the mail will be tagged. The IP that's checked in the DSBL is the one of the machine opening the connection to the MTA running at murphy/master. You don't actually scan the email headers. You are right that the bandwidth is decreased only if the mail is rejected instead of being just tagged, of course. The reduction happens in the output, but the load might increase in the server. Bandwidth is much more expensive than CPU time. The bandwidth required by the DNS lookups can also be reduced by maintaining a local cache. Marcelo
Re: Every spam is sacred
On Mon, 16 Jun 2003 19:37, Jesus Climent wrote: account via Debian machines, I guess the reduction in bandwidth usage by master and murphy is not to be taken lightly. The bandwidth reduction will only happen if you decide to discard the mail, since the mail will always be accepted, scanned to find the IP which originated the message, the IP will be checked against the database and then the mail will be tagged. The usual implementation of DNSBL systems is to check the client IP address as soon as the TCP connection comes in, and disconnect with an appropriate error message if it's an IP address that you don't like. That saves a lot of bandwidth, and the mail is not discarded, it is left on the sending machine to be bounced in an appropriate manner back to the sender. -- http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark http://www.coker.com.au/postal/ Postal SMTP/POP benchmark http://www.coker.com.au/~russell/ My home page
Re: Every spam is sacred
Russell Coker [EMAIL PROTECTED] wrote: On Mon, 16 Jun 2003 19:37, Jesus Climent wrote: account via Debian machines, I guess the reduction in bandwidth usage by master and murphy is not to be taken lightly. The bandwidth reduction will only happen if you decide to discard the mail, since the mail will always be accepted, scanned to find the IP which originated the message, the IP will be checked against the database and then the mail will be tagged. The usual implementation of DNSBL systems is to check the client IP address as soon as the TCP connection comes in, and disconnect with an appropriate error message if it's an IP address that you don't like. That saves a lot of bandwidth, and the mail is not discarded, it is left on the sending machine to be bounced in an appropriate manner back to the sender. I think you missed the point: tagging mails is fine with everybody, but blocking/discarding/bouncing mails is not (that can be done later, by each user, with their procmailrc). So, as only tagging will be done, no bandwidth will be saved. -- Mathieu Roy Homepage: http://yeupou.coleumes.org Not a native english speaker: http://stock.coleumes.org/doc.php?i=/misc-files/flawed-english
Re: Every spam is sacred
On Mon, Jun 16, 2003 at 11:37:00AM +0200, Jesus Climent wrote: Given the amount of spam that I get delivered to my account via Debian machines, I guess the reduction in bandwidth usage by master and murphy is not to be taken lightly. The reduction happens in the output,

Which is secondary, insignificant, right? Here are some statistics for the BTS:

    month    mails received  mails passed onto |receive
    2003-04  476 MB          66 MB
    2003-05  854 MB          79 MB
    2003-06  455 MB          36 MB

but the load might increase in the server.

Watch the oidentds and bugscans on master some time... there are much better ways to save machine load.

-- 2. That which causes joy or happiness.
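To put the BTS figures above in proportion, the share of received mail that was passed on to |receive can be worked out directly from the quoted numbers. A small sketch (the function name `passed_pct` is only illustrative; the inputs are the MB figures from the message):

```sh
#!/bin/sh
# Percentage of received mail volume that was passed on to |receive.
# Arguments: MB received, MB passed on.
passed_pct() {
    LC_ALL=C awk -v recv="$1" -v passed="$2" \
        'BEGIN { printf "%.1f\n", 100 * passed / recv }'
}

passed_pct 476 66   # 2003-04, prints 13.9
passed_pct 854 79   # 2003-05, prints 9.3
passed_pct 455 36   # 2003-06, prints 7.9
```

On these numbers only about 8 to 14 percent of the received volume went on to |receive in each month.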
Re: Every spam is sacred: tagging mails because of their content or their supposed origin?
On Mon, Jun 16, 2003 at 10:03:45AM +0200, Josip Rodin wrote: On Sun, Jun 15, 2003 at 11:19:10PM -0400, Duncan Findlay wrote: FWIW, the next version of spamassassin (2.60) will have no forgeable negatively scoring rules. (ETA early-mid July) Just out of curiosity, how will this be accomplished? The only negative rules will be: bayesian rules, bondedsender and habeas. Figuring out how to autolearn ham (non-spam) is the only obstacle we still need to overcome. -- Duncan Findlay
Re: Every spam is sacred
Mathieu Roy wrote: Manoj Srivastava [EMAIL PROTECTED] wrote: The 127th Ferengi rule of acquisition: Even if you got it for free, you paid too much. But Rule 37 says otherwise: If it's free, take it and worry about hidden costs later. But the 96th confirms that For every Rule, there is an equal and opposite Rule, (except when there's not). interesting. i found this: http://www.dmwright.com/html/ferengi.htm Rule 127 » Stay neutral in conflict so that you can sell supplies to both sides. Manoj, perhaps you were thinking of this one: (same source) Rule 011 » Even if it's free, you can always buy it cheaper. -john
Re: Every spam is sacred: tagging mails because of their content or their supposed origin?
On Mon, 16 Jun 2003, Duncan Findlay wrote: The only negative rules will be: bayesian rules, bondedsender and habeas. Figuring out how to autolearn ham (non-spam) is the only obstacle we still need to overcome. Sure sounds like throwing the baby out with the bathwater... but I presume you all are running statistics on email distributions... Don Armstrong -- Any excuse will serve a tyrant. -- Aesop http://www.donarmstrong.com http://www.anylevel.com http://rzlab.ucr.edu
Re: Every spam is sacred
On Friday 13 June 2003 05:13 pm, Don Armstrong wrote: Oh, what the hell. This damn song won't get out of my head, so now you all get to be subjected to it too[1]: FWIW, the original version of this song has also been in my head for weeks. Thanks for digging up the full text :) 1: Misery loves company. Ad infinitum.
Re: Every spam is sacred
On Mon, 16 Jun 2003 16:18:49 +1000, Russell Coker [EMAIL PROTECTED] said: On Mon, 16 Jun 2003 15:06, Manoj Srivastava wrote: There is no excuse for this. Access to servers that are not in spam lists is well available to Debian developers. I tunnel my outgoing mail through a server in Melbourne no matter where I am, this avoids all issues of spam blocking by IP address. I offered accounts on a choice of machines to be used for such purposes for any Debian developers who have no better options, but so far no-one has taken me up on this offer. I refuse to allow spammers this victory. My machines are fully capable members of the internet, and I deliver my own email. My philosophy is that if people drop mail from me due to incompetence (since setting up machines to classify email from me as spam is indeed a misconfiguration), then I have no real desire to impose my musings on them. If your machines are fully capable then they will have permanent IP addresses and no spam ever coming from them. In which case they will never get listed in a DNSBL (*) and there is nothing to be concerned about. They are, when I send mail from home, or when I travel to the office. However, I often send mail from the road (or the internet cafe) where dhcp often rules. manoj -- The only way for a reporter to look at a politician is down. H.L. Mencken Manoj Srivastava [EMAIL PROTECTED] http://www.debian.org/%7Esrivasta/ 1024R/C7261095 print CB D9 F4 12 68 07 E4 05 CC 2D 27 12 1D F5 E8 6E 1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: Every spam is sacred: tagging mails because of their content or their supposed origin?
On Mon, Jun 16, 2003 at 04:43:53PM -0400, Don Armstrong wrote: On Mon, 16 Jun 2003, Duncan Findlay wrote: The only negative rules will be: bayesian rules, bondedsender and habeas. Figuring out how to autolearn ham (non-spam) is the only obstacle we still need to overcome. Sure sounds like throwing the baby out with the bathwater... but I presume you all are running statistics on email distributions... Eventually, spammers will forge any test they can. (This of course presumes that spamassassin is a big problem for spammers.) It's extreme, but necessary. All the spamassassin scores are generated with a genetic algorithm using results from about 150k spam and 150k non-spam. The scores will naturally be adjusted to compensate for the lack of negative scoring rules. Anyways, this is quite OT for debian-devel (although so is the vast majority of this thread). -- Duncan Findlay
Re: Every spam is sacred
By the way, some folks live in countries considered spam countries by other people, and they can't get an email in edgewise to the high class users. By the way, how about my http://jidanni.org/comp/spam/spamdealer.html solution for the little guy, remote and without root. -- http://jidanni.org/ Taiwan(04)25854780
Re: Every spam is sacred: tagging mails because of their content or their supposed origin?
Duncan Findlay wrote: The only negative rules will be: bayesian rules, bondedsender and habeas. Figuring out how to autolearn ham (non-spam) is the only obstacle we still need to overcome. This is fairly off topic, but the other day I tired of downloading all my spam to check it for false positives, and so I whipped up a script to produce an index of a mailbox, with author and subject lines and sorted by SA score, and cronned it so I'd be mailed a daily summary to peruse. Today I noticed those summaries were getting spamassassin scores in the 30 range. I ended up whitelisting myself, though that doesn't feel like a good idea -- now SA might mislearn spam subjects as ham, and any spammer who forges mail from me will probably get through. Aside from bypassing SA entirely for local mail, is there any better approach? -- see shy jo
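For readers wondering what such a mailbox index might look like, here is a rough sketch. It is not the script mentioned in the message above, only an illustration under stated assumptions: the mailbox is in mbox format, each message carries an unfolded `X-Spam-Status:` header with a `hits=` field (the form shown elsewhere in this thread for version 2.55), and header names are capitalised exactly as written. The function name `sa_index` is hypothetical.

```sh
#!/bin/sh
# Print "score<TAB>from<TAB>subject" for every message in an mbox file,
# highest SpamAssassin score first. Messages without a score count as 0.
sa_index() {
    sa_tab=$(printf '\t')
    LC_ALL=C awk '
        function emit() {
            if (seen) printf "%s\t%s\t%s\n", score, from, subject
        }
        # An mbox "From " separator line starts a new message.
        /^From / {
            emit()
            seen = 1; inhdr = 1
            score = "0"; from = ""; subject = ""
            next
        }
        # The first blank line ends the header block.
        inhdr && /^$/ { inhdr = 0; next }
        inhdr && /^From:/ { from = substr($0, 7) }
        inhdr && /^Subject:/ { subject = substr($0, 10) }
        inhdr && /^X-Spam-Status:/ {
            if (match($0, /hits=-?[0-9.]+/))
                score = substr($0, RSTART + 5, RLENGTH - 5)
        }
        END { emit() }
    ' "$1" | LC_ALL=C sort -t "$sa_tab" -k1,1nr
}
```

Running `sa_index ~/mail/spam` (path hypothetical) from cron would give a daily summary of the kind described. Because that summary is itself full of spam subject lines, it is easy to see why SA would score it highly.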
Re: Every spam is sacred
On Mon, 16 Jun 2003 12:11, Theodore Ts'o wrote: false positive rate of as high as 2 per day by some estimates, do we as a body consider it acceptable if some percentage of Debian developers: 1) Don't receive a mail message from a fellow Debian developer because they unfortunately got caught by a false-positive (perhaps they got renumbered onto a bad SPAM address, or they were roaming on a wireless from a conference or during business travel) and important mail that related to Debian business gets lost? There is no excuse for this. Access to servers that are not in spam lists is well available to Debian developers. I tunnel my outgoing mail through a server in Melbourne no matter where I am, this avoids all issues of spam blocking by IP address. I offered accounts on a choice of machines to be used for such purposes for any Debian developers who have no better options, but so far no-one has taken me up on this offer. The technical ability to perform such tunneling is assumed; compared to Debian development tasks, tunneling TCP connections is trivial. Your point about mail from users is fair, but as every Debian developer has the ability to notice DNSBLs and work around them there should be no problem in that regard. -- http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark http://www.coker.com.au/postal/ Postal SMTP/POP benchmark http://www.coker.com.au/~russell/ My home page
Re: Every spam is sacred: tagging mails because of their content or their supposed origin?
On Sun, Jun 15, 2003 at 07:45:02PM +0200, Santiago Vila wrote: Mathieu Roy wrote: But I definitely find spamassassin conceptually much better - because it really takes a mail for what it is. It cannot be trapped. Because if the DNSBL one day become a major problem to spammers, who knows what kind of methods they may use to attack them. A spamassassin rule is much easier to fool than an IP address. Not long ago there was a lot of spam which was PGP-signed. FWIW, the next version of spamassassin (2.60) will have no forgeable negatively scoring rules. (ETA early-mid July) -- Duncan Findlay