Re: Adding SpamAssassin Headers to IETF mail
Dean, --On 17. desember 2003 16:01 -0500 Dean Anderson <[EMAIL PROTECTED]> wrote: This is ridiculous. The IETF is not getting a lot of spam, so adding SpamAssassin headers is a solution in need of a problem. the reason you don't see a lot of spam on IETF lists is because it's sent to the list administrators, and they filter it by hand. The chief beneficiaries of automatic spam detection and deletion in the current IETF setup is the list administrators.
SA / Spam. Facts.
These are the facts. On Wednesday 17 December 2003 16:01, Dean Anderson wrote: > This is ridiculous. The IETF is not getting a lot of spam, so adding > SpamAssassin headers is a solution in need of a problem. "a lot" is a subjective term. Also, unless you are sniffing the traffic into our network, how would you know how much spam our MX receives? A rough approximation is that 1/3 of the mail into the IETF MX is spam. This estimate is based on a small sample. If a more accurate number is needed, please submit a request to the tracking system for prioritizing in the queue of IETF things to do. Some spam we already filter out without SpamAssassin. For example... CC'ing mail to ietf-announce (as two of your posts did) gets caught in our spam filter because it is not appropriate on that mailing list. > [EMAIL PROTECTED] wrote: > > ...this implementation is to allow the IETF community to get used > > to having these headers in the messages, and allow us to make any > > changes to the filtering rules. > The above seems like a thinly veiled attempt to make SpamAssassin headers > a de facto standard supported by the IETF, without going through the > standards process. It may seem that way to you, but in reality it isn't. It was just me deciding to use it because it worked well with exim, it was quick to set up, seemed to perform the task well, didn't need a lot of human intervention, and it could be tuned. Oh, and it's free, so the IETF could afford it. Mr. Anderson continued > Obviously, if the goal is to standardize these headers, then a standard > can be produced and put through the standards process. The goal is to reduce spam, and reduce the human intervention needed to reduce spam. These are the facts. --Brett
Re: Adding SpamAssassin Headers to IETF mail
On Wed, 17 Dec 2003, Harald Tveit Alvestrand wrote: > --On 17. desember 2003 16:01 -0500 Dean Anderson <[EMAIL PROTECTED]> wrote: > > This is ridiculous. The IETF is not getting a lot of spam, so adding > > SpamAssassin headers is a solution in need of a problem. > > the reason you don't see a lot of spam on IETF lists is because it's sent > to the list administrators, and they filter it by hand. > > The chief beneficiaries of automatic spam detection and deletion in the > current IETF setup is the list administrators. .. which do not/cannot use SpamAssassin to filter the bounces(?) -- Pekka Savola, Netcore Oy. Systems. Networks. Security. "You each name yourselves king, yet the kingdom bleeds." -- George R.R. Martin: A Clash of Kings
Re: SA / Spam. Facts.
Brett Thorson wrote: The goal is to reduce spam, and reduce the human intervention needed to reduce spam. Right. I support the secretariat's efforts to reduce spam and associated management effort on the IETF lists. Personally, I have had a good experience with SpamAssassin, so to me the technical arrangement looks quite reasonable. As for the rest, let's all remember that there may be no fast, perfect, and inexpensive solutions. It may not be reasonable to require that the false positive rate is zero. We certainly can't replace spam detecting tools with e-mail signatures and have it operational tomorrow. And if the IETF has unused cash, there may be better uses for it than paying for spam detection software. --Jari
Re: Adding [ietf] considered harmful
IETFers- A tag in the subject line is clearly overdue. But, if we're going to do it, let's do it right. Please use "[IETF]" not "[ietf]" because it's more befitting of a proper acronym. If people would really rather defile the IETF's good name by calling it the "ietf" then maybe we could extend the mailing list software to allow each user to define their own "turd" to place in the subject line of mail they get from the IETF list. Thank you, allman
Re: Adding [ietf] considered harmful
Mark Allman wrote: A tag in the subject line is clearly overdue. But, if we're going to do it, let's do it right. Please use "[IETF]" not "[ietf]" because it's more befitting of a proper acronym. Just what we need, a mailing list that SHOUTS. (Then again, for this list, maybe it constitutes fair warning...) -- /===\ |John Stracke |[EMAIL PROTECTED]| |Principal Engineer|http://www.centive.com | |Centive |My opinions are my own. | |===| |"Music is not a noun, it's a verb." --John Perry Barlow| \===/
Re: Adding SpamAssassin Headers to IETF mail
Harald Tveit Alvestrand <[EMAIL PROTECTED]> wrote: > > the reason you don't see a lot of spam on IETF lists is because it's > sent to the list administrators, and they filter it by hand. Clearly, this cannot continue (unless we come up with some way to pay people to perform this service). > The chief beneficiaries of automatic spam detection and deletion in the > current IETF setup is the list administrators. I am really in no position to criticize the use of SpamAssassin. I started using it for my personal account just before I left for IETF-58, and have little hope of turning it off. (It flags as spam roughly 4,000 emails per week.) But I think we should stop short of endorsing it. It is, frankly, wrong to propagate to the list any email which we consider to be likely spam. We should instead come up with a way to verify/authenticate/intuit/whatever that it is an individually-written message considered to be on-topic by some person we have no reason to distrust. SpamAssassin is a technical marvel -- and I suspect it could be useful as a sorting tool to distinguish messages which deserve to be distributed immediately vs. messages which need further verification. But that further verification should be done _before_ anything is distributed to the list. If the SpamAssassin filtering were applied _during_ the SMTP session to ietf.org and a descriptive error (with URL) was returned (rather than "250 - OK"), then we would have done everything we reasonably could to notify an honest sender that we needed further verification. (And, of course, any other content-processing tool could be used instead of SpamAssassin -- indeed I'm not sure any useful purpose is served by publishing which particular content-assessment tool we use.) If we can't process during the SMTP session, then -- as a short-term stopgap -- it is reasonable to flag messages for some automated processing before distributing to the list. 
(None of this is to criticize anyone who runs SpamAssassin at their own site to apply more rigorous rules -- I'm probably doing so myself, even if unintentionally.) What I do wish to call into question is the wisdom of passing the SpamAssassin headers to the list. I believe it creates the potential for confusion as to what is or is not a legitimate message. -- John Leslie <[EMAIL PROTECTED]>
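The SMTP-time handling John Leslie describes above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `smtp_reply`, the score threshold of 5.0, and the URL are hypothetical and are not taken from the secretariat's actual configuration.

```python
def smtp_reply(score, threshold=5.0,
               info_url="https://example.org/why-rejected"):
    """Pick the reply sent at the end of the SMTP DATA phase.

    `score` stands for whatever number the content filter produced;
    `threshold` and `info_url` are assumptions made for this sketch.
    """
    if score >= threshold:
        # Refuse during the session, with a descriptive error and a URL,
        # so an honest sender learns right away that more is needed.
        return 550, "Message needs further verification; see " + info_url
    # Otherwise accept the message for normal distribution.
    return 250, "OK"


if __name__ == "__main__":
    print(smtp_reply(7.2))
    print(smtp_reply(1.0))
```

The point of answering inside the session is that the sending server, not the list, becomes responsible for telling the author what happened.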
Re: Adding SpamAssassin Headers to IETF mail
the reason you don't see a lot of spam on IETF lists is because it's sent to the list administrators, and they filter it by hand. The chief beneficiaries of automatic spam detection and deletion in the current IETF setup is the list administrators. I'm one of those list administrators and I can attest that having spam flood the review queues of the mailing lists is a huge problem. It's not terribly unusual for the review queue of some lists to get so large that you can't download and resubmit Mailman's review page without crashing the web browser (and I've tried several different browsers on different platforms). but despite first-hand experience with the problem I'm still worried about using SpamAssassin - I've seen it block too many legitimate messages.
Re: Adding [ietf] considered harmful
Maybe we should also rewrite the From header field so that people with dysfunctional MUAs won't have trouble replying to the list? Maybe we should also rewrite the Reply-to field so that it doesn't matter when people get confused about the difference between reply to author and reply all? And let's be sure to rewrite the To field so that everyone who uses that field to collate list traffic will get the messages put in the right bin. It's taken us ~15 years to get rid of "features" like those mentioned above that well-meaning authors of list software put in to work around lack of user agent functionality. We're still not rid of all of it. I used to call it "header munging disease" - the idea that if a message passes through an intermediary, there's a strong temptation for the author of that intermediary to consider munging every header field (as well as the message itself) just in case it could somehow be useful to somebody -- never mind that this removes valuable information from the message, and reduces everyone's ability to use the messages to that of the least capable MUA. Putting [foo] in the subject header is just another example of this trend. Sure, it might be useful to people with dysfunctional MUAs, and there are a lot of those people out there. There were once a lot of people whose MUAs couldn't do "reply all", too. The short-term solution is to make [foo] an optional feature specified on a per-recipient basis. The long-term solution is to fix the MUAs to recognize and do appropriate things with List-* fields. A tag in the subject line is clearly overdue. But, if we're going to do it, let's do it right. Please use "[IETF]" not "[ietf]" because it's more befitting of a proper acronym. If people would really rather defile the IETF's good name by calling it the "ietf" then maybe we could extend the mailing list software to allow each user to define their own "turd" to place in the subject line of mail they get from the IETF list.
Re: Adding SpamAssassin Headers to IETF mail
Keith, the reason the secretariat is doing this in stages is exactly because we want to see how big the false-positive issue is. I currently personally use SpamAssassin 2.60 with Bayesian filtering and close-to-default rules; it seems to run at a very low rate of false positives. --On 18. desember 2003 09:40 -0500 Keith Moore <[EMAIL PROTECTED]> wrote: the reason you don't see a lot of spam on IETF lists is because it's sent to the list administrators, and they filter it by hand. The chief beneficiaries of automatic spam detection and deletion in the current IETF setup is the list administrators. I'm one of those list administrators and I can attest that having spam flood the review queues of the mailing lists is a huge problem. It's not terribly unusual for the review queue of some lists to get so large that you can't download and resubmit Mailman's review page without crashing the web browser (and I've tried several different browsers on different platforms). but despite first-hand experience with the problem I'm still worried about using SpamAssassin - I've seen it block too many legitimate messages.
Never-ending arguments about mailing lists considered harmful (was: Re: Adding [ietf] considered harmful)
Keith and others, While... (1) I agree that this (and any SpamAssassin or other header-insertion or filtering) would, ideally, better be done as a per-subscriber optional feature, and (2) I recognize that, if for some reason (unfathomable to me, but there is no accounting for taste), people encapsulate messages in message/rfc822 body parts and then sign them (or archive hashes of messages including the headers), any modification of the encapsulated message would wreak havoc, and (3) I've got an MUA (and an MTA) that are capable of filtering on Return-path and/or List-* and/or recipient (including subaddress) fields, there are three things about this discussion that bother me... (i) A number of efforts within the community have pointed to the advantages of having more routine work done in a routine and automated way by the secretariat. Since the secretariat is operating with very tight resources (something else that has been in enough documents and presentations that I assume/hope everyone knows), it is to _our_ advantage to let them automate anything they can sensibly automate without causing _severe_ problems. Conversely, asking for things that might take large amounts of time and energy (such as per-user setting of tag fields or application of spam filtering), is, IMO, pretty lousy prioritization. (ii) Even with powerful filtering and organizing tools, some of us prefer (as a matter of taste) to not have, e.g., one folder or color per mailing list or other correspondent. For us, a subject line indicator of source makes it easier to organize things cognitively. Is it a big deal one way or the other? Not for me at least; I can't speak for others. But it is helpful to some of us, regardless of what the MTA or MUA may or may not be able to do. And that makes me (at least) a little intolerant of people starting religious wars that, themselves, consume large amounts of (human as well as network) bandwidth, if only because... 
(iii) I am, personally, getting concerned that the IETF is approaching the point where we are more concerned about process and administration than we are about doing high-quality design and engineering and getting high-quality results out. I don't think we are there yet, and I think the trends in that direction are still reversible, but I take * the relative amount of energy the community seems willing to spend discussing two, essentially trivial, changes to mailing list management, or * the fine details (rather than broad issues) of a process WG charter, or * heated arguments about proposals for which most of the people actively participating in the discussions have clearly not read the relevant documents, or * IESG being willing to tie up Proposed Standards (or even lower-maturity documents) in order to make sure that all of the grammatical and procedural niceties are adhered to, or probably several other things that belong on that list... as symptoms of serious and deep problems with our priorities and how we do business. For the record, before I'm quoted out of context (as I probably will be anyway), our copying procedures from SDOs that have become much more procedure-bound, so much so that they often appear to no longer care about quality or adoption or interoperability of standards as long as the many procedural rules are followed to the letter and they can report getting more standards out one year than in the previous one would not, IMO, be a good idea ... indeed, it would be closer to the height of stupidity. To make a distinction that may be useful before you (or someone else) replies, if you (or someone else) wants to get on a tear about NATs, I may or may not agree with you, and I may or may not believe that the flaming the topic tends to generate will result in any real progress or changes in behavior, but at least I'm sure the issue is important to the future of the Internet. 
Can you say the same for whether the Secretariat and its mailing list machinery adds (or does not add) a few headers to a message or a few characters to a subject line ... assuming they don't _break_ conforming software used in a rational way (e.g., with the robustness principle in mind)? And, if the answer is "no", is there any hope of increasing the ratio of meaningful technical standards work to this sort of debate around here? regards, john --On Thursday, 18 December, 2003 09:58 -0500 Keith Moore <[EMAIL PROTECTED]> wrote: Maybe we should also rewrite the From header field so that people with dysfunctional MUAs won't have trouble replying to the list? Maybe we should also rewrite the Reply-to field so that it doesn't matter when people get confused about the difference between reply to author and reply all? And
Re: Adding SpamAssassin Headers to IETF mail
On Thu, 18 Dec 2003, Keith Moore wrote: > I'm one of those list administrators and I can attest that having spam > flood the review queues of the mailing lists is a huge problem. Ahh. Mail from non-subscribers that has to be reviewed. SpamBayes or other content filters would be a far better approach to this problem, and they don't have the feature of revenge. Also, sorting out pre-existing subject lines and pre-existing message-id's that have been seen already on the list is probably useful. > It's not terribly unusual for the review queue of some lists to get so > large that you can't download and resubmit Mailman's review page without > crashing the web browser (and I've tried several different browsers on > different platforms). A better, faster user interface could be useful. I think you can have Mailman messages sent to an IMAP store for approval, where you can sort them into different folders based on certain criteria (like replies), and use faster user interfaces to forward them to the list. It is strange that it crashes your web browser. I've used web browsers (Netscape and IE) with a Call Accounting system, which shows Call Detail Record pages with 20,000 records per page, and IE and Netscape can load it. > but despite first-hand experience with the problem I'm still worried > about using SpamAssassin - I've seen it block too many legitimate > messages. Mostly, this is due to the revenge oriented blacklists that it uses. --Dean
Re: Adding SpamAssassin Headers to IETF mail
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Harald Tveit Alvestrand wrote: | Keith, | | the reason the secretariat is doing this in stages is exactly because we | want to see how big the false-positive issue is. | | I currently personally use SpamAssassin 2.60 with Bayesian filtering and | close-to-default rules; it seems to run at a very low rate of false | positives. | In my experience, lots of false positives from spamassassin+bayesian are the result of the user making false assumptions about the linearity of the spamassassin point-scale. Regards, leifj -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQE/4eT08Jx8FtbMZncRAnw6AJ46QS2hwy6KVNFtGwnKLtNZbjvEeACgtckS ad+Jaiv8wJPjix7MV9NS0SE= =e3+E -----END PGP SIGNATURE-----
Re: Adding [ietf] considered harmful
> "From" lines and "Reply-to" and whatever are headers that are meant to > be processed by computers. So, you can say all you want about how > dumb MUAs do or do not process these (and how intermediate mail > servers should keep their mitts off). Now, humans use these lines, > too. So, call them dual use. > > The subject line, on the other hand, is just for people. Book titles are for people, too. Does that mean that it's okay for a bookseller or library to change the titles on books, in order to help the consumer identify where they came from? I'm a bit surprised at the frequency at which people who claim to be networking protocol engineers fail to appreciate the benefits of clean separation-of-function and layering.
Hashing spam
I am working on an approach to block spam using a database of hash (md5) strings of spam email: 1) Report a "verified" spam to the database server on the web. 2) The mail client checks incoming mail, generates a hash string, sends it to the server and checks whether it is present there; if yes, it blocks the email. 3) Download a hot list to block directly on the machine. I don't know if it's a good or bad idea. --giuseppe
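The three steps above can be sketched as follows. This is a minimal illustration, assuming an in-memory stand-in for the database server; the class `SpamHashDB` and its method names are hypothetical and do not describe any existing service.

```python
import hashlib


def message_hash(body):
    """Return the MD5 hex digest of a message body (the 'hash string')."""
    return hashlib.md5(body.encode("utf-8")).hexdigest()


class SpamHashDB:
    """Hypothetical stand-in for the shared database server."""

    def __init__(self):
        self._hashes = set()

    def report(self, body):
        # Step 1: a verified spam is reported to the server.
        self._hashes.add(message_hash(body))

    def is_spam(self, body):
        # Step 2: the client hashes incoming mail and asks the server.
        return message_hash(body) in self._hashes

    def hot_list(self):
        # Step 3: the client downloads a snapshot for local blocking.
        return frozenset(self._hashes)


if __name__ == "__main__":
    server = SpamHashDB()
    server.report("Buy cheap loans now!")
    local = server.hot_list()
    print(server.is_spam("Buy cheap loans now!"))
    print(message_hash("An ordinary on-topic message") in local)
```

Only an exact copy of a reported message matches here; any change to the body yields a different digest.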
Re: Hashing spam
On 18 Dec 2003, at 13:10, escom wrote: I work on an approach to block spam with a database of hash (md5) string of spam email: 1) Reporting a "verified" spam to the database server on the web 2) the mail client check incoming mail, generate a hash string send to and verify the presence on the server, is yes block email. 3) download a hot list to block directly on the machine i don't know if it's a good or bad idea. http://www.rhyolite.com/anti-spam/dcc/
Re: Hashing spam
> From: "escom" <[EMAIL PROTECTED]> > I work on an approach to block spam with a database of hash (md5) string of > spam email: > 1) Reporting a "verified" spam to the database server on the web > 2) the mail client check incoming mail, generate a hash string send to and > verify the presence on the server, is yes block email. > 3) download a hot list to block directly on the machine > > i don't know if it's a good or bad idea. The several existing implementations of something like that idea suggest that some people think it is a reasonable idea. I think it is useful but has limitations. See http://www.google.com/search?q=vipul+razor and http://cloudmark.com for one (set of?) implementation(s). See http://www.dcc-servers.net/ for another. The DCC is often used with SpamAssassin. Refusing mail that has been seen anywhere else (i.e. with non-zero DCC target counts) seems like a perfect fit for a mailing list, but I'm probably biased and so won't suggest that. I vote "no" if someone is taking a vote about trashing the Subject headers. Access to this mailing list should be more, not less difficult. Anyone who cannot figure out how to sort mail from this list based on its existing headers and rewrite Subject headers or anything else to taste is not really interested in the nominal purpose of the list. If it were practical, it would be good to require subscribers or at least contributors to prove their interest by showing they can fetch, compile, install, configure, and operate a simple TCP application such as an SMTP server. Anyone who lacks sufficient interest to do (or already have done) something like that is unlikely to have anything interesting to say to this list, except to other go-ers. 
Such a test might reduce the number of people who are interested only in non-technical issues such as the administrative work of the IETF, the nasty evil power grabbing U.N., ICANN, or legacy internet engineers widening the digital divide, or whatever else concerns the people who are prompting the continued statements of the painfully obvious. (For some reason perhaps related to procmail, I'm not seeing the questions that prompt the obvious answers. It would be swell if those offering the answers would desist.) Anyone who cannot find a usable POP3 or SMTP server with which to subscribe to this list would certainly be better served and better serve the rest of us by using the web pages of its archive. See http://www.ietf.org/mail-archive/ietf/Current/maillist.html Vernon Schryver [EMAIL PROTECTED]
Re: Adding [ietf] considered harmful
On Thu, 18 Dec 2003 13:07:24 -0500 Keith Moore <[EMAIL PROTECTED]> wrote: > I'm a bit surprised at the frequency at which people who claim to be > networking protocol engineers fail to appreciate the benefits of clean > separation-of-function and layering. Hopefully the drawbacks are appreciated also. Quoting Rich Seifert, "Layering makes a good servant, but a poor master. Use layering to organize the way you THINK about networks, but don't let it restrict how you DESIGN networks." If I recall correctly, David Clark used to say something very similar to this in a protocol workshop class at Interop a while back. John
Re: Tag, You're It!
Thus spake "John Stracke" <[EMAIL PROTECTED]> > Modifying the Subject: line is a Bad Thing; it invalidates digital > signatures. We're never going to get widespread use of signed email as > long as we have pieces of mail infrastructure munging messages to make > signatures useless. Signed email already gets mangled by the ietf mail servers (AFAICT), so what's one more bad idea in the mix? I can't believe this topic is even being debated. Filtering has been a standard feature of every MUA I've used for over 10 years, including my current PDA and webmail systems. IMHO, the problems (listed by others) with this proposal grossly outweigh the complaints of a couple people who refuse to use a modern MUA or can't figure out how to configure said MUA to filter on the Sender header. S -- Stephen Sprunk, CCIE #3723, K5SSS. "God does not play dice." --Albert Einstein. "God is an inveterate gambler, and He throws the dice at every possible opportunity." --Stephen Hawking smime.p7s Description: S/MIME cryptographic signature
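As a rough illustration of the header-based filtering discussed in this thread (sorting on Sender or List-* fields instead of a Subject tag), here is a small sketch using Python's standard `email` package. The sample message, the addresses, the folder names and the function `folder_for` are all hypothetical.

```python
from email import message_from_string

# A made-up message; the addresses and list identifier are placeholders.
RAW = """\
From: someone@example.org
Sender: owner-ietf@example.org
List-Id: IETF-Discussion <ietf.example.org>
Subject: Tag, You're It!

body text
"""

# Each rule: (header name, substring to look for, destination folder).
RULES = [
    ("List-Id", "ietf.example.org", "lists/ietf"),
    ("Sender", "owner-ietf@", "lists/ietf"),
]


def folder_for(raw_message, rules):
    """Return the folder of the first rule whose header matches."""
    msg = message_from_string(raw_message)
    for header, needle, folder in rules:
        value = msg.get(header, "")
        if needle.lower() in value.lower():
            return folder
    return "INBOX"


if __name__ == "__main__":
    print(folder_for(RAW, RULES))
```

Nothing in the Subject line is consulted, so the sorting works whether or not the list adds a tag.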
Re: Hashing spam
escom wrote: I work on an approach to block spam with a database of hash (md5) string of spam email: 1) Reporting a "verified" spam to the database server on the web 2) the mail client check incoming mail, generate a hash string send to and verify the presence on the server, is yes block email. 3) download a hot list to block directly on the machine It's been done, and the spammers have already evolved to get around it: they randomize the messages so that the hashes don't match. -- /=\ |John Stracke |[EMAIL PROTECTED] | |Principal Engineer|http://www.centive.com| |Centive |My opinions are my own. | |=| |"No, no, that's *not* a boat, that's Queen Victoria."| \=/
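The evasion described above is easy to reproduce with an exact MD5 digest. The sketch below also shows a toy normalization step to illustrate why less naive checksums can survive some randomization; that normalization is an assumption made purely for illustration and is not how any real checksum service is known to work.

```python
import hashlib


def exact_hash(body):
    """MD5 over the raw body: any change at all gives a new digest."""
    return hashlib.md5(body.encode("utf-8")).hexdigest()


def normalized_hash(body):
    """Toy fuzzy variant: lowercase, drop tokens that contain digits."""
    words = [w for w in body.lower().split()
             if not any(c.isdigit() for c in w)]
    return hashlib.md5(" ".join(words).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    original = "Buy cheap loans now!"
    randomized = "Buy cheap loans now! xk42q 9f3z"
    print(exact_hash(original) == exact_hash(randomized))
    print(normalized_hash(original) == normalized_hash(randomized))
```

A spammer who knows the normalization rule can of course randomize around it, so this only raises the cost of evasion.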
Re: Hashing spam
> From: John Stracke <[EMAIL PROTECTED]> > >I work on an approach to block spam with a database of hash (md5) string of > >spam email: > ... > It's been done, and the spammers have already evolved to get around it: > they randomize the messages so that the hashes don't match. Unless you mean naive and simplistic hashes, that is an overstatement. As long as you want to accept mail from strangers, no spam filter can perfectly predict whether copies of the next message from a stranger are being sent to 30,000,000 of your intimate friends, but the various hashing filters do some good work. An estimate of the effectiveness of a large scale filter can be obtained from what it sees as the spam ratio. If it claims that 60% of all mail is spam but the real ratio is 70%, then it must be 85% effective. Concerning false positives for this mailing list--it would be wise to define what mail is legitimate. In many places, you must accept at least 99.9% of all even remotely legitimate mail. However, this context is different. Here a boolean "good/spam" is simplistic and wrong. Instead we have a spectrum: 1. on-topic messages from subscribers 2. on-topic messages from non-subscribers 3. noise from subscribers 4. noise from non-subscribers 5. pure spam such as advertisements for loan sharks In this list, only #1 is clearly "good." It is good to avoid rejecting #2, but there is surely no harm in sometimes delaying #2. If the senders of any rejected or "false positive" #2 received an informative non-delivery report so that they could retransmit, what would be the harm? SpamAssassin is reported to be better than 60% accurate. #2 is surely rare compared to #1. Thus, as long as SpamAssassin white-lists all subscribers, there would be no harm in the occasional rejection of #2. Vernon Schryver [EMAIL PROTECTED]
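The effectiveness estimate above is a simple ratio. A short sketch of the arithmetic, under the added assumption that the filter produces no false positives, so everything it flags really is spam; the function name is made up for this example.

```python
def estimated_effectiveness(flagged_ratio, real_spam_ratio):
    """Fraction of real spam that the filter catches.

    flagged_ratio:   share of all mail the filter reports as spam
    real_spam_ratio: true share of spam in all mail
    Assumes no false positives (an assumption of this sketch).
    """
    if not 0 < real_spam_ratio <= 1:
        raise ValueError("real_spam_ratio must be in (0, 1]")
    return flagged_ratio / real_spam_ratio


if __name__ == "__main__":
    # 60% flagged against a real ratio of 70%: 0.60 / 0.70, about 0.857,
    # which matches the "85% effective" figure quoted above.
    print(round(estimated_effectiveness(0.60, 0.70), 3))
```

If the filter does produce false positives, the flagged ratio overstates the spam it catches and this figure becomes an upper bound.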
Re: Tag, You're It!
Stephen Sprunk wrote: Thus spake "John Stracke" <[EMAIL PROTECTED]> Modifying the Subject: line is a Bad Thing; it invalidates digital signatures. We're never going to get widespread use of signed email as long as we have pieces of mail infrastructure munging messages to make signatures useless. Signed email already gets mangled by the ietf mail servers (AFAICT), so what's one more bad idea in the mix? Mine seems to make it. This one is (at least was) signed - I hope :-) -- Doug Royer | http://INET-Consulting.com ---|- [EMAIL PROTECTED] | Office: (208)520-4044 http://Royer.com/People/Doug |Fax: (866)594-8574 | Cell: (208)520-4044 We Do Standards - You Need Standards smime.p7s Description: S/MIME Cryptographic Signature
Re: Never-ending arguments about mailing lists considered harmful (was: Re: Adding [ietf] considered harmful)
John, Trying to make this response a brief one, and hopefully the last message I need to write on this topic for a while. 1) While I generally support reducing secretariat workload when possible, I don't think it follows that it's to our advantage to "let them automate anything they can sensibly automate without causing severe problems", particularly without taking due care in how it is done. We've had quite a few problems already with lists being subject to arbitrary censorship, and many of spamassassin's criteria have no sound justification. I should at this point re-iterate that so far nothing harmful has been done, and it does look like there's some attempt at "due care". I hope that publicizing this issue will encourage more "due care". 2) I have given several reasons for objecting to adding [xxx] to message headers, ranging from theoretical/academic arguments about separation-of-function and layering to statements of personal experience that this very practice causes problems with reading mail on small displays, with searching, etc. These are not absolutes but merely factors that people should consider rather than immediately assuming that subject munging is a good idea. 3) It's gotten to the point that almost any argument about a technical subtlety on the IETF list gets labelled a religious war. I suspect this is partly because we're straining to articulate the justification for our positions (so they look somewhat like religious arguments even when there's an underlying technical basis for them), but that's inherent in the fact that these subjects are subtle. I remember a time when we valued the exchange that helped to illuminate these subtleties and give justification for our positions, and when we did not think that this level of exchange was inappropriate or an excessive consumption of bandwidth. 
I'm not sure what has changed, but I hope it's not the case that we can no longer try to understand subtle effects of technical decisions - because I believe our inability to do that has caused the quality of our output to suffer tremendously. 4) I see the [xxx] labelling as a design issue. Even if we claim we're only designing for ourselves, it's still a concern because to me the casual attitude toward adding [xxx] reflects a lack of understanding of fundamental network protocol design principles. I see the spamassassin filtering as a process issue, but one that affects our ability to produce good designs, because I've seen several occasions where valuable input from outsiders was discarded for arbitrary reasons and the design suffered for it. John, I know you well enough to know that - You've seen more than a few problems with header munging yourself, and with munging of protocols by intermediaries in general; - You are more aware than most that the Internet is a diverse community with widely varying needs and capabilities and that it is becoming more diverse all the time; - You know enough about protocol design to appreciate the value of separation of layers in general, and of separation of function between user agent and transport in particular; and - You know enough about information storage and retrieval systems to appreciate the value in keeping data models clean. So I don't think I need to convince you of these things. If I'm talking to you specifically, I try to frame my statements with knowledge of your experience and depth in mind. When I make statements like the above on the IETF mailing list, I'm doing so for the benefit of people who don't seem to understand these things (regardless of who is in the To field), and part of my reason for doing so is to try to remedy that situation in a small way. Any good design is necessarily a compromise. 
It might be that there are cases where, _after_ considering the various factors, that adding [xxx] is a reasonable compromise, particularly for a list that operates only for a year or two - one can argue that UA capabilities won't change much while the list is in use. However such compromises are _not_ justified by statements of the form "it works for me, therefore it is good for everyone" -- particularly when the Internet is so diverse and when there's a tendency for these practices to become entrenched. It does seem like we often get bogged down in arguments between people of widely varying depths, or between people of very different kinds of expertise. In the first case there is no basis for compromise because the person who is out of his depth doesn't understand the need for compromise or the basis that makes the compromise reasonable. In the second case compromise is difficult because there is little or no common ground. I'm not sure how to resolve either kind of impasse in a reasonable fashion other than by discussion, though this does sometimes get tedious. Yes, I'd like to find a better way. At any rate, it seems difficult to get a compromise before it is clear that people understand the issues associated wit
Re: Adding SpamAssassin Headers to IETF mail
Dean Anderson wrote: > Mostly, this is due to the revenge oriented blacklists that it uses. You are aware that it's trivial to disable all the blacklist testing in the config, aren't you? SpamAssassin is extremely configurable. -- Jake Nelson
layering and separation of function
> On Thu, 18 Dec 2003 13:07:24 -0500 > Keith Moore <[EMAIL PROTECTED]> wrote: > > > I'm a bit surprised at the frequency at which people who claim to be > > networking protocol engineers fail to appreciate the benefits of > > clean separation-of-function and layering. > > Hopefully the drawbacks are appreciated also. Quoting Rich Seifert, > "Layering makes a good servant, but a poor master. Use layering to > organize the way you THINK about networks, but don't let it restrict > how you DESIGN networks." If I recall correctly, David Clark used to > say something very similar to this in a protocol workshop class at > Interop a while back. there's clearly a limit to how much layering is desirable, and it's often desirable to have a way to bypass layering in corner cases. I'm convinced that IPv4 worked better because it _didn't_ separate location and identity, than it would have otherwise, because the cost of the extra mapping layer would have been prohibitive for most of IPv4's history (and may still be prohibitive, but we're getting closer). but failure to have clean interfaces and separation of function - makes the whole system more complex and less reliable (because components can't rely on other components functioning as advertised - e.g. NATs try to second-guess apps and apps try to second-guess NATs) and - makes the system less adaptable to meet unanticipated needs (because assumptions about how things work are no longer isolated in certain parts of the system but they permeate the entire system - meaning that the entire system has to evolve rather than evolving it one piece at a time)
Re: Adding [ietf] considered harmful
> > The subject line, on the other hand, is just for people. > > Book titles are for people, too. Does that mean that it's okay for a > bookseller or library to change the titles on books, in order to help > the consumer identify where they came from? Um, my library slaps a helpful identification tag on the spine of every book to help me find it. Your analogy, man ... allman
Re: Adding [ietf] considered harmful
Keith- > Putting [foo] in the subject header is just another example of this > trend. Sure, it might be useful to people with dysfunctional MUAs, > and there are a lot of those people out there. There were once a lot > of people whose MUAs couldn't do "reply all", too. This is just wrong. "From" lines and "Reply-to" and whatever are headers that are meant to be processed by computers. So, you can say all you want about how dumb MUAs do or do not process these (and how intermediate mail servers should keep their mitts off). Now, humans use these lines, too. So, call them dual use. The subject line, on the other hand, is just for people. Sure we can make programs and filters grok them to classify mail if there is some standard format (e.g., i-d actions). But, fundamentally subject lines are for humans, not computers. So, comparing subject line munging to reply-to munging seems to me to be pretty much apples and oranges. You might read the above as supporting your point that we should not add "[ietf]" to subject lines because subject lines are not for computers (or "dysfunctional MUAs") to process. However, I think the correct interpretation is that it is OK for the mail server to add these tags **and** they may aid the entities that the subject line is actually for in the first place (humans). Hence, they are fine. allman (I cannot actually believe I am sending a non-snide comment in this thread. Someone should slap me. I read through the whole thread last night. Every message was dumberer than the previous one (probably including this one!). I was literally laughing out loud. I cannot believe we are even having such a dumbass debate. But, it was like a wreck on the highway and I could not stop rubber-necking. If we have this much trouble about 6 characters in the subject line then we might as well forget that problem statement thingy. Really.)
Re: Hashing spam
The problem with this analysis is that it assigns greater value to contributions from subscribers than to contributions from non-subscribers. But often the failure to accept clues from "outsiders" causes working groups to do harm - and filtering messages in the #2 category increases this tendency. The occasional rejection of #2 messages can be very harmful. On Dec 18, 2003, at 3:01 PM, Vernon Schryver wrote: 1. on-topic messages from subscribers 2. on-topic messages from non-subscribers 3. noise from subscribers 4. noise from non-subscribers 5. pure spam such as advertisements for loan sharks In this list, only #1 is clearly "good." It is good to avoid rejecting #2, but there is surely no harm in sometimes delaying #2. If the senders of any rejected or "false positive" #2 received an informative non-delivery report so that they could retransmit, what would be the harm? SpamAssassin is reported to be better than 60% accurate. #2 is surely rare compared to #1. Thus, as long as SpamAssassin white-lists all subscribers, there would be no harm in the occasional rejection of #2.
What eMail is legitimate
Vernon Schryver <[EMAIL PROTECTED]> wrote: > > Concerning false positives for this mailing list--it would be wise to > define what mail is legitimate. In many places, you must accept at > least 99.9% of all even remotely legitimate mail. However, this context > is different. Here a boolean "good/spam" is simplistic and wrong. > Instead we have a spectrum: > > 1. on-topic messages from subscribers > 2. on-topic messages from non-subscribers > 3. noise from subscribers > 4. noise from non-subscribers > 5. pure spam such as advertisements for loan sharks Agreed that these categories exist. Alas, we cannot necessarily tell them apart. :^( > In this list, only #1 is clearly "good." I'd greatly prefer to avoid flame-wars about how much difference there is between #1 and #2... Personally, I consider the question pointless because we don't have any dependable way to tell them apart. Please realize how trivially easy it is to harvest poster addresses from archives and forge those as From addresses. > It is good to avoid rejecting #2, but there is surely no harm in > sometimes delaying #2. I do not agree that there is "surely no harm". (But I'd _really_ rather not argue that question.) > If the senders of any rejected or "false positive" #2 received an > informative non-delivery report so that they could retransmit, what > would be the harm? I _won't_ discuss the possible harm... But Vernon's point that a prompt non-delivery report minimizes the possible harm is an excellent one. > SpamAssassin is reported to be better than 60% accurate. #2 is surely > rare compared to #1. Thus, as long as SpamAssassin white-lists all > subscribers, there would be no harm in the occasional rejection of #2. This is where I must disagree. Whitelisting something as easily forged as the From address is simply wrong -- and if it is published rule, we're sure to see spammers forging whitelisted From addresses as their standard operating practice. 
If, OTOH, Vernon would like to whitelist the combination of From address and IP address of the sending SMTP server, that could be a very worthwhile practice, virtually immune to spammer forging. -- John Leslie <[EMAIL PROTECTED]>
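A minimal sketch of the idea in the message above: whitelist the combination of From address and sending server IP, so that a forged From address arriving from an unknown server does not match. The class name `PairWhitelist`, its methods, and the example addresses are hypothetical and not taken from any real mail software; lowercasing the whole address is a simplifying assumption.

```python
# Hypothetical illustration only: a whitelist keyed on the pair
# (From address, IP address of the sending SMTP server).
import ipaddress


class PairWhitelist:
    """Accepts a message only if the exact (sender, IP) pair is known."""

    def __init__(self):
        self._pairs = set()

    @staticmethod
    def _key(from_addr, client_ip):
        # Assumption: addresses are compared case-insensitively, and the IP
        # is parsed so that malformed input raises ValueError early.
        return (from_addr.strip().lower(), str(ipaddress.ip_address(client_ip)))

    def add(self, from_addr, client_ip):
        self._pairs.add(self._key(from_addr, client_ip))

    def is_whitelisted(self, from_addr, client_ip):
        return self._key(from_addr, client_ip) in self._pairs


wl = PairWhitelist()
wl.add("alice@example.org", "192.0.2.10")

known = wl.is_whitelisted("Alice@Example.org", "192.0.2.10")     # same pair
forged = wl.is_whitelisted("alice@example.org", "198.51.100.7")  # other server
```

Here a forger who copies the From address from the archive still fails the lookup unless the mail also arrives from the listed server, which is the property the message relies on.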
Re: What eMail is legitimate
> From: John Leslie <[EMAIL PROTECTED]> > ... >This is where I must disagree. Whitelisting something as easily > forged as the From address is simply wrong -- and if it is published > rule, we're sure to see spammers forging whitelisted From addresses > as their standard operating practice. As is true of many theories about what spammers do or will do, practice differs from (simplistic) theory. In the real world, whitelisting by sender works fine and is not abused often enough to matter. Whether it works today because it is rarely used is a secondary issue good for no more than trying to predict the future. Yes, I know that spammers often forge source addresses. I get more than my fair share of demands from lusers that I unsubscribe them from this or that stream of porn or other offensive spam. Nevertheless, such problems are trivial in this context. That reasoning involves a second error common to IETF talk about spam and mailing list noise. It is the academic pretense that all failures are of equal gravity and completely unacceptable. In this case, the failure mode that supposedly makes whitelisting by sender unacceptable is merely leaking a little spam. >If, OTOH, Vernon would like to whitelist the combination of From > address and IP address of the sending SMTP server, that could be a > very worthwhile practice, virtually immune to spammer forging. If you mean manual whitelisting, that sounds good in theory, but fails in practice. I've experience with various sorts of whitelisting, because the DCC depends on whitelists to distinguish solicited from unsolicited bulk mail. Whitelisting by IP address fails in practice because so much bulk mail comes from so many different and changing SMTP clients. For an example at the small end of the spectrum of bulk mail sources, I've had to regularly change the whitelisting for IETF mailings. Bigger legitimate bulk mailers often have too many SMTP clients for outsiders to count, not to mention manually whitelist. 
You must find other ways to whitelist them. However, whitelisting bulk mail by IP address is trivial compared to whitelisting private mail by IP address. I use greylisting (see http://www.dcc-servers.net/dcc/greylist.html ) which can be described as automated whitelisting by the triple (sender,sender-IP-address,target). It works well, but only because it is automated and it uses 4yz soft failures. Many ISPs start sending a single message from one IP address and switch to another after a few minutes--lather and repeat for up to half a dozen different IP addresses for a single message. It would be hopeless to try to manually whitelist the IP addresses used by customers of such ISPs. The ISPs that do this sort of thing are among the largest. Vernon Schryver [EMAIL PROTECTED]
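As a rough sketch of greylisting as described above (automated whitelisting by the triple sender, sender IP address, target, using 4yz soft failures), the following shows one way the bookkeeping could look. It is not the DCC implementation; the class name `Greylist`, the 300-second delay, and the choice of reply codes 451 and 250 are assumptions made for illustration.

```python
# Hypothetical illustration only; not the DCC greylisting code.
import time


class Greylist:
    """Temporarily rejects unknown (sender, IP, target) triples."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # assumed embargo in seconds
        self.clock = clock          # injectable clock, handy for testing
        self.first_seen = {}        # triple -> time of first attempt
        self.whitelist = set()      # triples that retried after the embargo

    def check(self, sender, client_ip, target):
        """Return an SMTP reply code: 250 to accept, 451 to ask for a retry."""
        triple = (sender.lower(), client_ip, target.lower())
        if triple in self.whitelist:
            return 250
        now = self.clock()
        first = self.first_seen.setdefault(triple, now)
        if now - first >= self.min_delay:
            # The client retried after the embargo: whitelist automatically.
            self.whitelist.add(triple)
            del self.first_seen[triple]
            return 250
        return 451  # 4yz soft failure; a well-behaved client will retry


# Demonstration with a fake clock so no real waiting is needed.
fake_now = [0.0]
gl = Greylist(min_delay=300, clock=lambda: fake_now[0])
first_try = gl.check("a@example.org", "192.0.2.1", "list@example.net")
fake_now[0] = 600.0
retry = gl.check("a@example.org", "192.0.2.1", "list@example.net")
other_ip = gl.check("a@example.org", "192.0.2.2", "list@example.net")
```

Note that a retry from a different IP address forms a new triple and starts its own embargo, which is why the ISP behaviour described in the message (switching IP addresses between retries) delays such mail further.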
Re: Spam
On Wed, 17 Dec 2003 23:10:43 -0500, Bill Cunningham wrote: >Now that the federal government has taken some steps in regulating spam, >does that mean that a technical need as the IETF would look for, isn't >needed? >Maybe the Spam should be forgot about. Bill, has the CMOS backup battery failed in your workstation? It is December 17, not April 1 :) Jeffrey Race
Re: Hashing spam
On Thu, Dec 18, 2003 at 03:39:58PM -0500, Keith Moore wrote: > The problem with this analysis is that it assigns greater value to > contributions from subscribers than to contributions from > non-subscribers. But often the failure to accept clues from > "outsiders" causes working groups to do harm I don't believe this is true, for any normal definition of "often". "Occasionally" might be believable. > - and filtering messages > in the #2 category increases this tendency. One could just as easily argue that such filtering would decrease the tendency, because people would modify their behavior to subscribe to groups they cared about. Also, one could just as easily argue that working groups are just as likely to be harmed by distracting comments from outsiders... > The occasional rejection > of #2 messages can be very harmful. Seems more likely to me that the amount of harm would be lost in the normal noise of ietf processes. Regards Kent > On Dec 18, 2003, at 3:01 PM, Vernon Schryver wrote: > > > 1. on-topic messages from subscribers > > 2. on-topic messages from non-subscribers > > 3. noise from subscribers > > 4. noise from non-subscribers > > 5. pure spam such as advertisements for loan sharks > > > >In this list, only #1 is clearly "good." It is good to avoid rejecting > >#2, but there is surely no harm in sometimes delaying #2. If the > >senders of any rejected or "false positive" #2 received an informative > >non-delivery report so that they could retransmit, what would be the > >harm? > > > >SpamAssassin is reported to be better than 60% accurate. #2 is surely > >rare compared to #1. Thus, as long as SpamAssassin white-lists all > >subscribers, there would be no harm in the occasional rejection of #2. -- Kent Crispin [EMAIL PROTECTED],[EMAIL PROTECTED] p: +1 310 823 9358 f: +1 310 823 8649 SIP: [EMAIL PROTECTED] "Be good, and you will be lonesome." -- Mark Twain
Re: Hashing spam
>> But often the failure to accept clues from "outsiders" causes working >> groups to do harm > I don't believe this is true, for any normal definition of "often". > "Occasionally" might be believable. If I look at why working groups do harm, the failure to accept clues from outsiders does seem to crop up "often". Of course, this is my assessment (others might read the situation differently) and I can only make this statement about the groups I've actually looked at, which is a small and nonrandom sample. > One could just as easily argue that such filtering would decrease the > tendency, because people would modify their behavior to subscribe to > groups they cared about. You're incorrectly assuming that people with clues have the time to subscribe to and follow every single group that might do something harmful. > Also, one could just as easily argue that working groups are just as > likely to be harmed by distracting comments from outsiders... You could argue that. I haven't found it to be the case. >> The occasional rejection of #2 messages can be very harmful. > Seems more likely to me that the amount of harm would be lost in the > normal noise of ietf processes. Some noise is more harmful than others. Some WGs have more potential to do harm than others, and those are the very WGs that need outside input.
Dec03: Update on administration restructuring
In following up the discussion of the IAB Advisory Committee output, on December 1: http://www.ietf.org/mail-archive/ietf-announce/Current/msg27463.html I noted that I would endeavour to post monthly updates on progress, around mid-month. It's only been 2 weeks, but I thought it was important to get the update process rolling. There have been a few comments on the AdvComm document itself, draft-iab-advcomm-00.txt, but it seems largely ready to consider finished. We're going through it to check for final consistency and editorial fixes, with a view to having a new version out shortly. In terms of follow-on, Harald and I have started to have more formative discussions -- we've met with folks from ISOC and from CNRI to begin the discussions of what we might do, moving forward, to address the concerns laid out in the AdvComm document. There's nothing conclusive to report there, yet. Leslie. -- --- "Reality: Yours to discover." -- ThinkingCat Leslie Daigle [EMAIL PROTECTED] ---
Re: What eMail is legitimate
Vernon Schryver <[EMAIL PROTECTED]> wrote: >> From: John Leslie <[EMAIL PROTECTED]> >> >> This is where I must disagree. Whitelisting something as easily >> forged as the From address is simply wrong -- and if it is published >> rule, we're sure to see spammers forging whitelisted From addresses >> as their standard operating practice. > > As is true of many theories about what spammers do or will do, practice > differs from (simplistic) theory. You're welcome to build your organization on the assumption that spammers will continue doing the same thing they do today -- I choose to design for what they're likely to do next month... > In the real world, whitelisting by sender works fine and is not abused > often enough to matter. Seems to be true today... > Whether it works today because it is rarely used is a secondary issue > good for no more than trying to predict the future. I draw a distinction between "predicting" the future and "planning for" the future. "Planning for" the future requires being ready for things which may not be the "most likely" outcome. Thus, I'm not betting on what spammers will do next month. I'm hoping to be prepared for a number of different scenarios. > Yes, I know that spammers often forge source addresses. I get more > than my fair share of demands from lusers that I unsubscribe them from > this or that stream of porn or other offensive spam. Good to know you're aware of this. > Nevertheless, such problems are trivial in this context. Today, maybe... > That reasoning involves a second error common to IETF talk about spam > and mailing list noise. It is the academic pretense that all failures > are of equal gravity and completely unacceptable. I really don't know who you're talking about here. Certainly I have said nothing remotely like that. > In this case, the failure mode that supposedly makes whitelisting by > sender unacceptable is merely leaking a little spam. 
I work on a simple-minded principle: if you want more of something, arrange to reward it. I choose not to reward spammers for forging _obvious_ From addresses. >> If, OTOH, Vernon would like to whitelist the combination of From >> address and IP address of the sending SMTP server, that could be a >> very worthwhile practice, virtually immune to spammer forging. > > If you mean manual whitelisting, No, I don't. > that sounds good in theory, but fails in practice. I've experience > with various sorts of whitelisting, because the DCC depends on > whitelists to distinguish solicited from unsolicited bulk mail. > Whitelisting by IP address fails in practice because so much bulk > mail comes from so many different and changing SMTP clients. I'd actually enjoy discussing this issue. However what I was discussing was whitelisting the _combination_ of From address and the IP address that sender normally sends from. Until senders figure out how to forge the IP addresses of sending SMTP servers, this should make the whitelist pretty safe. > For an example at the small end of the spectrum of bulk mail sources, > I've had to regularly change the whitelisting for IETF mailings. > Bigger legitimate bulk mailer often have too many SMTP clients for > outsiders to count, not to mention manually whitelist. I seriously doubt we need to worry about IETF contributors who send through more than a few dozen IP addresses. Even if we do, it's still easily automated. I see no scaling problems even with 1,000 different IP addresses a contributor sends through. > You must find other ways to whitelist them. Perhaps, if you're doing it manually... > However, whitelisting bulk mail by IP address is trivial compared > to whitelisting private mail by IP address. I use greylisting (see > http://www.dcc-servers.net/dcc/greylist.html ) which can be described > as automated whitelisting by the triple (sender,sender-IP-address, > target). An interesting concept. 
I haven't tried it -- though I have used something a bit close -- imposing 50% packet loss on IP ranges which seem to contain many open relays. I find that legitimate email gets through, while spam is significantly slowed. (Obviously I don't consider this a "solution", just a stop-gap.) > It works well, but only because it is automated and it uses 4yz > soft failures. Many ISPs start sending a single message from one > IP address and switch to another after a few minutes--lather and > repeat for up to half a dozen different IP addresses for a single > message. Certainly an ill-suited tactic. (I can't help thinking we'd do well to write up an info RFC about stupid SMTP tricks...) > It would be hopeless to try to manually whitelist the IP addresses > used by customers of such ISPs. Not really, the way I'm thinking about it. Any combination not already evaluated goes to a spam-evaluator (whether automatic or manual): if evaluated as spam, an error is returned, if evaluated as valid, the combination is whitel
Re: Hashing spam
It just strikes me as highly unlikely that a WG would ever change course because of what would look like random comments from outsiders -- it's not consistent with the dynamics of a WG, or with human nature. and that just might be one of our biggest problems, in a nutshell.
Re: Adding [ietf] considered harmful
On Thu, 18 Dec 2003 13:19:29 EST, Mark Allman said: > Um, my library slaps a helpful identification tag on the spine of every > book to help me find it. Your analogy, man ... A quick sampling of 15 books from our local public library shows that: a) All 15 have spine tags for on the shelves and barcodes for check in/out. b) The exact location of neither tag is standardized - the height of the spine tag is variable and attempts to not obstruct the author/title originally printed. The barcode is *usually* placed on the back in such a way as to avoid obstructing text, but on 2 books is on the *front* because less information got overlaid that way. Obviously, the library is telling us to try to not munge existing information by sticking stuff in the Subject: line. :)