So Mike Hearn wrote a fantastic post on antispam for email [1]. The system he describes performs "feature extraction" from both metadata (e.g. sender) and contents (e.g. links). It then scores messages based on the reputation of these features, and uses feedback from recipients and the scoring engine to adjust reputations.

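To make that loop concrete, here is a rough Python sketch of how I picture it. This is only an illustration, not the system Mike describes: the feature names, the smoothed spam ratio used as a "reputation", and the averaging in score() are all my own assumptions, and it models recipient feedback only, not feedback from the scoring engine.

```python
import re
from collections import defaultdict

# Hypothetical pattern: captures the host part of http(s) links in a body.
URL_RE = re.compile(r"https?://([^/\s]+)")


def extract_features(sender, body):
    """Extract features from metadata (sender) and contents (links)."""
    features = {"sender:" + sender.lower()}
    if "@" in sender:
        features.add("sender_domain:" + sender.rsplit("@", 1)[1].lower())
    for host in URL_RE.findall(body):
        features.add("link_host:" + host.lower())
    return features


class ReputationStore:
    """Tracks per-feature spam/ham reports (an assumed, simplified model)."""

    def __init__(self):
        # feature -> [spam_reports, ham_reports]
        self.counts = defaultdict(lambda: [0, 0])

    def reputation(self, feature):
        # Smoothed spam ratio: an unseen feature gets a neutral 0.5.
        spam, ham = self.counts.get(feature, (0, 0))
        return (spam + 1) / (spam + ham + 2)

    def score(self, features):
        # Assumed combination rule: average the feature reputations.
        reps = [self.reputation(f) for f in features]
        return sum(reps) / len(reps)

    def feedback(self, features, is_spam):
        # A recipient report adjusts the reputation of every feature seen.
        for f in features:
            self.counts[f][0 if is_spam else 1] += 1


store = ReputationStore()
reported = extract_features("a@x.example", "Buy now: http://evil.example/buy")
for _ in range(3):
    store.feedback(reported, is_spam=True)

# A different sender reusing the same link inherits its bad reputation.
suspect = extract_features("b@y.example", "See http://evil.example/deal")
clean = extract_features("c@z.example", "Lunch tomorrow?")
print(store.score(suspect) > store.score(clean))
```

The point of the sketch is that the link_host features only exist if the scorer can read the message body, which is exactly what end-to-end encryption would take away.
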
Mike argues that feature extraction from message contents is necessary for antispam in email. But there was discussion of communications systems with different properties, e.g.

"Apple iMessage, Wickr and BBM Protected can all be described as opportunistic encryption messaging systems that have been very successful deployment-wise."
 - Joe Bonneau, [2]

"When you have central control everything becomes a million times easier because you can change anything at any time. You can terminate accounts and control signups."
 - Mike, [1]

---

So it looks like some systems are "spam-vulnerable" to the point that message contents must be scanned, and the potential for widespread "Opportunistic Encryption" [3] is limited. But other systems seem "spam-resistant" enough that content scanning isn't necessary.

I think the main theories about this difference are:

1) Size of target population: Email has a huge userbase, and email addresses are widely shared, so spammers are able to harvest huge target lists.

2) Cost per communication: Sending a single email is very cheap, compared to (say) postal mail.

3) Ability to attribute and penalize the sending service provider: While a receiving mail server can attribute the sending IP and assign it a reputation, IPs are fairly cheap and only loosely associated with sending service providers. Attributing a sending domain is more difficult [4].

4) Ability to attribute and penalize the sending user: Free email accounts and easy signup make it hard to impose a cost on abusive users.

5) Centralized control: If a single system registers all users and handles all feedback, then I guess it's easier to calculate user reputations and penalize users. But I'm unclear how much this matters, or what other benefits come from centralization?

One question is: which factors could we change to create email-like systems with spam-resistance and opportunistic encryption?

I would take 1, 2, and 5 off the table. We want systems that can be widely used (1).
Pay-per-message seems complicated, and would add tremendous overhead and change the UX of email (2). And we don't want a single party in control (5).

So that leaves strengthening the reputation system, and basing it on more costly "identities".

This is imaginable at the provider level. Instead of letting any IP send email, providers could form a federation where each sender has to sign up to certain obligations, post a bond, etc. Of course, this would be more "clubby" and less open than email currently - perhaps more like the relationships between telephony providers in the PSTN, or network operators in the Internet?

If the reputation system also needs strengthening at the user level, then users would have to spend or commit some costly resource to get an account. Captchas, anti-automation [5], and proof-of-work probably only go so far. If users have to commit something more expensive, this could in theory be done in a privacy-preserving way (Tor! digital cash!), but most users would find it easier to register with something linked to them (phone number, credit card). So then there are privacy implications...

But anyways, it would be great if we could ground this in facts. If anyone had insight into large-scale communications systems where spam and abuse are controlled *without* content scanning, it would be interesting to hear how that works, and what the important factors are.

Trevor

[1] https://moderncrypto.org/mail-archive/messaging/2014/000780.html
[2] https://moderncrypto.org/mail-archive/messaging/2014/000824.html
[3] https://moderncrypto.org/mail-archive/messaging/2014/000767.html
[4] http://ceas.cc/2006/19.pdf
[5] http://webcache.googleusercontent.com/search?q=cache:v6Iza2JzJCwJ:www.hackforums.net/archive/index.php/thread-2198360.html+&cd=8&hl=en&ct=clnk&gl=ch
