Please, folks, don't take my comments as aggressive, even though they may 
sometimes come across as cynical. My top priority is to help curb the flood 
of spam somehow, not to accuse people of not doing enough or of not doing 
the right thing.

On 05.02.21 at 04:23, Brandon Long via mailop wrote:
> If you received say... a million ab...@gmail.com emails a day, how would 
> you handle that?
>
When you have the technology to handle several billion searches per day, I 
would suggest you have some experts in house who could come up with ideas :-)
> Now, automating abuse@ requests is more likely at that scale... trying to 
> find common issues and problems, even across
> days, or maybe learning certain reporters as more useful than others... and 
> the reported issues from that are still a
> drop in the bucket compared to the known spammy accounts issues you get from 
> other sources.
>
> Which isn't to say it's not useful, it is, it does find this weird low level 
> of abuse that tends to be continuous but
> otherwise below the major campaigns you're already catching and working on... 
> on the other hand, ignoring it for too
> long and you can get a large amount of ground level abuse noise that is made 
> up of individually small actors.  Or
> maybe you're completely missing some new type of spam which is evading your 
> other feedback mechanisms.

Automation is of course the key at that scale. It is absolutely clear that 
abuse reports for that many users can't be handled manually, even with a 
quite generously staffed abuse desk.

And automation isn't easy. I've dabbled a bit with TensorFlow to identify 
likely spam sources, and getting the false-positive rate low enough while 
still adapting to changing spammer patterns proved to be beyond my reach as a 
spare-time mail admin.
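Just to illustrate the shape of the problem: a toy token-based classifier 
(naive Bayes here, nothing like my actual TensorFlow attempt, and with 
made-up training data) fits in a few lines -

```python
# Toy naive Bayes spam scorer -- an illustrative sketch only. The hard part
# is not writing this; it is keeping false positives low while spammers
# rotate their vocabulary faster than you retrain.
import math
from collections import Counter

def train(messages):
    """messages: iterable of (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        for token in text.lower().split():
            counts[is_spam][token] += 1
            totals[is_spam] += 1
    return counts, totals

def spam_score(text, counts, totals):
    """Log-odds that the text is spam, with add-one smoothing.
    Positive means 'more spam-like than ham-like'."""
    vocab = set(counts[True]) | set(counts[False])
    score = 0.0
    for token in text.lower().split():
        p_spam = (counts[True][token] + 1) / (totals[True] + len(vocab))
        p_ham = (counts[False][token] + 1) / (totals[False] + len(vocab))
        score += math.log(p_spam / p_ham)
    return score
```

The model itself is trivial; the operational question - how wrong it is 
allowed to be before you disable someone's account - is the part that 
defeated me.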

But we're not talking about some arbitrary spare-time mailbox provider; we're 
talking about a company that spends millions (actually, billions: 
https://www.quora.com/Which-big-tech-company-spends-the-most-R-D-on-AI-machine-learning)
on AI algorithms that enable their users to find images of kittens on the 
internet, or that select ads showing me offers for the bass guitar I bought 
some years ago just because I recently watched a YouTube review of that bass 
(true story).

There's a lot that can be achieved with automated categorization of spam 
reports along several dimensions:

  * Reliability: Reliable spam reporters vs. people who hit the Junk button 
    accidentally.
  * Own resources used to spam: Sender and Reply-To: mailbox addresses, IP 
    addresses, message IDs, cloud facilities, etc. can be extracted from mail 
    headers and (with somewhat less accuracy) mail bodies.
  * Content: 4-1-9 scams, link-shortener URLs stuffed with spammer prose, 
    promises of financial or sexual performance enhancements, possible 
    malware, etc.
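To make the "own resources" dimension concrete, here is a rough Python 
sketch. The IP regex is simplified, and a real extractor would of course 
validate the Received chain instead of trusting every hop:

```python
# Sketch: pull spammer-controlled resources out of a raw RFC 5322 message.
# Illustrative only -- no decoding of encoded words, no Received-chain
# validation, no body parsing.
import re
from email import message_from_string

IP_RE = re.compile(r'\b(?:\d{1,3}\.){3}\d{1,3}\b')

def extract_resources(raw_message):
    msg = message_from_string(raw_message)
    resources = {
        "from": msg.get("From", ""),
        "reply_to": msg.get("Reply-To", ""),
        "message_id": msg.get("Message-ID", ""),
        # IPs from the Received trace (only the hops added by your own
        # servers are trustworthy).
        "ips": [],
    }
    for received in msg.get_all("Received", []):
        resources["ips"].extend(IP_RE.findall(received))
    return resources
```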

Combined, these could sort reports into a few hundred "bins" labeled with 
report reliability, actionability, and severity. Acting on the top one or two 
dozen actionable bins each day would go a long way.
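The binning step itself could be as simple as this sketch (the dimension 
labels and the ranking heuristic are made up for illustration):

```python
# Sketch: sort classified reports into labeled bins and pick the ones worth
# acting on each day. Reports are dicts with hypothetical 'reliability',
# 'category' and numeric 'severity' fields.
from collections import defaultdict

def bin_reports(reports):
    bins = defaultdict(list)
    for r in reports:
        key = (r["reliability"], r["category"], r["severity"])
        bins[key].append(r)
    return bins

def top_actionable(bins, n=24):
    # Ignore unreliable reports entirely; rank the rest by severity, then
    # by report volume -- a crude stand-in for "actionability".
    ranked = sorted(
        (k for k in bins if k[0] != "unreliable"),
        key=lambda k: (k[2], len(bins[k])),
        reverse=True,
    )
    return ranked[:n]
```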

  * If you ignore the huge "nonsense" bin you might miss a few real abuse 
reports but save a lot of work.
  * The 4-1-9 bin is probably best handled automatically by disabling the 
    sender and reply-to addresses reported by reliable sources. If possible, 
    a distinction between accounts created to spam and compromised accounts, 
    based on account creation and usage data, should be made here to 
    determine the best action.
  * Accounts sending malware and URL-shortener stuff are almost always 
    compromised. You need a policy for handling such accounts (restrict their 
    sending ability until they are cleaned up), but if you have such a policy 
    and a mechanism to mark accounts as compromised, that should be doable 
    with a few clicks.
  * Abuse performed using cloud services such as the Google Forms facility 
    probably requires policy changes, which are beyond the scope of the abuse 
    desk but should be triggered by it.
  * Bulk-mail customers who generate significant amounts of complaints need 
    to be scrutinized to determine whether you can make them change their 
    behavior or whether termination is warranted. This is most likely the 
    most time-consuming activity, and it can't be automated, but 
    decision-making in this area can be supported by good data.
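The per-bin actions above could hang off a simple dispatch table; the 
category names and actions here are placeholders for whatever 
account-management API the provider actually has:

```python
# Sketch: map bin categories to handling actions. All category and action
# names are hypothetical labels, not a real provider API.
def handle_bin(category, accounts):
    if category == "419":
        # Disable scam sender/reply-to mailboxes reported by reliable sources.
        return [("disable", a) for a in accounts]
    if category in ("malware", "url_shortener"):
        # Almost always compromised: restrict sending until cleaned up.
        return [("restrict_sending", a) for a in accounts]
    if category == "cloud_service_abuse":
        # Needs a policy change, so escalate rather than act directly.
        return [("escalate_policy", a) for a in accounts]
    if category == "bulk_complaints":
        # Human decision: educate the customer or terminate.
        return [("manual_review", a) for a in accounts]
    return []  # the "nonsense" bin and anything unrecognized: ignore
```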

That's probably something a single abuse-desk worker, or a small team with 
appropriate tools and the authority to disable accounts, should be able to 
manage.

Of course these are just brain farts, not a well-thought-out plan. Maybe you 
already do something comparable to this; I don't know. But some externally 
visible signs make me question it:

  * Reporting channel: Apparently e-mailed abuse reports are no longer 
    accepted (that's at least what Spamcop says; I have since stopped 
    reporting spam to Google via SC). I wasn't able to find a spam reporting 
    option where I could post a complete e-mail header. The whole story is 
    too long to fit into this bullet point. But it looks like there's no way 
    for me to get reliable spam data into the abuse system, and without such 
    data I don't see how any of this could work.
  * Ongoing spam from reported sources: As Jesper pointed out earlier in this 
    thread, spam from specific sources goes on for months even when properly 
    reported. I've had the same experience with the Google-Forms-based spam 
    (trix.bounces.google.com), which is still ongoing.


Cheers,

Hans-Martin
_______________________________________________
mailop mailing list
mailop@mailop.org
https://list.mailop.org/listinfo/mailop
