On Tue, Jan 23, 2018, at 02:55, Zulma Pape wrote:
> In other words, can we integrate the Cloud AutoML into our server's
> spam filter and make it behave the same way Gmail behave ?

In short, not without a *lot* of work.
Gmail implements a lot more complexity, and they have a lot more data than you do. One example is that they track user interaction with email: which messages a user deletes without reading, which messages are opened and for how long, whether links are clicked, whether replies are generated, and so on. They also have a very wide view of email around the world, and are therefore very likely to spot new botnets, changes in spammer techniques, and changes in legitimate mail far faster than almost anyone else.

Bayesian is good, per-user Bayesian is better, but Gmail can build Bayesian databases without the user's help, simply based on each user's activity combined with generalized multi-user filters. They can also use this type of learning to split out mailing lists, receipts, advertising, scams and other categories in a general sense, and then apply some logic to determine whether this particular user is likely to be receptive to each class of message.

You could reproduce all of this to the extent your data allows, but you would need a relatively massive dataset and the ability to collect a lot of detail about your users' activity to really make it work. On the other hand, you can make unilateral decisions under the "my server, my rules" policy to customize and tweak your own filters in a way that Google cannot.
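
To make the "Bayesian databases without the user's help" point a bit more concrete, here is a minimal sketch in Python. It is purely illustrative and not how Gmail actually works: the activity fields (`opened`, `deleted`, `replied`, `clicked_link`), the labeling rules in `implicit_label`, and the toy `NaiveBayes` class are all assumptions of mine. The idea is only that behaviour such as "deleted without reading" can stand in for an explicit "this is spam" click when training a per-user filter.

```python
import math
import re
from collections import Counter


def implicit_label(event):
    """Guess a training label from user behaviour (hypothetical rules).

    Returns "ham", "spam", or None when the behaviour is ambiguous.
    """
    if event["replied"] or event["clicked_link"]:
        return "ham"
    if event["deleted"] and not event["opened"]:
        return "spam"
    return None


def tokenize(text):
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Toy two-class naive Bayes with add-one smoothing."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = Counter()

    def train(self, text, label):
        self.words[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total = sum(self.messages.values())
        scores = {}
        for label in ("spam", "ham"):
            # Smoothed log prior plus smoothed log likelihood of each token.
            score = math.log((self.messages[label] + 1) / (total + 2))
            denom = sum(self.words[label].values()) + vocab + 1
            for tok in tokenize(text):
                score += math.log((self.words[label][tok] + 1) / denom)
            scores[label] = score
        # Turn the two log scores into P(spam); clamp to avoid overflow.
        diff = max(-700.0, min(700.0, scores["ham"] - scores["spam"]))
        return 1.0 / (1.0 + math.exp(diff))


# Made-up activity log for a single user (assumed field names).
events = [
    {"body": "cheap pills buy now", "opened": False, "deleted": True,
     "replied": False, "clicked_link": False},
    {"body": "win money now cheap offer", "opened": False, "deleted": True,
     "replied": False, "clicked_link": False},
    {"body": "meeting notes for the project", "opened": True, "deleted": False,
     "replied": True, "clicked_link": False},
    {"body": "lunch tomorrow about the project", "opened": True, "deleted": False,
     "replied": True, "clicked_link": False},
    {"body": "newsletter weekly digest", "opened": True, "deleted": True,
     "replied": False, "clicked_link": False},  # ambiguous, so skipped
]

model = NaiveBayes()
for event in events:
    label = implicit_label(event)
    if label is not None:
        model.train(event["body"], label)
```

With that in place, `model.spam_probability("cheap pills offer")` comes out above 0.5 and `model.spam_probability("project meeting tomorrow")` below it. With a handful of messages that means very little, which is really the point: the technique is simple, but the volume and breadth of activity data is what makes it work.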