Jonathan Feally wrote:

Now why would a user delete mail from the spam folder if the script is going to do it for you?

Because there might be false positives. Of course, a script could still delete mail in the spam folders that is older than, say, a week.
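A minimal sketch of such a cleanup script, assuming the spam folder is reachable over IMAP and is called "spam" (both are assumptions for illustration, as are the function names). It only uses standard imaplib calls (select, search, store, expunge) on a connection that the caller opens:

```python
import datetime

# Month names are spelled out so the IMAP date does not depend on the locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day):
    """Format a date the way IMAP SEARCH expects it, e.g. 03-Jan-2024."""
    return "%02d-%s-%04d" % (day.day, _MONTHS[day.month - 1], day.year)


def purge_old_spam(conn, folder="spam", max_age_days=7, today=None):
    """Delete messages in `folder` that arrived more than max_age_days ago.

    `conn` is an already authenticated imaplib.IMAP4 (or IMAP4_SSL) object.
    Returns the number of messages that were flagged and expunged.
    """
    today = today or datetime.date.today()
    cutoff = today - datetime.timedelta(days=max_age_days)
    conn.select(folder)
    # BEFORE matches messages whose internal date is earlier than the cutoff.
    status, data = conn.search(None, "BEFORE", imap_date(cutoff))
    ids = data[0].split() if status == "OK" and data and data[0] else []
    for msg_id in ids:
        conn.store(msg_id, "+FLAGS", "\\Deleted")
    if ids:
        conn.expunge()
    return len(ids)
```

Run from cron once a day, this would leave users a full week to rescue false positives instead of a few minutes.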

My entire setup requires very little training of users. They move spam into the spam folder. If it is a message that was marked as possible spam, they copy it into the notspam folder. That's it. They continue with their work. If they look in the spam folder, they will see at most 29 minutes of spam that has been put in the folder; otherwise it's gone.

Won't they miss false positives, if they don't check the spam folder every 25 minutes 24h a day? I'm not sure I understand you here.

Now consider this. Let's say you wake up in the morning and move 100 messages all at once into the spam folder. A trigger is going to launch your external script/application 100 times, and probably at a rate at which your mail server takes on load just from the application being started, without it actually doing anything.

Getting 100 spam messages with none of them categorized as spam is only possible if no training has been done. This would not be a normal situation.

Do you really want 100 concurrent instances of dspamc running? Or would it make more sense to keep the mail server happy and run the dspamc instances serially, keeping system resources free for the database server or dbmail-imapd?

First, dspamc is a client that connects to the dspam daemon, which does the real work. And even so, dbmail could serialise the execution of dspamc, although it would then have to fetch the whole messages right after the move and keep them in memory. Otherwise the user might delete them before it's their turn. Using a temporary folder is also possible.
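To illustrate the serialised variant, here is a rough sketch of a single-worker queue. It is not how dbmail works; the class name, the handler interface and the dspamc command line shown are assumptions and would have to be adapted to the local dspam setup. The message is copied at the time of the move, so it survives even if the user deletes it before its turn comes:

```python
import queue
import subprocess
import threading


def run_dspamc(user, raw_message):
    """Feed one message to dspamc on stdin.

    The arguments below are an assumed invocation, not a verified one.
    """
    subprocess.run(
        ["dspamc", "--user", user, "--class=spam", "--source=error"],
        input=raw_message,
        check=False,
    )


class SerialTrainer:
    """Runs the training handler for one message at a time."""

    def __init__(self, handler=run_dspamc):
        self._handler = handler
        self._jobs = queue.Queue()
        # A single worker thread means at most one handler runs at any moment.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, user, raw_message):
        # Keep a private copy of the message taken right after the move.
        self._jobs.put((user, bytes(raw_message)))

    def _run(self):
        while True:
            user, raw = self._jobs.get()
            try:
                self._handler(user, raw)
            except Exception:
                # One failed training run must not stop the queue.
                pass
            finally:
                self._jobs.task_done()

    def wait(self):
        """Block until every queued message has been processed."""
        self._jobs.join()
```

Moving 100 messages at once would then result in 100 queued jobs, processed one after another, at the price of holding the copies in memory until they are done.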

This is just for things like spam training. Imagine if somebody sets up a script to do some other inspection of the email. Perhaps a simple thing such as a support mailbox that receives emails in a certain format and wants to mark a particular item as responded to in another database. Then the midnight tech support guy responds to 30 emails all at once because his DVD just finished, and then moves them off in bulk. This script is then going to read the message, extract the ticket #, open a connection to the database and update the record. Now all of a sudden you have 30 queries hitting your db at the same time. Not a big deal, but for scalability this isn't going to work well.

I never said that this can't be done without applying limits.

Not to mention that each trigger would have to be defined somewhere and know what script/application to execute. For a spam thing to work with each user having his own spam folder, that's one trigger per user. This creates a lot of overhead for the mail administrator when he wants to set it up and change it.

The spam folder is defined in dspam. The name of the folder is the same for all users. If I change that, dbmail automatically creates the new folder once a spam mail is delivered there. So it's still one trigger, not one per user.
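As a sketch of what "one trigger" could mean: a table keyed by event and folder name rather than by user. Nothing like this exists in dbmail as far as this thread goes; the table, the event names and the action names are made up for illustration:

```python
# Hypothetical trigger table: (event, folder name) -> action name.
# The folder name is shared by all users, so one entry covers everyone.
TRIGGERS = {
    ("moved_in", "spam"): "train_spam",
    ("moved_in", "notspam"): "train_innocent",
}


def triggers_for_move(src_folder, dst_folder, table=TRIGGERS):
    """Return the actions to run when a message moves between two folders.

    Both the source folder (moved out) and the destination folder (moved in)
    are checked, which is the extra lookup cost paid on every move.
    """
    fired = []
    moved_out = table.get(("moved_out", src_folder))
    if moved_out is not None:
        fired.append(moved_out)
    moved_in = table.get(("moved_in", dst_folder))
    if moved_in is not None:
        fired.append(moved_in)
    return fired
```

With such a table, a move from INBOX to spam yields the single action train_spam for any user, and a move between two folders without triggers costs two dictionary lookups and runs nothing.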

This is also going to give dbmail-imapd more work to do on every move: checking whether the message was in a folder that has a trigger that runs when a message is moved out, and checking whether the destination folder has a trigger for a message moved in.

True.

Triggers may work well for an installation where you have a small number of users, but on a large scale in a corporate environment, programs running on the machine where dbmail-imapd sits are going to cause problems.

Yes, of course. But it's an option, and it would work great on smaller installations. Nobody is forcing anyone to use it or compile it in.

In case of spam training, it might even work OK for bigger installations, because users will train a lot only at the beginning. Dspam learns quickly and this feature won't be used frequently. For a really big userbase this is a no go, that's understandable.

Thanks for the feedback, I appreciate it,
                       Alex
