A number of mailing lists I subscribe to generate large numbers of
duplicate posts which TB! doesn't detect as duplicates.

On examining the "non-duplicate duplicates", it seems that the problem
is related to Yahoo! (and others) appending footers to the email thus:

------------------------ Yahoo! Groups Sponsor ---------------------~-->
Get your FREE credit report with a FREE CreditCheck
Monitoring Service trial
http://us.click.yahoo.com/Gi0tnD/bQ8CAA/ySSFAA/3mLolB/TM
---------------------------------------------------------------------~->

The problem is that the part after '...yahoo.com/' is some sort of
hash, semi-randomly generated _for each email_, presumably for
tracking purposes. A simple comparison therefore sees two different
messages, even though the _body_ of the email is identical.

A fix would be for TB!'s duplicate check to ignore all URLs in
emails when comparing them.
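
To illustrate the idea, here is a rough Python sketch. It is not how
TB! actually detects duplicates (I don't know its algorithm); it only
assumes the check boils down to comparing some fingerprint of the
message text. The names URL_RE, body_fingerprint and is_duplicate are
made up for the example, and the second tracking token is invented.

```python
import hashlib
import re

# Assumption: a "URL" is anything starting with http://, https:// or
# mailto: and running up to the next whitespace character.
URL_RE = re.compile(r"(?:https?://|mailto:)\S+", re.IGNORECASE)


def body_fingerprint(body: str) -> str:
    """Hash the body with every URL replaced by a fixed placeholder."""
    without_urls = URL_RE.sub("<URL>", body)
    # Collapse runs of whitespace so re-wrapped footers still match.
    normalized = " ".join(without_urls.split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def is_duplicate(body_a: str, body_b: str) -> bool:
    """Treat two messages as duplicates if they differ only in URLs."""
    return body_fingerprint(body_a) == body_fingerprint(body_b)


# Two copies of the same post; only the tracking token differs.
# The second token is invented for illustration.
FOOTER = (
    "Get your FREE credit report with a FREE CreditCheck\n"
    "Monitoring Service trial\n"
    "http://us.click.yahoo.com/{token}\n"
)
copy_1 = "Hello list\n" + FOOTER.format(token="Gi0tnD/bQ8CAA/ySSFAA/3mLolB/TM")
copy_2 = "Hello list\n" + FOOTER.format(token="Zz9abC/bQ8CAA/ySSFAA/3mLolB/TM")

print(copy_1 == copy_2)              # plain comparison: False
print(is_duplicate(copy_1, copy_2))  # URL-blind comparison: True
```

The obvious trade-off is that two posts differing only in a link they
quote would also be treated as duplicates, so the sketch would want a
further check (sender, subject) before deleting anything.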

(Because several mailing list providers seem to have a lot of server
problems, producing duplicate posts in error, an improved check would
probably remove 15-20 per cent of my mailbox!)

Alastair


-- 
_________________________________________________________
Archives   : http://tbbeta.thebat.dutaint.com
Moderators : mailto:[EMAIL PROTECTED]
Unsubscribe: mailto:[EMAIL PROTECTED]
Latest Beta: 1.54 beta/10
Wish List  : http://wish.thebat.dutaint.com
