On Sun, Jul 10, 2011 at 04:57:38PM +0200, Frederik Schwarzer wrote:
> If I search for e.g. "filmas" in all lists on
> http://lists.debian.org/search.html
> the result contains about 90% already removed spam messages.
>
> The results 2,3 and 4 for example are:
> http://lists.debian.org/debian-68k/2011/01/msg00036.html
> http://lists.debian.org/debian-alpha/2011/01/msg00006.html
> http://lists.debian.org/debian-amd64/2011/01/msg00013.html
>
> Is the search index outdated or are removed spam messages
> recognised as normal messages?

Thanks for reporting this, and sorry that it's not been addressed for
so long.

The main issue is that we don't currently have a process to purge
removed spam from the index. I have been working on that, but it's not
ready to deploy yet.

However, we've just rebuilt the search index from scratch after
upgrading the server to stretch, which means there are now nearly one
million fewer already-removed spam messages in the index. I just
checked your example search, and it currently returns valid links (many
are spam, but spam that still exists in the archives).

Cheers,
    Olly