On 7/15/2017 2:13 PM, David B Funk wrote:
> How quickly do stale entries get removed from it?

I randomly sorted this list, then visited 10 randomly selected links. I know that isn't a very large sample size, but it is a strong indicator, since the links were chosen purely at random. 9 of the 10 links had already been taken down. So might there be a lot of stale data in that list?
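
To put a rough number on how much a 10-link sample can say, here is a minimal sketch that computes a 95% Wilson score interval for the stale fraction. The function name is my own, and the choice of the Wilson interval is an assumption about what is reasonable for a small sample, not something taken from the list itself.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# 9 of the 10 sampled links were already taken down.
low, high = wilson_interval(9, 10)
print(f"stale fraction is plausibly between {low:.2f} and {high:.2f}")
```

With 9 of 10, the interval runs from roughly 0.60 to 0.98, so even the low end suggests that most of the list is stale, though the sample is too small to pin the figure down.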

I also extracted the host names, deleted duplicates, randomly sorted those, then ran checks of 500 randomly selected host names against SURBL, URIBL, DBL, and ivmURI. The number of hits on all 4 lists was shockingly low. But I think that probably has more to do with stale data on this URL list (and this is really a URL list, not a URI list) than with any lack of effectiveness of these other domain/URI blacklists.
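
For anyone who wants to repeat that kind of check, the steps above can be sketched roughly as follows. This is not the script I used, just an illustration under assumptions: the zone names are the commonly published public ones and should be confirmed with each list, ivmURI is left out because its zone is not something I will guess at here, and all function names are made up for the example.

```python
import random
import socket
from urllib.parse import urlsplit

# Assumed public query zones; confirm each against the list's own documentation.
ZONES = ["multi.surbl.org", "multi.uribl.com", "dbl.spamhaus.org"]

def unique_hosts(urls):
    """Extract host names from URLs and drop duplicates."""
    hosts = set()
    for url in urls:
        host = urlsplit(url.strip()).hostname  # None when there is no scheme/host
        if host:
            hosts.add(host.lower())
    return sorted(hosts)

def sample_hosts(hosts, k=500, seed=None):
    """Pick up to k host names uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(hosts, min(k, len(hosts)))

def is_listed(host, zone, resolve=socket.gethostbyname):
    """Treat any DNS answer for host.zone as a hit.

    Simplification: a real check should inspect the returned 127.0.0.x
    code, since some lists use special codes for refused queries, and
    some lists expect the registered domain rather than the full host name.
    """
    try:
        resolve(f"{host}.{zone}")
        return True
    except OSError:  # includes socket.gaierror (no such record)
        return False

def hit_counts(hosts, zones=ZONES, resolve=socket.gethostbyname):
    """Count how many of the given hosts are listed in each zone."""
    return {zone: sum(is_listed(h, zone, resolve) for h in hosts) for zone in zones}
```

Typical use would be `hit_counts(sample_hosts(unique_hosts(lines)))`, where `lines` is the URL list read from a file. Keep in mind that high-volume queries through a public resolver may be refused by the list operators.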

Still, there can be situations where a URI list won't list such a host name because of too much collateral damage, yet a URL list that specifically lists the entire URL can still be effective.

Because such a URL list would be LESS efficient (due to being rules-based), it would be preferable for it to have much less stale data, and perhaps to focus on the stuff that isn't found on any (or very many) of the 4 major URI lists I mentioned, so as to keep the data small and focused for maximum processing efficiency.

--
Rob McEwen
http://www.invaluement.com
