Hi Robert:

OK. But what if I store the file suffixes in a hash table?
I can hash the URL's file suffix and check whether it's
in the table.
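
A minimal sketch of the hash-table idea, not code from the project under discussion. The suffix list and the names `BLOCKED_SUFFIXES`, `url_suffix` and `suffix_in_table` are made up for illustration, and a Python set stands in for the hash table:

```python
import os.path
from urllib.parse import urlsplit

# Hypothetical suffix list, for illustration only.
BLOCKED_SUFFIXES = {".exe", ".mp3", ".avi", ".zip"}

def url_suffix(url):
    """Return the lowercased file suffix of the URL path, or '' if there is none."""
    path = urlsplit(url).path              # drops the query string and fragment
    return os.path.splitext(path)[1].lower()

def suffix_in_table(url):
    # One hash of the suffix plus a set lookup, however many suffixes are stored.
    return url_suffix(url) in BLOCKED_SUFFIXES
```

The cost here is dominated by extracting and hashing the suffix, not by the size of the table.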

regards
Lucas Brasilino

for sure. right now you have a linear scan, comparing every suffix against every url.

if you assume a uniform distribution, then for urls that fail you will
have compared all the extensions,
and for urls that match you will have compared 50% of the extensions on
average

checking a single extension requires checking (say) 4 characters.
if you have 20 extensions, that's 80 char checks for a url that fails, or
about 40 on average for one that matches (plus the memory hit to
get the list)
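
To make the counting concrete, here is a rough sketch of the linear scan that reports how many extensions were compared. The function name `linear_scan` and the 20 made-up suffixes are assumptions for illustration, and it counts whole-extension comparisons, not individual character checks:

```python
def linear_scan(url, suffixes):
    """Return (matched, number of suffixes compared before stopping)."""
    compared = 0
    for s in suffixes:
        compared += 1
        if url.endswith(s):
            return True, compared
    return False, compared

# 20 made-up 4-character suffixes: ".e00" through ".e19"
suffixes = [".e%02d" % i for i in range(20)]

# A url that fails is compared against all 20 suffixes.
miss = linear_scan("http://example.com/index.html", suffixes)

# Urls that match, one per suffix, stop at positions 1..20.
hits = [linear_scan("http://example.com/file" + s, suffixes)[1] for s in suffixes]
average_hit = sum(hits) / len(hits)
```

With matches spread evenly over the list, `average_hit` comes out at roughly half the list length, which is the 50% figure above.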

the same 20 extensions, with a good regex compiler, should boil down to a
single check for '.' at offset -4, then a single check at each following
offset against the set of distinct letters possible there, etc. basically
a perfect tree rather than a linear scan.
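
The tree idea can be sketched by hand as a small trie over the last 4 characters of the url. This is only an illustration of the shape of the lookup, not what any particular regex compiler actually emits. The names `build_trie` and `trie_match` are hypothetical, and it assumes every suffix is exactly 4 characters (a dot plus three letters):

```python
def build_trie(suffixes):
    """Build a nested-dict trie; the key None marks the end of a suffix."""
    root = {}
    for s in suffixes:
        node = root
        for ch in s:
            node = node.setdefault(ch, {})
        node[None] = True
    return root

def trie_match(url, trie, width=4):
    """Walk the last `width` characters of url through the trie.

    Returns (matched, number of characters examined).
    """
    node = trie
    examined = 0
    for ch in url[-width:]:
        examined += 1
        node = node.get(ch)
        if node is None:          # no stored suffix continues with this character
            return False, examined
    return None in node, examined

trie = build_trie([".exe", ".mp3", ".mpg", ".avi"])
```

A url never costs more than 4 character checks here, no matter how many suffixes are stored, and most failing urls are rejected at the first character because it is not '.'.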
