On 04/30/2014 11:52 AM, Henrik Nordström wrote:
Unless it has been fixed, the UFS-based stores also have an implicit
limit on cached entries of somewhat less than 4KB (the whole meta header
needs to fit in the first 4KB). Entries failing this get cached but can
never get hit.
Then StoreID helps a bit with that.
Now I understand why some URLs with a "?" in them sometimes do not cache well :P

> DNS defines X.Y.Z segments as being no longer than 255 bytes *each*.
For Internet host names the limits are 63 octets per label, and 255
octets in total including dot delimiters.

This is indeed what I have read in the RFC, and it makes the regex for a domain simpler to define. From what I have seen, 2-3KB is the upper limit of the request sizes actually in use, and I assume that reflects the data sizes currently seen on the network. Every once in a while data sizes go up, and URL sizes should follow, since URLs will carry the output of hash algorithms with bigger digests.
It started with the smaller ones and then moved up: crc16, crc32, mdX, md5, sha1, sha512, etc.
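To illustrate how the per-label and total-length limits keep the domain regex simple, here is a minimal Python sketch. The names LABEL_RE and is_valid_hostname are made up for this example, and the letters/digits/hyphen rule is the usual Internet host name convention rather than something stated in this thread. The 255 octets apply to the wire format, which leaves at most 253 characters in dotted text form.

```python
import re

# One label: 1-63 characters of letters, digits or hyphen,
# not starting or ending with a hyphen (assumed host name rules).
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

# 255 octets on the wire correspond to 253 characters of dotted text.
MAX_NAME_LENGTH = 253


def is_valid_hostname(name: str) -> bool:
    """Check a host name against the per-label and total-length limits."""
    # Accept the fully qualified form with a single trailing dot.
    if name.endswith("."):
        name = name[:-1]
    if len(name) > MAX_NAME_LENGTH:
        return False
    # An empty name yields one empty label, which fails the regex.
    return all(LABEL_RE.fullmatch(label) for label in name.split("."))
```

Checking each label separately keeps the pattern short; a single regex for the whole name would need to encode the total length limit as well.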

So for now a URL blacklist field should be at least 4KB in size, but I think doubling from 4KB to 8KB is not such a big jump. The main issue I was thinking about is the choice between one DB field type with a size of X and another one that supports indexes.

For now I have used MySQL TEXT, which doesn't have indexes, but only the first query takes more than 0.00 ms.
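One possible way around the choice between a large field and an indexed one is to index a fixed-size digest of the URL and keep the full URL in an unindexed TEXT column. The sketch below is only an illustration of that idea and not what I am running: it uses Python's sqlite3 module instead of MySQL so that it is self-contained, and the table name, column names and helper functions are all hypothetical.

```python
import hashlib
import sqlite3


def url_key(url: str) -> bytes:
    """Fixed-size (32 byte) lookup key, whatever the URL length."""
    return hashlib.sha256(url.encode("utf-8")).digest()


def open_blacklist(path: str = ":memory:") -> sqlite3.Connection:
    """Open the DB and create the hypothetical blacklist table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS blacklist ("
        " url_hash BLOB PRIMARY KEY,"  # indexed, fixed size
        " url TEXT NOT NULL)"          # full URL, not indexed
    )
    return db


def add_url(db: sqlite3.Connection, url: str) -> None:
    """Insert a URL; a URL that is already present is left untouched."""
    db.execute(
        "INSERT OR IGNORE INTO blacklist (url_hash, url) VALUES (?, ?)",
        (url_key(url), url),
    )


def is_blacklisted(db: sqlite3.Connection, url: str) -> bool:
    """Look up by digest, then compare the stored URL to rule out collisions."""
    row = db.execute(
        "SELECT url FROM blacklist WHERE url_hash = ?", (url_key(url),)
    ).fetchone()
    return row is not None and row[0] == url
```

With this layout the index size depends only on the number of entries and not on whether the URL limit is 4KB or 8KB. It only covers exact URL matches; pattern-based filtering would still need something else.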

I have tried a couple of key-value DBs and other DBs, but it seems that all of them have some step which is the slowest, and after that they run fast.

I have compared MySQL to key-value DBs, and the main difference is the on-disk size of the DB, which is important if there is a plan to filter many, many specific URLs and not only by patterns.

Amos (or anyone else): since you patched SquidGuard, maybe you have a basic understanding of its lookup algorithm? I started reading the SquidGuard code, but then all of a sudden I lost my way in it and it got a bit too complicated for me to understand how they do it. (Hints are welcome.)

Thanks,
Eliezer

Regards
Henrik
