Hi, I've done some benchmarking, here are the results so far: The proposal I'm suggesting for dstdomain acl is at lp:~kinkie/squid/flexitrie . It uses the level-compact trie approach I've described in this thread (NOT a Patricia trie). As a comparison, lp:~kinkie/squid/domaindata-benchmark implements the same benchmark using the current splay-based implementation.
I've implemented a quick-n-dirty benchmarking tool (src/acl/testDomainDataPerf); it takes as input an acl definition - one dstdomain per line, as if it was included in squid.conf, and a hostname list file (one destination hostname per line). I've run both variants of the code against the same dataset: a 4k entries domain list, containing both hostnames and domain names, and a 18M entries list of destination hostnames, both matching and not matching entries in the domain list (about 7.5M hits, 10.5M misses). Tested 10 times on a Core 2 PC with plenty of RAM - source datasets are in the fs cache. level-compact-trie: the mean time is 11 sec; all runs take between 10.782 and 11.354 secs; 18 Mb of core used full-trie: mean is 7.5 secs +- 0.2secs; 85 Mb of core. splay-based: mean time is 16.3sec; all runs take between 16.193 and 16.427 secs; 14 Mb of core I expect compact-trie to scale better as the number of entries in the list grows and with the number of clients and requests per second; furthermore using it removes 50-100 LOC, and makes code more readable. IMO it is the best compromise in terms of performance, resources useage and expected scalability; before pursuing this further however I'd like to have some feedback. Thanks On Wed, Jun 4, 2014 at 4:11 PM, Alex Rousskov <[email protected]> wrote: > On 06/04/2014 02:06 AM, Kinkie wrote: > >> there are use cases >> for using a Trie (e.g. prefix matching for dstdomain ACL); these may >> be served by other data strcutures, but none we could come up with on >> the spot. > > std::map and StringIdentifier are the ones we came up with on the spot. > Both may require some adjustments to address the use case you are after. > > Alex. > > >> A standard trie uses quite a lot of RAM for those use cases. >> There are quite a lot of areas where we can improve so this one is not >> urgent. Still I'd like to explore it as it's synchronous code (thus >> easier for me to follow) and it's a nice area to tinker with. >> >> On Tue, Jun 3, 2014 at 10:12 PM, Alex Rousskov >> <[email protected]> wrote: >>> On 06/03/2014 08:40 AM, Kinkie wrote: >>>> Hi all, >>>> as an experiment and to encourage some discussion I prepared an >>>> alternate implementation of TrieNode which uses a std::map instead of >>>> an array to store a node's children. >>>> >>>> The expected result is a worst case performance degradation on insert >>>> and delete from O(N) to O(N log R) where N is the length of the >>>> c-string being looked up, and R is the size of the alphabet (as R = >>>> 256, we're talking about 8x worse). >>>> >>>> The expected benefit is a noticeable reduction in terms of memory use, >>>> especially for sparse key-spaces; it'd be useful e.g. in some lookup >>>> cases. >>>> >>>> Comments? >>> >>> >>> To evaluate these optimizations, we need to know the targeted use cases. >>> Amos mentioned ESI as the primary Trie user. I am not familiar with ESI >>> specifics (and would be surprised to learn you want to optimize ESI!), >>> but would recommend investigating a different approach if your goal is >>> to optimize search/identification of strings from a known-in-advance set. >>> >>> >>> Cheers, >>> >>> Alex. >>> >> >> >> > -- Francesco
