Re: Question about tcp hash function tcp_hashfn()
On Thu, Jun 01, 2006 at 12:40:10PM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: So what are your thoughts about my sequence number approach (for connected sockets)? Depending on how you are going to use it. Generic socket code does not have TCP sequence numbers since it must work with all supported protocols. Netchannels also do not know about internals of the packet by design, since all protocol processing is performed at the end peer. Sequence number can be wrapped in minutes in current networks and even faster tomorrow, that is why PAWS was created. Your idea about reinserting the socket does not scale in 1Gbit environment, and definitely will not in 10Gbit. Probably it is possible to create second hash table for TCP sockets only and use that table first in protocol handler, but it requires some research to prove the idea. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Fri, Jun 02, 2006 at 07:40:38AM +0200, Florian Weimer ([EMAIL PROTECTED]) wrote: * Evgeniy Polyakov: That is wrong. And I have a code and picture to show that, and you dont - prove me wrong :) Here we go: static inline num2ip(__u8 a1, __u8 a2, __u8 a3, __u8 a4) { __u32 a = 0; a |= a1; a 8; a |= a2; a 8; a |= a3; a 8; a |= a4; return a; } gcc -Wall was pretty illuminating. 8-P After fixing this and switching to a better PRNG, I get something which looks pretty normal. :) thats true, but to be 100% honest I used different code to test for hash artifacts... That code was created to show that it is possible to _have_ artifacts, but not specially to _find_ them. But it still does not fix artifacts with for example const IP and random ports or const IP and linear port selection. Values must be specially tuned to be used with Jenkins hash, for example linear port with const IP produce following hash buckets: 100 24397 200 12112 300 3952 400 975 500 178 600 40 700 3 800 1 i.e. one 800-entries bucket (!) while xor one always have only 100 of them (for 100*hash_size number of iterations). So, your prove does not valid :) -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, I agree, even with constant source IP, the hash still should have performed better (but didn't). Constant source IP and varying port is a realistic data set for a port proxy. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
* Evgeniy Polyakov: :) thats true, but to be 100% honest I used different code to test for hash artifacts... Ah, okay. But it still does not fix artifacts with for example const IP and random ports or const IP and linear port selection. I see them now. Hmm. Is there a theoretical explanation for them? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Wed, May 31, 2006 at 11:41:27AM -0700, David Miller ([EMAIL PROTECTED]) wrote: Worse: he folded the jenkins algorith result with h ^= h 16; h ^= h 8; Destroying the coverage of the function. It was done to simulate socket code which uses the same folding. Leaving 32bit space is just wrong, consider hash table size with that index. You absolutely show not do this shifting on the jenkins hash result, you destroy the distribution entirely. Just mask it with the hash mask and that's all you need to do. Brian is right, this is absolutely critical to using the Jenkins hash correctly. You're unmixing the bits it worked so hard to mix. That is wrong. And I have a code and picture to show that, and you dont - prove me wrong :) Attached code and picture which shows _exactly_ the same distribution for folded and not folded Jenkins hash distribution, and it's artifact compared to XOR hash. Fairly distributed values being XORed produce still fairly distributed value. -- Evgeniy Polyakov hash_comparison.png Description: PNG image #include stdlib.h #include stdio.h #include errno.h typedef unsigned int u32; typedef unsigned short u16; typedef unsigned char u8; typedef unsigned int __u32; typedef unsigned short __u16; typedef unsigned char __u8; #include jhash.h struct hash_entry { unsigned long counter; }; static inline num2ip(__u8 a1, __u8 a2, __u8 a3, __u8 a4) { __u32 a = 0; a |= a1; a 8; a |= a2; a 8; a |= a3; a 8; a |= a4; return a; } static inline __u8 get_random_byte(void) { return 1 + (int) (255.0 * (rand() / (RAND_MAX + 1.0))); } static inline __u16 get_random_word(void) { return 1 + (int) (65536.0 * (rand() / (RAND_MAX + 1.0))); } unsigned int hash_addr(__u32 faddr, __u16 fport, __u32 laddr, __u16 lport) { unsigned int h = (laddr ^ lport) ^ (faddr ^ fport); h ^= h 16; h ^= h 8; return h; } unsigned int hash_addr1(__u32 faddr, __u16 fport, __u32 laddr, __u16 lport) { u32 ports; unsigned int h; ports = lport; ports = 16; ports |= fport; h = jhash_2words(faddr, laddr, ports); //h ^= h 16; //h ^= h 8; return h; } int main() { struct hash_entry *table; __u32 saddr, daddr; __u16 sport, dport; unsigned int hash, i, *results; unsigned int hash_size = 65536, iter_num = 100; table = malloc(hash_size * sizeof(struct hash_entry)); if (!table) return -ENOMEM; results = malloc(hash_size * sizeof(unsigned int)); if (!results) return -ENOMEM; for (i=0; ihash_size; ++i) { results[i] = 0; table[i].counter = 0; } srand(time(NULL)); daddr = num2ip(192, 168, 0, 1); dport = htons(80); for (i=0; ihash_size*iter_num; ++i) { saddr = num2ip(get_random_byte(), get_random_byte(), get_random_byte(), get_random_byte()); sport = get_random_word(); hash = hash_addr(saddr, sport, daddr, dport); hash = (hash_size - 1); table[hash].counter++; } for (i=0; ihash_size; ++i) results[table[i].counter]++; for (i=0; ihash_size; ++i) printf(%u %u\n, i, results[i]); //printf(%u %u\n, i, table[i].counter); }
Re: Question about tcp hash function tcp_hashfn()
On Wed, May 31, 2006 at 12:29:55PM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Evgeniy, On Wed, 31 May 2006, Evgeniy Polyakov wrote: Worse: he folded the jenkins algorith result with h ^= h 16; h ^= h 8; Destroying the coverage of the function. It was done to simulate socket code which uses the same folding. Leaving 32bit space is just wrong, consider hash table size with that index. Btw, that probably requires some clarification. Since hash table size is definitely less than returned hash value, so higher bits are removed, for that case above folding is done both in XOR hash and my test case. It is possible to just remove higher bits, but fairly ditributed parts being xored produce still fairly distributed value. h ^= h 16; h ^= h 8; This does not remove high order bits in either function. Your comparison results are simply invalid with these two lines in place. It is hash function, but not function which creates index inside hash table. Index is created by removing high order bits with ( 0x). I've present the new simple code and test results which show that folded and not folded Jenkins hashes _do_ produce _exactly_ the same distribution. I think I've already said that fairly distributed values being xored produce still fairly distributed value, so parts of 32bit fairly distributed hash after being xored with each other still produce fairly distributed 32bit space. --brian -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Thu, 1 Jun 2006 10:12:36 +0400 I've present the new simple code and test results which show that folded and not folded Jenkins hashes _do_ produce _exactly_ the same distribution. Ok I believe you now :) I think I've already said that fairly distributed values being xored produce still fairly distributed value, so parts of 32bit fairly distributed hash after being xored with each other still produce fairly distributed 32bit space. It would make a good research paper for someone mathmatically inclined enough :) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, On Thu, 01 Jun 2006, Evgeniy Polyakov wrote: for (i=0; ihash_size*iter_num; ++i) { saddr = num2ip(get_random_byte(), get_random_byte(), get_random_byte(), get_random_byte()); sport = get_random_word(); You still have a problem: you cannot use a pseudo-random number generator to generate the sample set as the pseudo-random number generator function itself can interact with the hash. Try iterating through all 2**48 values or at least a sizeable representative subset. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
David, On Wed, 31 May 2006, David Miller wrote: Ok I believe you now :) I'll believe it if he interates through a subset and gets the same results instead of using a pseudo-random number generator. I thought you said you were considering jenkins_3word(), not jenkins_2word()? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
From: Brian F. G. Bidulock [EMAIL PROTECTED] Date: Thu, 1 Jun 2006 00:22:21 -0600 I thought you said you were considering jenkins_3word(), not jenkins_2word()? We could xor some of the inputs in order to use jenkins_2word(). - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Thu, Jun 01, 2006 at 12:18:25AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Evgeniy, On Thu, 01 Jun 2006, Evgeniy Polyakov wrote: for (i=0; ihash_size*iter_num; ++i) { saddr = num2ip(get_random_byte(), get_random_byte(), get_random_byte(), get_random_byte()); sport = get_random_word(); You still have a problem: you cannot use a pseudo-random number generator to generate the sample set as the pseudo-random number generator function itself can interact with the hash. Try iterating through all 2**48 values or at least a sizeable representative subset. Since pseudo-randomness affects both folded and not folded hash distribution, it can not end up in different results. You are right that having test with 2^48 values is really interesting, but it will take ages on my test machine :) -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, On Thu, 01 Jun 2006, Evgeniy Polyakov wrote: Since pseudo-randomness affects both folded and not folded hash distribution, it can not end up in different results. Yes it would, so to rule out pseudo-random effects the pseudo- random number generator must be removed. You are right that having test with 2^48 values is really interesting, but it will take ages on my test machine :) Try a usable subset; no pseudo-random number generator. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Thu, Jun 01, 2006 at 12:46:08AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Since pseudo-randomness affects both folded and not folded hash distribution, it can not end up in different results. Yes it would, so to rule out pseudo-random effects the pseudo- random number generator must be removed. You are right that having test with 2^48 values is really interesting, but it will take ages on my test machine :) Try a usable subset; no pseudo-random number generator. I've run it for 2^30 - the same result: folded and not folded Jenkins hash behave the same and still both results produce exactly the same artifacts compared to XOR hash. Btw, XOR hash, as completely stateless, can be used to show how Linux pseudo-random generator works for given subset - it's average of distribution is very good. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, On Thu, 01 Jun 2006, Evgeniy Polyakov wrote: On Thu, Jun 01, 2006 at 12:46:08AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Since pseudo-randomness affects both folded and not folded hash distribution, it can not end up in different results. Yes it would, so to rule out pseudo-random effects the pseudo- random number generator must be removed. You are right that having test with 2^48 values is really interesting, but it will take ages on my test machine :) Try a usable subset; no pseudo-random number generator. I've run it for 2^30 - the same result: folded and not folded Jenkins hash behave the same and still both results produce exactly the same artifacts compared to XOR hash. But not without the pseudo-random number generation... ? Btw, XOR hash, as completely stateless, can be used to show how Linux pseudo-random generator works for given subset - it's average of distribution is very good. But its distribution might auto-correlate with the Jenkins function. The only way to be sure is to remove the pseudo-random number generator. Just try incrementing from, say, 10.0.0.0:1 up, resetting port number to 1 at 16000, and just incrementing the IP address when the port number wraps, instead of pseudo-random, through 2^30 loops for both. If the same artifacts emerge, I give in. Can you show the same artifacts for jenkins_3word? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Thu, Jun 01, 2006 at 01:11:25AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Evgeniy, On Thu, 01 Jun 2006, Evgeniy Polyakov wrote: On Thu, Jun 01, 2006 at 12:46:08AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Since pseudo-randomness affects both folded and not folded hash distribution, it can not end up in different results. Yes it would, so to rule out pseudo-random effects the pseudo- random number generator must be removed. You are right that having test with 2^48 values is really interesting, but it will take ages on my test machine :) Try a usable subset; no pseudo-random number generator. I've run it for 2^30 - the same result: folded and not folded Jenkins hash behave the same and still both results produce exactly the same artifacts compared to XOR hash. But not without the pseudo-random number generation... ? How can I obtain (2^30)*6 bytes of truly random bytes? Btw, XOR hash, as completely stateless, can be used to show how Linux pseudo-random generator works for given subset - it's average of distribution is very good. But its distribution might auto-correlate with the Jenkins function. The only way to be sure is to remove the pseudo-random number generator. Just try incrementing from, say, 10.0.0.0:1 up, resetting port number to 1 at 16000, and just incrementing the IP address when the port number wraps, instead of pseudo-random, through 2^30 loops for both. If the same artifacts emerge, I give in. I've run it with following source ip/port selection algo: if (++sport == 0) { saddr++; sport++; } Starting IP was 1.1.1.1 and sport was 1. Destination IP and port are the same 192.168.0.1:80 Jenkins hash started to show different behaviour: it does not have previous artefacts, but instead it's dispersion is _much_ wider than in XOR case. With following ip/port selection algo: if (++sport == 0) { //saddr++; sport += 123; } I see yet another jenkins artefacts, but again different from previous two. But each time both folded and not folded hashes behave exactly the same. Can you show the same artifacts for jenkins_3word? What should be used as starting point there? If I use 0 it is the same as jhash_2words(). If I use 123123 - artefacts are the same, just slighly shifted (I tested only the latest test above though). Looking into the code we can see that jhash_2words() is jhash_3words() with zero C value, so it will show the same nature. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, On Thu, 01 Jun 2006, Evgeniy Polyakov wrote: For purely random numbers you could amplify thermal noise off an open transitor junction (the audiofile's white noise generator) and feed it into an analog to digital converter. I've run it with following source ip/port selection algo: if (++sport == 0) { saddr++; sport++; } Starting IP was 1.1.1.1 and sport was 1. Destination IP and port are the same 192.168.0.1:80 Jenkins hash started to show different behaviour: it does not have previous artefacts, but instead it's dispersion is _much_ wider than in XOR case. Aha! But perhaps this is too easy a data set. HTTP clients typically dynamically allocate port numbers within a range and source address are typically not less than a certain value. That is why I suggested something like: sport = 1; saddr = 0x0a00; /* 10.0.0.0 */ ... if (++sport == 16000) { sport = 1; saddr++; } If this shows artifacts worse than XOR then more realistic gaps in the input values will cause artifacts. With following ip/port selection algo: if (++sport == 0) { //saddr++; sport += 123; } I see yet another jenkins artefacts, but again different from previous two. Adding primes. Again, the arithmetic series of primes might auto-correlate with the Jenkins function. Or it plain might not like gaps. But each time both folded and not folded hashes behave exactly the same. Can you show the same artifacts for jenkins_3word? What should be used as starting point there? If I use 0 it is the same as jhash_2words(). If I use 123123 - artefacts are the same, just slighly shifted (I tested only the latest test above though). Looking into the code we can see that jhash_2words() is jhash_3words() with zero C value, so it will show the same nature. Skip that then. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Thu, Jun 01, 2006 at 04:24:57AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: For purely random numbers you could amplify thermal noise off an open transitor junction (the audiofile's white noise generator) and feed it into an analog to digital converter. It is also possible to look into window and count how many times sun hides and shines... It is quite cloudy in Moscow though. As Acrypto asynchronous crypto layer author I can use different hardware crypto accelerators, but it is a topic for another discussion. I've run it with following source ip/port selection algo: if (++sport == 0) { saddr++; sport++; } Starting IP was 1.1.1.1 and sport was 1. Destination IP and port are the same 192.168.0.1:80 Jenkins hash started to show different behaviour: it does not have previous artefacts, but instead it's dispersion is _much_ wider than in XOR case. Aha! But perhaps this is too easy a data set. HTTP clients typically dynamically allocate port numbers within a range and source address are typically not less than a certain value. That is why I suggested something like: sport = 1; saddr = 0x0a00; /* 10.0.0.0 */ ... if (++sport == 16000) { sport = 1; saddr++; } If this shows artifacts worse than XOR then more realistic gaps in the input values will cause artifacts. Specially for you :) It does not have artifacts, but it's dispersion is wider than XOR one. _Much_ wider, which tends to creation of some specially crafted source distribution which ends up in totally broken fairness. As usual folded and not folded versions behave exactly the same. With following ip/port selection algo: if (++sport == 0) { //saddr++; sport += 123; } I see yet another jenkins artefacts, but again different from previous two. Adding primes. Again, the arithmetic series of primes might auto-correlate with the Jenkins function. Or it plain might not like gaps. I want to confirm three things and one state: 1. Jenkins hash has some unacceptible artefacts in some source address/port distributions, no matter if it has some law embedded or it is (pseudo)-random set. If there are bugs, bugs exist. 2. If it does not have artifacts it has unacceptible dispersion. 3. It is 3 times slower than XOR one (28 seconds for XOR for 2^29 iterations vs. 101 seconds jhash nonfolded and 109 jhash folded on my AMD64 3500+ 2.2 Ghz desktop). 4. I believe it can be tuned or has some gaps inside refactoring logic, which can be fixed, but as is it can not be used for fair hash creation. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, On Thu, 01 Jun 2006, Evgeniy Polyakov wrote: I think the sun shines more in Moscow than in Edmonton, so it is not so random. ;) Specially for you :) Thank you for being so gracious and patient with me. It does not have artifacts, but it's dispersion is wider than XOR one. _Much_ wider, which tends to creation of some specially crafted source distribution which ends up in totally broken fairness. As usual folded and not folded versions behave exactly the same. With following ip/port selection algo: if (++sport == 0) { //saddr++; sport += 123; } I see yet another jenkins artefacts, but again different from previous two. Adding primes. Again, the arithmetic series of primes might auto-correlate with the Jenkins function. Or it plain might not like gaps. I want to confirm three things and one state: 1. Jenkins hash has some unacceptible artefacts in some source address/port distributions, no matter if it has some law embedded or it is (pseudo)-random set. If there are bugs, bugs exist. True, artifacts appeared even in the basic arithmetic sequence of primes. It is quite possible that a large set of natural sequences might cause artifacts. 2. If it does not have artifacts it has unacceptible dispersion. This is likely due to the relatively small sample sets; however, real experienced data sets would be very small compared to the widest possible set and might also contain structured gaps. 3. It is 3 times slower than XOR one (28 seconds for XOR for 2^29 iterations vs. 101 seconds jhash nonfolded and 109 jhash folded on my AMD64 3500+ 2.2 Ghz desktop). Yes, it is slower by inspection. 4. I believe it can be tuned or has some gaps inside refactoring logic, which can be fixed, but as is it can not be used for fair hash creation. Yes, I now agree. And, for the purpose of dynamic hash sizing, high dispersion is worse than artifacts. For some realistic TCP data sets it appears that XOR is superior. Thank you again for your efforts in resolving my doubts. So what are your thoughts about my sequence number approach (for connected sockets)? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
From: Brian F. G. Bidulock [EMAIL PROTECTED] Date: Thu, 1 Jun 2006 12:40:10 -0600 I think the sun shines more in Moscow than in Edmonton, so it is not so random. ;) Go Oilers :) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
From: Brian F. G. Bidulock [EMAIL PROTECTED] Date: Tue, 30 May 2006 23:55:26 -0600 For example, it goes to great pains to permute upper order bits in the local address, which for most connections will be a constant value. Consider an apache server hosting thousands of virtual hosts. The local address will be different for every such host. Please drop linux-kernel in any future replies, this discussion belongs on netdev not linux-kernel. Thanks. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
David, On Wed, 31 May 2006, David Miller wrote: From: Brian F. G. Bidulock [EMAIL PROTECTED] Date: Tue, 30 May 2006 23:55:26 -0600 For example, it goes to great pains to permute upper order bits in the local address, which for most connections will be a constant value. Consider an apache server hosting thousands of virtual hosts. The local address will be different for every such host. If you mean named virtual hosts, no. They have the same addresses. If you mean actual hosts (with an IP address), perhaps in the low order bits (host number), but unlikely in the high order bits of the local address (network mask bits). Also, in such a case the local port number will be rather constant (80, etc); a condition also not exploited by the function. Also consider that the function simply folds the values rather than permuting bits across the key field by shifting by some other value than a multiple of 8 between XOR operations. This will result in a longer collision list because the entropy of the key value has not been sufficiently reduced. It might sound like I'm complaining, but I'm not. The function works for me. But from a purist point of view, the hash function is not as efficient as it could be and there is room for improvement. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
From: Brian F. G. Bidulock [EMAIL PROTECTED] Date: Wed, 31 May 2006 01:45:40 -0600 It might sound like I'm complaining, but I'm not. The function works for me. But from a purist point of view, the hash function is not as efficient as it could be and there is room for improvement. For sure and there are plans afoot to move over to dynamic table sizing and the Jenkins hash function. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
David, On Wed, 31 May 2006, David Miller wrote: For sure and there are plans afoot to move over to dynamic table sizing and the Jenkins hash function. Yes, that could be far more efficient. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
David, On Wed, 31 May 2006, David Miller wrote: For sure and there are plans afoot to move over to dynamic table sizing and the Jenkins hash function. Just a suggestion, but I have an approach that stands to be faster than Jenkins deriving from the verification tag approach that I took for SCTP (OpenSS7 SCTP not lksctp). TCP uses a cryptographic hash function to select its initial sequence number (SCTP does the same for vtag). Last I checked it was taken from an MD4 swirled entropy pool and further combined with the local and remote IP addresses and port numbers. Each received segment contains a sequence number that is offset from the initial sequence number but is expected to appear within the current window. Most of the high order bits of an in-window sequence number are predicatable at any point in time and, due to cryptographic strength, are more efficient than Jenkins, ... and they are right there in the received packet. The approach would take the high order bits of the received sequence number and use them to index a separate sequence number keyed established hash (which could be dynamic). As the window changes, the socket would have to be removed and reinserted into this hash, but the repositioning would be infrequent. Out of window segments would fail to find a socket, but could fall back to the current established hash, or even the bind hash. A layer of caching could increase the hash lookup speed further for noisy senders. This would be faster than a Jenkins hash approach because it would not be necessary to calculate the hash function at all for in-window segments. Per packet overheads would decrease and better small packet performance would be experienced (i.e. your http server). It has better hash coverage because MD4 and other cryptographic algorithms used for initial sequence number selection have far better properties than Jenkins. What do you think? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
From: Brian F. G. Bidulock [EMAIL PROTECTED] Date: Wed, 31 May 2006 02:49:54 -0600 This would be faster than a Jenkins hash approach because it would not be necessary to calculate the hash function at all for in-window segments. Per packet overheads would decrease and better small packet performance would be experienced (i.e. your http server). It has better hash coverage because MD4 and other cryptographic algorithms used for initial sequence number selection have far better properties than Jenkins. What do you think? I don't know how practical this is. The 4GB sequence space wraps very fast on 10 gigabit, so we'd be rehashing a bit and 100 gigabit would make things worse whenever that shows up. It is, however, definitely an interesting idea. We also need the pure traditional hashes for net channels. I don't see how we could use your scheme for net channels, because we are just hashing in the interrupt handler of the network device driver in order to get a queue to tack the packet onto, we're not interpreting the sequence numbers and thus would not able to maintain the sequence space based hashing state. On a 3ghz cpu, the jenkins hash is essentially free. Even on slower cpus, jhash_2words for example is just 20 cycles on a sparc64 chip. It's ~40 integer instructions and they all pair up perfectly to dual issue. We'd probably use jhash_3words() for TCP ipv4 which would get us into the 30 cycle range. A few years ago when I introduced jenkins into the tree, I thought it's execution cost might be an issue. I really don't anymore. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Wed, May 31, 2006 at 02:00:09AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: For sure and there are plans afoot to move over to dynamic table sizing and the Jenkins hash function. Yes, that could be far more efficient. In theory practice and theory are the same, but in practice they are different. Current simple XOR hash used in socket selection code is just bloody good! Jenkins hash unfortunately has _significant_ artefacts which were found in netchannel [1] hash selection analysis [2]. And Jenkins hash is far too slower. 1. Netchannel. http://tservice.net.ru/~s0mbre/old/?section=projectsitem=netchannel 2. Compared Jenkins hash with XOR hash used in TCP socket selection code. http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14 -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Wed, 31 May 2006 13:03:02 +0400 Current simple XOR hash used in socket selection code is just bloody good! Jenkins hash unfortunately has _significant_ artefacts which were found in netchannel [1] hash selection analysis [2]. And Jenkins hash is far too slower. Yes, it wins from a simplicity standpoint and it performs quite well. It was tuned from real socket data sets, but from systems running many many years ago :) FreeBSD even adopted this hash into their PCB hashing code at one point. I think it will need to be changed nevertheless. Even though this hash works on established sockets, it is attackable just like the routing hash used to be. If an attacker has sufficient resources, he can make hash chains in the TCP established hash table very long. As the years pass, it becomes easier and easier for one to have enough computing power at their disposal to carry out this kind of attack. So something like Jenkins with a random hash input become more and more critical. And this requires some kind of way to rehash, RCU table locking opens the door for that. Current locking scheme is too tightly bound for that kind of flexibility. I wish Ben L. would resubmit the TCP hash locking stuff he said he'd work on. :) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
David, On Wed, 31 May 2006, David Miller wrote: I don't know how practical this is. The 4GB sequence space wraps very fast on 10 gigabit, so we'd be rehashing a bit and 100 gigabit would make things worse whenever that shows up. It works better for SCTP, because the vtags are constant. No rehashing is required there. But, also consider that rehashing is only required for senders sending at a high data rate. (http clients will likely never have to be rehashed.) These packets will typically be large and per-packet overheads will be overwhelmed by per-byte overheads. Also, the rehashing is orderly and simple, the entry is simply bumped to the next sequential hash slot and the socket hash structure can already be cached at the time the action is performed. Rehashing, although a bother, would take little time, and could simply be added as part of TCP's existing window calculations. It is, however, definitely an interesting idea. We also need the pure traditional hashes for net channels. I don't see how we could use your scheme for net channels, because we are just hashing in the interrupt handler of the network device driver in order to get a queue to tack the packet onto, we're not interpreting the sequence numbers and thus would not able to maintain the sequence space based hashing state. Under SCTP I still have the traditional established hash for lookups of out of the blue packets and packets containing invalid verification tags. Really long lookups would invite DoS attacks on these. On a 3ghz cpu, the jenkins hash is essentially free. Even on slower cpus, jhash_2words for example is just 20 cycles on a sparc64 chip. It's ~40 integer instructions and they all pair up perfectly to dual issue. We'd probably use jhash_3words() for TCP ipv4 which would get us into the 30 cycle range. But you could throw away all 30 cycles, plus the stacking and unstacking of registers to get in and out of the algorithm. Some architectures might benefit more. Well, I thought you might find it interesting. Perhaps somebody reading this will experiment with it. For SCTP it is one of a number of techniques that allows OpenSS7 SCTP to drastically outperform lksctp. --brian - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Wed, May 31, 2006 at 02:12:43AM -0700, David Miller ([EMAIL PROTECTED]) wrote: I think it will need to be changed nevertheless. Even though this hash works on established sockets, it is attackable just like the routing hash used to be. If an attacker has sufficient resources, he can make hash chains in the TCP established hash table very long. As the years pass, it becomes easier and easier for one to have enough computing power at their disposal to carry out this kind of attack. Jenkins hash is very complex compared to tcp XOR one, and even simple test showed it's bottlenecks in some setups, so it's tuning will be quite challenging both from mathematical and engineering points of view. And socket code actually differs from routing cache, since the former is limited to 64k connections in a time, while routing cache can grow to unpredictibe sizes. So something like Jenkins with a random hash input become more and more critical. And this requires some kind of way to rehash, RCU table locking opens the door for that. Current locking scheme is too tightly bound for that kind of flexibility. I wish Ben L. would resubmit the TCP hash locking stuff he said he'd work on. :) It was good approach indeed. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, On Wed, 31 May 2006, Evgeniy Polyakov wrote: 2. Compared Jenkins hash with XOR hash used in TCP socket selection code. http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14 Two problems with the comparison: Port numbers can be collected into a 32 bit register in network byte order directly from the TCP packet without taking two 16 bit values and shifting and or'ing them. Worse: he folded the jenkins algorith result with h ^= h 16; h ^= h 8; Destroying the coverage of the function. I, for one, am not suprised that artifacts appeared in the comparison as a result of this destruction of the coverage of the hashing function. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, On Wed, 31 May 2006, Evgeniy Polyakov wrote: 1. Netchannel. http://tservice.net.ru/~s0mbre/old/?section=projectsitem=netchannel This one refers to the erroneous result below. 2. Compared Jenkins hash with XOR hash used in TCP socket selection code. http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14 - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Wed, May 31, 2006 at 03:51:24AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Evgeniy, On Wed, 31 May 2006, Evgeniy Polyakov wrote: 2. Compared Jenkins hash with XOR hash used in TCP socket selection code. http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14 Two problems with the comparison: Port numbers can be collected into a 32 bit register in network byte order directly from the TCP packet without taking two 16 bit values and shifting and or'ing them. They are. u32 ports; ports = lport; ports = 16; ports |= fport; Worse: he folded the jenkins algorith result with h ^= h 16; h ^= h 8; Destroying the coverage of the function. It was done to simulate socket code which uses the same folding. Leaving 32bit space is just wrong, consider hash table size with that index. I, for one, am not suprised that artifacts appeared in the comparison as a result of this destruction of the coverage of the hashing function. It is comparison of the approach used in TCP hashing code, it is not full mathematical analysis. And in that case jenkins hash already not good. I'm sure it can be tuned, but it does require a lot of iterations, while XOR one just works. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Wed, May 31, 2006 at 02:58:18PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote: On Wed, May 31, 2006 at 03:51:24AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Evgeniy, On Wed, 31 May 2006, Evgeniy Polyakov wrote: 2. Compared Jenkins hash with XOR hash used in TCP socket selection code. http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14 Two problems with the comparison: Port numbers can be collected into a 32 bit register in network byte order directly from the TCP packet without taking two 16 bit values and shifting and or'ing them. They are. u32 ports; ports = lport; ports = 16; ports |= fport; Using network or host byte order does not affect hash distribution, that shifting was coded to simulate other types of mixing ports, which actually never showed different results. Worse: he folded the jenkins algorith result with h ^= h 16; h ^= h 8; Destroying the coverage of the function. It was done to simulate socket code which uses the same folding. Leaving 32bit space is just wrong, consider hash table size with that index. I, for one, am not suprised that artifacts appeared in the comparison as a result of this destruction of the coverage of the hashing function. It is comparison of the approach used in TCP hashing code, it is not full mathematical analysis. And in that case jenkins hash already not good. I'm sure it can be tuned, but it does require a lot of iterations, while XOR one just works. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
On Wed, May 31, 2006 at 03:04:59PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote: On Wed, May 31, 2006 at 02:58:18PM +0400, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote: On Wed, May 31, 2006 at 03:51:24AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Evgeniy, On Wed, 31 May 2006, Evgeniy Polyakov wrote: 2. Compared Jenkins hash with XOR hash used in TCP socket selection code. http://tservice.net.ru/~s0mbre/blog/2006/05/14#2006_05_14 Two problems with the comparison: Port numbers can be collected into a 32 bit register in network byte order directly from the TCP packet without taking two 16 bit values and shifting and or'ing them. They are. u32 ports; ports = lport; ports = 16; ports |= fport; Using network or host byte order does not affect hash distribution, that shifting was coded to simulate other types of mixing ports, which actually never showed different results. Worse: he folded the jenkins algorith result with h ^= h 16; h ^= h 8; Destroying the coverage of the function. It was done to simulate socket code which uses the same folding. Leaving 32bit space is just wrong, consider hash table size with that index. Btw, that probably requires some clarification. Since hash table size is definitely less than returned hash value, so higher bits are removed, for that case above folding is done both in XOR hash and my test case. It is possible to just remove higher bits, but fairly ditributed parts being xored produce still fairly distributed value. -- Evgeniy Polyakov - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
From: Evgeniy Polyakov [EMAIL PROTECTED] Date: Wed, 31 May 2006 14:58:18 +0400 On Wed, May 31, 2006 at 03:51:24AM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote: Worse: he folded the jenkins algorith result with h ^= h 16; h ^= h 8; Destroying the coverage of the function. It was done to simulate socket code which uses the same folding. Leaving 32bit space is just wrong, consider hash table size with that index. You absolutely show not do this shifting on the jenkins hash result, you destroy the distribution entirely. Just mask it with the hash mask and that's all you need to do. Brian is right, this is absolutely critical to using the Jenkins hash correctly. You're unmixing the bits it worked so hard to mix. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question about tcp hash function tcp_hashfn()
Evgeniy, On Wed, 31 May 2006, Evgeniy Polyakov wrote: Worse: he folded the jenkins algorith result with h ^= h 16; h ^= h 8; Destroying the coverage of the function. It was done to simulate socket code which uses the same folding. Leaving 32bit space is just wrong, consider hash table size with that index. Btw, that probably requires some clarification. Since hash table size is definitely less than returned hash value, so higher bits are removed, for that case above folding is done both in XOR hash and my test case. It is possible to just remove higher bits, but fairly ditributed parts being xored produce still fairly distributed value. h ^= h 16; h ^= h 8; This does not remove high order bits in either function. Your comparison results are simply invalid with these two lines in place. --brian - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html