Re: [Firebird-devel] Lock manager's hash calculations
30.12.2015 14:43, Dmitry Yemanov wrote: > The first one is in lock.cpp and you've found two others, three in total. Ok, then it is done now. Should I prepare a patch for Firebird 3 or leave it in Avalerion and let you adopt it at will? -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
30.12.2015 16:16, Dimitry Sibiryakov wrote: > I have found two: one in HashJoin.cpp and other in lck.cpp. Can't find third > one. Where > it is? The first one is in lock.cpp and you've found two others, three in total. Dmitry -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
29.12.2015 18:16, Dmitry Yemanov wrote: > I didn't speak about Hash.h, but about other three copies that are the same. I have found two: one in HashJoin.cpp and other in lck.cpp. Can't find third one. Where it is? -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 29/12/15 10:23, Dimitry Sibiryakov wrote: > 24.12.2015 18:29, Alex Peshkoff wrote: >> If we have one chain with 12 objects, three empty chains and chains >> with 1, 2 or 3 objects - it's not bad. >> I.e. existing stat-s is not enough for analysis. > >I've got following data from Igor Valchenko: > >> Hash slots: 65521, Hash lengths (min/avg/max): 0/ 1/ 8 >> Hash lengths distribution: >> 0 : 13761 >> 1 : 21925 >> 2 : 1 >> 3 : 8675 >> 4 : 3219 >> 5 : 955 >> 6 : 253 >> 7 : 53 >> 8 : 14 > >What would you say? > We have in 1st place 21925 + 1 + 8675 + 3219 + 955 + 253 + 53 + 14 = 51760 * 0 = 0 We have in 2nd place 1 + 8675 + 3219 + 955 + 253 + 53 + 14 = 29835 * 1 = 29835 In 3rd place 8675 + 3219 + 955 + 253 + 53 + 14 = 13169 * 2 = 26338 In 4th place 3219 + 955 + 253 + 53 + 14 = 4494 * 3 = 13482 In 5th place 955 + 253 + 53 + 14 = 1275 * 4 = 5100 In 6th place 253 + 53 + 14 = 320 * 5 = 1600 In 7th place 53 + 14 = 67 * 6 = 402 In 8th place 14 = 14 * 7 = 98 So we have 100,934 entries. To access each entry once from scratch, we will need to follow 76,855 "next" links. So that gives us a score of 0.76. A perfect score is (100,934 - 65521) / 100,934 ~= 0.35 So to score 0.76 isn't bad. We can't compare that with my Pick score because I would have used double the number of buckets, but if we keep the same number of buckets and items, this figure will give us an idea of how good the hash is. Note however, that if the characteristics of the data being hashed changes, so will the distribution, and hence the score ... this is always tricky territory to get right. Cheers, Wol -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
29.12.2015 17:29, Alex Peshkoff wrote: > Remains to check on something like tpcc does it affect CS performance. If someone has a powerful computer with SSE4.2 support and ready to - fresh builds for Windows are available at http://www.ibphoenix.com/ibpr_devel/ -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
29.12.2015 20:11, Dimitry Sibiryakov wrote: > Yep. Look into Hash.h. It is much different (and much slower). I didn't speak about Hash.h, but about other three copies that are the same. Dmitry -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
29.12.2015 18:08, Dmitry Yemanov wrote: > Really? Yep. Look into Hash.h. It is much different (and much slower). -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
28.12.2015 16:03, Dimitry Sibiryakov wrote: Yes. But in this case the hash is only used internally and not stored anywhere. Because of this, different platforms can use different algorithms for it. >> Of course, but then it should be encapsulated and moved to /common to >> avoid the same #ifdefs used here and there. This algorithm is used twice >> or thrice in our code, IIRC. > > AFAICS, in other places a little different algorithm is used. It uses > different sum > logic and then division in loop instead of single op. Really? The one in jrd/lck.cpp is a complete copy, just buggy (= used instead of +=), it was fixed in trunk. And the one in HashJoin.cpp should also be an exact match, as it was copy-pasted. > BTW, does this algorithm have a name to google for? Jim's Hash Function? Dmitry -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 12/29/2015 07:05 PM, Dimitry Sibiryakov wrote: > 29.12.2015 11:43, Alex Peshkoff wrote: >> That I do not expect to get something better than this bell curve: > > I was unable to create a good load on database with my weak > notebook, but oltp-emul with 5 windows gave me these numbers for lock > manager using CRC32: > >> Hash slots: 8191, Hash lengths (min/avg/max):0/ 0/ 6 >> Hash lengths distribution: >> 0 : 3591 (43%) >> 1 : 2992 (36%) >> 2 : 1194 (14%) >> 3 : 324 (3%) >> 4 : 71 (0%) >> 5 : 14 (0%) >> 6 :5 (0%) > > It looks like the curve falls faster. Yes - that looks better. Remains to check on something like tpcc does it affect CS performance. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
I stumbled upon quite evesome Hash-algorithm xxHash. There are lightning fast implementations. https://github.com/Cyan4973/xxHash Noted that usually the Hash algorithm is not the bottleneck, but still pretty interesting... -Tee- On Tue, Dec 29, 2015 at 6:05 PM, Dimitry Sibiryakov wrote: > 29.12.2015 11:43, Alex Peshkoff wrote: > >> That I do not expect to get something better than this bell curve: >> > > I was unable to create a good load on database with my weak notebook, > but oltp-emul with 5 windows gave me these numbers for lock manager using > CRC32: > > Hash slots: 8191, Hash lengths (min/avg/max):0/ 0/ 6 >> Hash lengths distribution: >> 0 : 3591 (43%) >> 1 : 2992 (36%) >> 2 : 1194 (14%) >> 3 : 324 (3%) >> 4 : 71 (0%) >> 5 : 14 (0%) >> 6 :5 (0%) >> > > It looks like the curve falls faster. > > -- > WBR, SD. > > > -- > > Firebird-Devel mailing list, web interface at > https://lists.sourceforge.net/lists/listinfo/firebird-devel > > -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
29.12.2015 11:43, Alex Peshkoff wrote: That I do not expect to get something better than this bell curve: I was unable to create a good load on database with my weak notebook, but oltp-emul with 5 windows gave me these numbers for lock manager using CRC32: Hash slots: 8191, Hash lengths (min/avg/max):0/ 0/ 6 Hash lengths distribution: 0 : 3591 (43%) 1 : 2992 (36%) 2 : 1194 (14%) 3 : 324 (3%) 4 : 71 (0%) 5 : 14 (0%) 6 :5 (0%) It looks like the curve falls faster. -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 12/29/2015 01:23 PM, Dimitry Sibiryakov wrote: 24.12.2015 18:29, Alex Peshkoff wrote: If we have one chain with 12 objects, three empty chains and chains with 1, 2 or 3 objects - it's not bad. I.e. existing stat-s is not enough for analysis. I've got following data from Igor Valchenko: Hash slots: 65521, Hash lengths (min/avg/max): 0/ 1/ 8 Hash lengths distribution: 0 : 13761 1 : 21925 2 : 1 3 : 8675 4 : 3219 5 : 955 6 : 253 7 : 53 8 : 14 What would you say? That I do not expect to get something better than this bell curve: -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
24.12.2015 18:29, Alex Peshkoff wrote: > If we have one chain with 12 objects, three empty chains and chains > with 1, 2 or 3 objects - it's not bad. > I.e. existing stat-s is not enough for analysis. I've got following data from Igor Valchenko: > Hash slots: 65521, Hash lengths (min/avg/max): 0/ 1/ 8 > Hash lengths distribution: > 0 : 13761 > 1 : 21925 > 2 : 1 > 3 : 8675 > 4 : 3219 > 5 : 955 > 6 : 253 > 7 : 53 > 8 : 14 What would you say? -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
24.12.2015 18:22, Dmitry Yemanov wrote: >> >Yes. But in this case the hash is only used internally and not stored >> >anywhere. Because >> >of this, different platforms can use different algorithms for it. > Of course, but then it should be encapsulated and moved to /common to > avoid the same #ifdefs used here and there. This algorithm is used twice > or thrice in our code, IIRC. AFAICS, in other places a little different algorithm is used. It uses different sum logic and then division in loop instead of single op. BTW, does this algorithm have a name to google for? -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
25.12.2015 14:14, Vlad Khorsun wrote: > crc32" used _mm_crc32_u32, _mm_crc32_u16 and _mm_crc32_u8 when appropriate Cool. I'm surprised that these instructions work even with unaligned data. -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
25.12.2015 13:40, Dimitry Sibiryakov wrote: > 25.12.2015 12:12, Dimitry Sibiryakov wrote: >> You'd better try final attached variant yourself. > > BTW, MSVC2010 32 bits showed the worst result: only 2.5x speed up while > GCC -O3 gave 5x. > Vlad, could you try it with MSVC2013?.. > MSVC13, /02 /Oi, Intel Core i3-4130T, 32 bit - Release build min max original31 234 unrolled3162 unrolled2 1547 crc320 125 crc32" 031 crc32" used _mm_crc32_u32, _mm_crc32_u16 and _mm_crc32_u8 when appropriate Regards, Vlad -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 12/25/2015 02:12 PM, Dimitry Sibiryakov wrote: > 24.12.2015 17:58, Alex Peshkoff wrote: >> BTW, please try it in such form: > > I can't believe my eyes. You'd better try final attached variant > yourself. > Telling true have not noticed great difference. Looks like it's highly HW-dependent. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
25.12.2015 12:12, Dimitry Sibiryakov wrote: > You'd better try final attached variant yourself. BTW, MSVC2010 32 bits showed the worst result: only 2.5x speed up while GCC -O3 gave 5x. Vlad, could you try it with MSVC2013?.. -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 24/12/15 17:22, Dmitry Yemanov wrote: >> I suspect that "true hash" could make distribution of hash values more >> equival >> > and thus improve overall timing for lock table search. > A couple of years ago I played a bit with different hash functions that > are considered good nowadays. Quite surprisingly, none of them provided > consistently better value distribution. > Once you get a decent number of buckets, it's almost impossible to spread stuff evenly ... Talking from experience with Pick (which puts its data in hashed files) its normal to have a few buckets seriously overflowed. You just can't avoid it. Let's say I have 80 objects, with space for 5 per bucket. That gives me 20 buckets. Pretty much any algorithm will probably give me one or two empty buckets, and one bucket with 11 necessitating storing it as three groups of five. That said, we reckon on finding 19 out of 20 records in the very first group we try. If you want a reasonably objective measure, try calculating the average distance an item is from the head of the chain. So let's assume we have 20 items, in 10 chains as follows: 1 1 3 1 0 4 0 4 3 3 So that gives us 0, 0, 1+2, 0, 0, 1+2+3, 0, 1+2+3, 1+2, 1+2 which = 21. So with an average of two items per chain, that's a score of just over 1. With perfection being 0.5 and a worst case of 10, that's not bad. (Pick would score this differently, putting them in five buckets we get 1+4 1+0 3+4 1+3 0+3. That gives us one bucket with 7, which scores 2. That gives us an overall score of 0.1. But for Pick each group access is a disk hit, so the cost of searching a group for the required item is lost in the noise). But if you want some figures for comparison, add 25% to the number of items to give the number of buckets. So with 20 items, that gives 25 buckets. If your algorithm then returns a score of 0.05, you're doing very well. Cheers, Wol -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
24.12.2015 17:58, Alex Peshkoff wrote: BTW, please try it in such form: I can't believe my eyes. You'd better try final attached variant yourself. -- WBR, SD. HashTest.cpp.7z Description: Binary data -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 12/24/2015 08:24 PM, Dimitry Sibiryakov wrote: > 24.12.2015 18:19, Alex Peshkoff wrote: >> This should be checked. At first one should show that current hash >> provides bad distribution. > In fb_lock_print output I often see something like this: > >> Hash lengths (min/avg/max):0/ 2/ 12 > Isn't it bad? > If we have one chain with 12 objects, three empty chains and chains with 1, 2 or 3 objects - it's not bad. I.e. existing stat-s is not enough for analysis. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
>On the other hand, CS is going down, so lock manager performance may be > not a critical part anymore? It will take groups like mine a while before they embrace the new SMP functionality of v3 -- there is application/deployment testing to be performed... So, CS will be around for a while* and performance does matter! Sean * the notion of support for clustered Firebird deployments is still very attractive to me -- especially with the mainstream availability of low latency RDMA NICs (Infiniband, RoCE, iWarp). Lock manager will be required those deployments. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
24.12.2015 18:19, Alex Peshkoff wrote: > This should be checked. At first one should show that current hash > provides bad distribution. In fb_lock_print output I often see something like this: > Hash lengths (min/avg/max):0/ 2/ 12 Isn't it bad? -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
24.12.2015 20:07, Dimitry Sibiryakov wrote: > > Yes. But in this case the hash is only used internally and not stored > anywhere. Because > of this, different platforms can use different algorithms for it. Of course, but then it should be encapsulated and moved to /common to avoid the same #ifdefs used here and there. This algorithm is used twice or thrice in our code, IIRC. > I suspect that "true hash" could make distribution of hash values more equival > and thus improve overall timing for lock table search. A couple of years ago I played a bit with different hash functions that are considered good nowadays. Quite surprisingly, none of them provided consistently better value distribution. > On the other hand, CS is going down, so lock manager performance may be not a > critical > part anymore? Maybe not, but hash tables may still be used in other places and they're quite important from the performance POV. For example, hash joins use the same hash function. Dmitry -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 12/24/2015 08:07 PM, Dimitry Sibiryakov wrote: > 24.12.2015 17:48, Alex Peshkoff wrote: >> Use of CPU-accelerated code appears suspicious from non-intel ports POV. > Yes. But in this case the hash is only used internally and not stored > anywhere. Because > of this, different platforms can use different algorithms for it. I suspect > that "true > hash" could make distribution of hash values more equival and thus improve > overall timing > for lock table search. This should be checked. At first one should show that current hash provides bad distribution. > On the other hand, CS is going down, so lock manager performance may be > not a critical > part anymore? > I suppose that users who need high reliability will continue with classic. I.e. it's usage will get lower but it will remain. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
24.12.2015 17:58, Alex Peshkoff wrote: > For me (AMD CPU) it behaves much better for length in 11,15,19,etc. I have AMD too, so we need someone with Intel to see if these conditions won't defeat its jump prediction algorithms there. -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
24.12.2015 17:48, Alex Peshkoff wrote: > Use of CPU-accelerated code appears suspicious from non-intel ports POV. Yes. But in this case the hash is only used internally and not stored anywhere. Because of this, different platforms can use different algorithms for it. I suspect that "true hash" could make distribution of hash values more equival and thus improve overall timing for lock table search. On the other hand, CS is going down, so lock manager performance may be not a critical part anymore? -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 12/24/2015 07:48 PM, Alex Peshkoff wrote: > On 12/24/2015 04:19 PM, Dimitry Sibiryakov wrote: >> Hello, All. >> >>I played a little with hash calculation code from lock manager and >> found that there is a room for improvement. >>Attached test program compiled by GCC 4.7.1 with switches "-O3 >> -msse4.2" shown that simple unroll of the loop makes it 2-3 times >> faster. Current code is even slower than CPU-accelerated CRC32. >>Comments? > Use of CPU-accelerated code appears suspicious from non-intel ports POV. > What about unrolled loop - I like it. BTW, please try it in such form: int hash_value = 0; { // scope char* p; const char* q = value; int l = length; while (l >= 4) { p = (char*) &hash_value; *p++ += *q++; *p++ += *q++; *p++ += *q++; *p++ += *q++; l -= 4; } p = (char*) &hash_value; if (l >= 2) { *p++ += *q++; *p++ += *q++; l -= 2; } if (l) { *p++ += *q++; } } // scope For me (AMD CPU) it behaves much better for length in 11,15,19,etc. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
On 12/24/2015 04:19 PM, Dimitry Sibiryakov wrote: > Hello, All. > > I played a little with hash calculation code from lock manager and > found that there is a room for improvement. > Attached test program compiled by GCC 4.7.1 with switches "-O3 > -msse4.2" shown that simple unroll of the loop makes it 2-3 times > faster. Current code is even slower than CPU-accelerated CRC32. > Comments? Use of CPU-accelerated code appears suspicious from non-intel ports POV. What about unrolled loop - I like it. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
Re: [Firebird-devel] Lock manager's hash calculations
24.12.2015 14:19, Dimitry Sibiryakov wrote: > simple unroll of the loop makes it 2-3 times faster. Actually, even replace of USHORT with ULONG in the loop wins about 15%. -- WBR, SD. -- Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel