Re: [Firebird-devel] Lock manager's hash calculations

2015-12-30 Thread Dimitry Sibiryakov
30.12.2015 14:43, Dmitry Yemanov wrote:
> The first one is in lock.cpp and you've found two others, three in total.

   Ok, then it is done now.
   Should I prepare a patch for Firebird 3 or leave it in Avalerion and let you 
adopt it 
at will?

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-30 Thread Dmitry Yemanov
30.12.2015 16:16, Dimitry Sibiryakov wrote:

> I have found two: one in HashJoin.cpp and other in lck.cpp. Can't find third 
> one. Where
> it is?

The first one is in lock.cpp and you've found two others, three in total.


Dmitry


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-30 Thread Dimitry Sibiryakov
29.12.2015 18:16, Dmitry Yemanov wrote:
> I didn't speak about Hash.h, but about other three copies that are the same.

   I have found two: one in HashJoin.cpp and other in lck.cpp. Can't find third 
one. Where 
it is?

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Wols Lists
On 29/12/15 10:23, Dimitry Sibiryakov wrote:
> 24.12.2015 18:29, Alex Peshkoff wrote:
>> If we have one chain with 12 objects, three empty chains and  chains
>> with 1, 2 or 3 objects - it's not bad.
>> I.e. existing stat-s is not enough for analysis.
> 
>I've got following data from Igor Valchenko:
> 
>> Hash slots: 65521, Hash lengths (min/avg/max): 0/ 1/ 8
>> Hash lengths distribution:
>>  0 : 13761
>>  1 : 21925
>>  2 : 1
>>  3 : 8675
>>  4 : 3219
>>  5 : 955
>>  6 : 253
>>  7 : 53
>>  8 : 14
> 
>What would you say?
> 
We have in 1st place
21925 + 1 + 8675 + 3219 + 955 + 253 + 53 + 14 = 51760 * 0 = 0

We have in 2nd place
1 + 8675 + 3219 + 955 + 253 + 53 + 14 = 29835 * 1 = 29835

In 3rd place
8675 + 3219 + 955 + 253 + 53 + 14 = 13169 * 2 = 26338

In 4th place
3219 + 955 + 253 + 53 + 14 = 4494 * 3 = 13482

In 5th place
955 + 253 + 53 + 14 = 1275 * 4 = 5100

In 6th place
253 + 53 + 14 = 320 * 5 = 1600

In 7th place
53 + 14 = 67 * 6 = 402

In 8th place
14 = 14 * 7 = 98

So we have 100,934 entries. To access each entry once from scratch, we
will need to follow 76,855 "next" links. So that gives us a score of 0.76.

A perfect score is (100,934 - 65521) / 100,934 ~= 0.35

So to score 0.76 isn't bad.

We can't compare that with my Pick score because I would have used
double the number of buckets, but if we keep the same number of buckets
and items, this figure will give us an idea of how good the hash is.
Note however, that if the characteristics of the data being hashed
changes, so will the distribution, and hence the score ... this is
always tricky territory to get right.

Cheers,
Wol


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Dimitry Sibiryakov
29.12.2015 17:29, Alex Peshkoff wrote:
> Remains to check on something like tpcc does it affect CS performance.

   If someone has a powerful computer with SSE4.2 support and ready to - fresh 
builds for 
Windows are available at http://www.ibphoenix.com/ibpr_devel/

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Dmitry Yemanov
29.12.2015 20:11, Dimitry Sibiryakov wrote:

> Yep. Look into Hash.h. It is much different (and much slower).

I didn't speak about Hash.h, but about other three copies that are the same.


Dmitry


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Dimitry Sibiryakov
29.12.2015 18:08, Dmitry Yemanov wrote:
> Really?

   Yep. Look into Hash.h. It is much different (and much slower).

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Dmitry Yemanov
28.12.2015 16:03, Dimitry Sibiryakov wrote:

 Yes. But in this case the hash is only used internally and not stored 
 anywhere. Because
 of this, different platforms can use different algorithms for it.
>> Of course, but then it should be encapsulated and moved to /common to
>> avoid the same #ifdefs used here and there. This algorithm is used twice
>> or thrice in our code, IIRC.
>
> AFAICS, in other places a little different algorithm is used. It uses 
> different sum
> logic and then division in loop instead of single op.

Really? The one in jrd/lck.cpp is a complete copy, just buggy (= used 
instead of +=), it was fixed in trunk. And the one in HashJoin.cpp 
should also be an exact match, as it was copy-pasted.

> BTW, does this algorithm have a name to google for?

Jim's Hash Function? 


Dmitry


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Alex Peshkoff
On 12/29/2015 07:05 PM, Dimitry Sibiryakov wrote:
> 29.12.2015 11:43, Alex Peshkoff wrote:
>> That I do not expect to get something better than this bell curve:
>
>   I was unable to create a good load on database with my weak 
> notebook, but oltp-emul with 5 windows gave me these numbers for lock 
> manager using CRC32:
>
>> Hash slots: 8191, Hash lengths (min/avg/max):0/   0/   6
>> Hash lengths distribution:
>> 0  : 3591   (43%)
>> 1  : 2992   (36%)
>> 2  : 1194   (14%)
>> 3  :  324   (3%)
>> 4  :   71   (0%)
>> 5  :   14   (0%)
>> 6  :5   (0%)
>
>   It looks like the curve falls faster.

Yes - that looks better.
Remains to check on something like tpcc does it affect CS performance.


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Tommi Prami
I stumbled upon quite evesome Hash-algorithm xxHash.

There are lightning fast implementations.

https://github.com/Cyan4973/xxHash

Noted that usually the Hash algorithm is not the bottleneck, but still
pretty interesting...

-Tee-

On Tue, Dec 29, 2015 at 6:05 PM, Dimitry Sibiryakov 
wrote:

> 29.12.2015 11:43, Alex Peshkoff wrote:
>
>> That I do not expect to get something better than this bell curve:
>>
>
>   I was unable to create a good load on database with my weak notebook,
> but oltp-emul with 5 windows gave me these numbers for lock manager using
> CRC32:
>
> Hash slots: 8191, Hash lengths (min/avg/max):0/   0/   6
>> Hash lengths distribution:
>> 0  : 3591   (43%)
>> 1  : 2992   (36%)
>> 2  : 1194   (14%)
>> 3  :  324   (3%)
>> 4  :   71   (0%)
>> 5  :   14   (0%)
>> 6  :5   (0%)
>>
>
>   It looks like the curve falls faster.
>
> --
>   WBR, SD.
>
>
> --
>
> Firebird-Devel mailing list, web interface at
> https://lists.sourceforge.net/lists/listinfo/firebird-devel
>
>
--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Dimitry Sibiryakov

29.12.2015 11:43, Alex Peshkoff wrote:

That I do not expect to get something better than this bell curve:


  I was unable to create a good load on database with my weak notebook, but oltp-emul 
with 5 windows gave me these numbers for lock manager using CRC32:



Hash slots: 8191, Hash lengths (min/avg/max):0/   0/   6
Hash lengths distribution:
0  : 3591   (43%)
1  : 2992   (36%)
2  : 1194   (14%)
3  :  324   (3%)
4  :   71   (0%)
5  :   14   (0%)
6  :5   (0%)


  It looks like the curve falls faster.

--
  WBR, SD.
--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Alex Peshkoff

On 12/29/2015 01:23 PM, Dimitry Sibiryakov wrote:

24.12.2015 18:29, Alex Peshkoff wrote:

If we have one chain with 12 objects, three empty chains and  chains
with 1, 2 or 3 objects - it's not bad.
I.e. existing stat-s is not enough for analysis.

I've got following data from Igor Valchenko:


Hash slots: 65521, Hash lengths (min/avg/max): 0/ 1/ 8
Hash lengths distribution:
0 : 13761
1 : 21925
2 : 1
3 : 8675
4 : 3219
5 : 955
6 : 253
7 : 53
8 : 14

What would you say?



That I do not expect to get something better than this bell curve:


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-29 Thread Dimitry Sibiryakov
24.12.2015 18:29, Alex Peshkoff wrote:
> If we have one chain with 12 objects, three empty chains and  chains
> with 1, 2 or 3 objects - it's not bad.
> I.e. existing stat-s is not enough for analysis.

   I've got following data from Igor Valchenko:

> Hash slots: 65521, Hash lengths (min/avg/max): 0/ 1/ 8
> Hash lengths distribution:
>   0 : 13761
>   1 : 21925
>   2 : 1
>   3 : 8675
>   4 : 3219
>   5 : 955
>   6 : 253
>   7 : 53
>   8 : 14

   What would you say?

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-28 Thread Dimitry Sibiryakov
24.12.2015 18:22, Dmitry Yemanov wrote:
>> >Yes. But in this case the hash is only used internally and not stored 
>> >anywhere. Because
>> >of this, different platforms can use different algorithms for it.
> Of course, but then it should be encapsulated and moved to /common to
> avoid the same #ifdefs used here and there. This algorithm is used twice
> or thrice in our code, IIRC.

   AFAICS, in other places a little different algorithm is used. It uses 
different sum 
logic and then division in loop instead of single op.
   BTW, does this algorithm have a name to google for?

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-25 Thread Dimitry Sibiryakov
25.12.2015 14:14, Vlad Khorsun wrote:
> crc32" used _mm_crc32_u32, _mm_crc32_u16 and _mm_crc32_u8 when appropriate

   Cool. I'm surprised that these instructions work even with unaligned data.

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-25 Thread Vlad Khorsun
25.12.2015 13:40, Dimitry Sibiryakov wrote:
> 25.12.2015 12:12, Dimitry Sibiryakov wrote:
>> You'd better try final attached variant yourself.
>
> BTW, MSVC2010 32 bits showed the worst result: only 2.5x speed up while 
> GCC -O3 gave 5x.
> Vlad, could you try it with MSVC2013?..
>

MSVC13, /02 /Oi, Intel Core i3-4130T, 32 bit - Release build

min   max
original31   234
unrolled3162
unrolled2   1547
crc320   125
crc32"   031

crc32" used _mm_crc32_u32, _mm_crc32_u16 and _mm_crc32_u8 when appropriate

Regards,
Vlad

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-25 Thread Alex Peshkoff
On 12/25/2015 02:12 PM, Dimitry Sibiryakov wrote:
> 24.12.2015 17:58, Alex Peshkoff wrote:
>> BTW, please try it in such form:
>
>   I can't believe my eyes. You'd better try final attached variant 
> yourself.
>

Telling true have not noticed great difference. Looks like it's highly 
HW-dependent.


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-25 Thread Dimitry Sibiryakov
25.12.2015 12:12, Dimitry Sibiryakov wrote:
> You'd better try final attached variant yourself.

   BTW, MSVC2010 32 bits showed the worst result: only 2.5x speed up while GCC 
-O3 gave 5x.
   Vlad, could you try it with MSVC2013?..

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-25 Thread Wols Lists
On 24/12/15 17:22, Dmitry Yemanov wrote:
>> I suspect that "true hash" could make distribution of hash values more 
>> equival
>> > and thus improve overall timing for lock table search.

> A couple of years ago I played a bit with different hash functions that 
> are considered good nowadays. Quite surprisingly, none of them provided 
> consistently better value distribution.
> 
Once you get a decent number of buckets, it's almost impossible to
spread stuff evenly ...

Talking from experience with Pick (which puts its data in hashed files)
its normal to have a few buckets seriously overflowed. You just can't
avoid it.

Let's say I have 80 objects, with space for 5 per bucket. That gives me
20 buckets. Pretty much any algorithm will probably give me one or two
empty buckets, and one bucket with 11 necessitating storing it as three
groups of five. That said, we reckon on finding 19 out of 20 records in
the very first group we try.

If you want a reasonably objective measure, try calculating the average
distance an item is from the head of the chain. So let's assume we have
20 items, in 10 chains as follows:

1 1 3 1 0 4 0 4 3 3

So that gives us

0, 0, 1+2, 0, 0, 1+2+3, 0, 1+2+3, 1+2, 1+2

which = 21.

So with an average of two items per chain, that's a score of just over
1. With perfection being 0.5 and a worst case of 10, that's not bad.

(Pick would score this differently, putting them in five buckets we get
1+4 1+0 3+4 1+3 0+3. That gives us one bucket with 7, which scores 2.
That gives us an overall score of 0.1. But for Pick each group access is
a disk hit, so the cost of searching a group for the required item is
lost in the noise).

But if you want some figures for comparison, add 25% to the number of
items to give the number of buckets. So with 20 items, that gives 25
buckets. If your algorithm then returns a score of 0.05, you're doing
very well.

Cheers,
Wol

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-25 Thread Dimitry Sibiryakov

24.12.2015 17:58, Alex Peshkoff wrote:

BTW, please try it in such form:


  I can't believe my eyes. You'd better try final attached variant yourself.

--
  WBR, SD.


HashTest.cpp.7z
Description: Binary data
--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Alex Peshkoff
On 12/24/2015 08:24 PM, Dimitry Sibiryakov wrote:
> 24.12.2015 18:19, Alex Peshkoff wrote:
>> This should be checked. At first one should show that current hash
>> provides bad distribution.
> In fb_lock_print output I often see something like this:
>
>> Hash lengths (min/avg/max):0/   2/  12
> Isn't it bad?
>

If we have one chain with 12 objects, three empty chains and  chains 
with 1, 2 or 3 objects - it's not bad.
I.e. existing stat-s is not enough for analysis.


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Leyne, Sean


>On the other hand, CS is going down, so lock manager performance may be
> not a critical part anymore?

It will take groups like mine a while before they embrace the new SMP 
functionality of v3 -- there is application/deployment testing to be 
performed...

So, CS will be around for a while* and performance does matter!


Sean

* the notion of support for clustered Firebird deployments is still very 
attractive to me -- especially with the mainstream availability of low latency 
RDMA NICs (Infiniband, RoCE, iWarp).  Lock manager will be required those 
deployments.


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Dimitry Sibiryakov
24.12.2015 18:19, Alex Peshkoff wrote:
> This should be checked. At first one should show that current hash
> provides bad distribution.

   In fb_lock_print output I often see something like this:

> Hash lengths (min/avg/max):0/   2/  12

   Isn't it bad?

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Dmitry Yemanov
24.12.2015 20:07, Dimitry Sibiryakov wrote:
>
> Yes. But in this case the hash is only used internally and not stored 
> anywhere. Because
> of this, different platforms can use different algorithms for it.

Of course, but then it should be encapsulated and moved to /common to 
avoid the same #ifdefs used here and there. This algorithm is used twice 
or thrice in our code, IIRC.

> I suspect that "true hash" could make distribution of hash values more equival
> and thus improve overall timing for lock table search.

A couple of years ago I played a bit with different hash functions that 
are considered good nowadays. Quite surprisingly, none of them provided 
consistently better value distribution.

> On the other hand, CS is going down, so lock manager performance may be not a 
> critical
> part anymore?

Maybe not, but hash tables may still be used in other places and they're 
quite important from the performance POV. For example, hash joins use 
the same hash function.


Dmitry


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Alex Peshkoff
On 12/24/2015 08:07 PM, Dimitry Sibiryakov wrote:
> 24.12.2015 17:48, Alex Peshkoff wrote:
>> Use of CPU-accelerated code appears suspicious from non-intel ports POV.
> Yes. But in this case the hash is only used internally and not stored 
> anywhere. Because
> of this, different platforms can use different algorithms for it. I suspect 
> that "true
> hash" could make distribution of hash values more equival and thus improve 
> overall timing
> for lock table search.

This should be checked. At first one should show that current hash 
provides bad distribution.

> On the other hand, CS is going down, so lock manager performance may be 
> not a critical
> part anymore?
>

I suppose that users who need high reliability will continue with 
classic. I.e. it's usage will get lower but it will remain.


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Dimitry Sibiryakov
24.12.2015 17:58, Alex Peshkoff wrote:
> For me (AMD CPU) it behaves much better for length in 11,15,19,etc.

   I have AMD too, so we need someone with Intel to see if these conditions 
won't defeat 
its jump prediction algorithms there.

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Dimitry Sibiryakov
24.12.2015 17:48, Alex Peshkoff wrote:
> Use of CPU-accelerated code appears suspicious from non-intel ports POV.

   Yes. But in this case the hash is only used internally and not stored 
anywhere. Because 
of this, different platforms can use different algorithms for it. I suspect 
that "true 
hash" could make distribution of hash values more equival and thus improve 
overall timing 
for lock table search.
   On the other hand, CS is going down, so lock manager performance may be not 
a critical 
part anymore?

-- 

   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Alex Peshkoff
On 12/24/2015 07:48 PM, Alex Peshkoff wrote:
> On 12/24/2015 04:19 PM, Dimitry Sibiryakov wrote:
>> Hello, All.
>>
>>I played a little with hash calculation code from lock manager and
>> found that there is a room for improvement.
>>Attached test program compiled by GCC 4.7.1 with switches "-O3
>> -msse4.2" shown that simple unroll of the loop makes it 2-3 times
>> faster. Current code is even slower than CPU-accelerated CRC32.
>>Comments?
> Use of CPU-accelerated code appears suspicious from non-intel ports POV.
> What about unrolled loop - I like it.

BTW, please try it in such form:

int hash_value = 0;
 { // scope
 char* p;
 const char* q = value;
 int l = length;
 while (l >= 4)
 {
 p = (char*) &hash_value;
 *p++ += *q++;
 *p++ += *q++;
 *p++ += *q++;
 *p++ += *q++;
 l -= 4;
 }
 p = (char*) &hash_value;
 if (l >= 2)
 {
 *p++ += *q++;
 *p++ += *q++;
 l -= 2;
 }
 if (l)
 {
 *p++ += *q++;
 }
 } // scope

For me (AMD CPU) it behaves much better for length in 11,15,19,etc.



--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Alex Peshkoff
On 12/24/2015 04:19 PM, Dimitry Sibiryakov wrote:
> Hello, All.
>
>   I played a little with hash calculation code from lock manager and 
> found that there is a room for improvement.
>   Attached test program compiled by GCC 4.7.1 with switches "-O3 
> -msse4.2" shown that simple unroll of the loop makes it 2-3 times 
> faster. Current code is even slower than CPU-accelerated CRC32.
>   Comments?

Use of CPU-accelerated code appears suspicious from non-intel ports POV.
What about unrolled loop - I like it.


--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Lock manager's hash calculations

2015-12-24 Thread Dimitry Sibiryakov
24.12.2015 14:19, Dimitry Sibiryakov wrote:
> simple unroll of the loop makes it 2-3 times faster.

   Actually, even replace of USHORT with ULONG in the loop wins about 15%.

-- 
   WBR, SD.

--
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel