Re: General purpose hashing func in pgbench

Fabien COELHO Fri, 12 Jan 2018 07:03:56 -0800


Hello Ildar,

Hmm. I do not think that we should want a shared seed value. The seed
should be different for each call so as to avoid undesired
correlations. If wanted, correlation could be obtained by using an
explicit identical seed.


Probably I'm missing something but I cannot see the point. If we change
seed on every invocation then we get a uniform-like distribution (see
attached image). And we don't get the same hash value for the same input
which is the whole point of hash functions. Maybe I didn't understand
you correctly.

I suggest fixing the seed when parsing the script, so that it is the
same seed on each script for a given pgbench invocation: within one run
every hash call uses the same seed, but the seed changes if pgbench is
re-invoked, so that the results would be different.

Also, if hash(:i) and hash(:j) appear in two distinct scripts, ISTM that
we do not necessarily want the same seed, otherwise i == j would
correlate to hash(i) == hash(j), which may not be a desirable property
for some use cases.
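
To make the proposal concrete, here is a rough sketch in Python rather than in pgbench's C. Everything in it is an assumption for illustration: the Script class, parse_scripts() and seeded_hash() are hypothetical names, and FNV-1a is only a stand-in, not necessarily the hash function used in the patch. The seed is drawn once per script at parse time, so a given input always hashes to the same value within a run, while two scripts get unrelated seeds.

```python
import random

MASK64 = (1 << 64) - 1
FNV_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME = 0x100000001B3        # standard 64-bit FNV prime


def seeded_hash(value: int, seed: int) -> int:
    """FNV-1a over the 8 little-endian bytes of value.

    Assumption: the seed is folded into the initial state by XOR.
    """
    h = (FNV_OFFSET ^ seed) & MASK64
    for byte in (value & MASK64).to_bytes(8, "little"):
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


class Script:
    """Hypothetical stand-in for a parsed pgbench script."""

    def __init__(self, name: str, rng: random.Random):
        self.name = name
        # The seed is chosen once, when the script is parsed,
        # and reused by every hash call in this script.
        self.seed = rng.getrandbits(64)

    def hash(self, value: int) -> int:
        return seeded_hash(value, self.seed)


def parse_scripts(names, run_seed=None):
    """Parse all scripts for one invocation.

    run_seed=None lets Python seed the generator from OS entropy,
    so the per-script seeds change on each invocation.
    """
    rng = random.Random(run_seed)
    return [Script(name, rng) for name in names]


if __name__ == "__main__":
    a, b = parse_scripts(["a.sql", "b.sql"])
    print(a.hash(42) == a.hash(42))  # stable within one script and run
    print(a.hash(42) == b.hash(42))  # distinct scripts use distinct seeds
```

Passing the same seed explicitly to both scripts would restore the correlation, for the use cases that want it.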


Maybe it would be desirable for other use cases, though.

Anyway I've attached a new version with some tests and docs added.


--
Fabien.
