Hello Ildar,

Hmm. I do not think that we should want a shared seed value. The seed
should be different for each call so as to avoid undesired
correlations. If wanted, correlation could be obtained by using an
explicit identical seed.

Probably I'm missing something, but I cannot see the point. If we change
the seed on every invocation then we get a uniform-like distribution (see
attached image). And we don't get the same hash value for the same input,
which is the whole point of hash functions. Maybe I didn't understand
you correctly.

I suggest fixing the seed when parsing the script, so that a given hash call keeps the same seed throughout one pgbench invocation. Within a run, the same input then always yields the same hash value, but the seed changes if pgbench is re-invoked, so that the results differ from run to run.

Also, if hash(:i) and hash(:j) appear in two distinct scripts, ISTM that we do not necessarily want the same seed, otherwise i == j would correlate to hash(i) == hash(j), which may not be a desirable property for some use cases.

Maybe it would be desirable for other use cases, though.
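
To make the policy above concrete, here is a rough sketch in Python rather than pgbench's C. It is only an illustration under assumptions: toy_hash, Script and parse_scripts are hypothetical names, the hash itself is a stand-in (not the algorithm in the patch), and drawing one seed per script is just one possible reading of the proposal.

```python
import hashlib
import random


def toy_hash(value: int, seed: int) -> int:
    """Hypothetical seeded hash, a stand-in for pgbench's hash()."""
    data = seed.to_bytes(8, "little") + value.to_bytes(8, "little", signed=True)
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little")


class Script:
    """Hypothetical parsed script: its default seed is drawn once, at parse time."""

    def __init__(self, rng: random.Random):
        self.seed = rng.getrandbits(64)

    def hash(self, value: int, seed=None) -> int:
        # An explicit seed overrides the per-script default, so a user who
        # wants correlated hashes across scripts can pass the same value.
        return toy_hash(value, self.seed if seed is None else seed)


def parse_scripts(count: int, run_seed: int) -> list:
    """One random source per run (assumed); each script draws its own seed."""
    rng = random.Random(run_seed)
    return [Script(rng) for _ in range(count)]


a, b = parse_scripts(2, run_seed=1)
# Same script, same input: the value is stable for the whole run.
assert a.hash(42) == a.hash(42)
# Explicit identical seed: the two scripts agree on purpose.
assert a.hash(42, seed=7) == b.hash(42, seed=7)
```

With this layout the two scripts get independent default seeds, so i == j does not imply that their hash values match unless the user passes the same explicit seed.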


Anyway I've attached a new version with some tests and docs added.

--
Fabien.
