Pushed with that change and some others, notably:
Thanks for the improvements and the push!
--
Fabien.
Fabien COELHO writes:
>> Good, but then you shouldn't write associated code as if that's still
>> a problem, because you'll cause other people to think it's still a
>> problem and write equally contorted code elsewhere. "v & 1" is a
>> transparent way of producing a bool, while this code isn't.
Hello,
They are not more nor less relevant than any other "random" constant. The
state needs a default initialization. The point of using DK's is that it
is somehow cannot be some specially crafted value which would have some
special property only know to the purveyor of the constant and could
Fabien COELHO writes:
>> How did Knuth get into this? This algorithm is certainly not his,
>> so why are those constants at all relevant?
> They are not more nor less relevant than any other "random" constant. The
> state needs a default initialization. The point of using DK's is that it
> is
Hello Tom,
Thanks for the feedback.
+/* use Donald Knuth's LCG constants for default state */
How did Knuth get into this? This algorithm is certainly not his,
so why are those constants at all relevant?
They are not more nor less relevant than any other "random" constant. The
state
I wrote:
> ... What we do need is a decision
> about what to do on Windows. We could write it like
> +#ifndef WIN32
> + srandom(pg_prng_i32(_global_prng_state));
> +#endif
> but I have a different modest suggestion: add
> #define srandom(seed) srand(seed)
> in win32_port.h. As far as I can
Aleksander Alekseev writes:
> It looks like the patch is in pretty good shape. I noticed that the
> return value of pg_prng_strong_seed() is not checked in several
> places, also there was a typo in pg_trgm.c. The corrected patch is
> attached. Assuming the new version will not upset cfbot, I
Hi hackers,
>
> > I guess the declaration needs PGDLLIMPORT.
>
> Indeed, thanks!
>
> Attached v16 adds that.
It looks like the patch is in pretty good shape. I noticed that the
return value of pg_prng_strong_seed() is not checked in several
places, also there was a typo in pg_trgm.c. The
I guess the declaration needs PGDLLIMPORT.
Indeed, thanks!
Attached v16 adds that.
--
Fabien.diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index d19f73127c..b250ae912b 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@
On Thu, Sep 30, 2021 at 9:23 PM Fabien COELHO wrote:
> The missing symbol is really defined in common/pg_prng.c which AFAICT is
> linked with postgres.
>
> If someone experienced with the windows compilation chain could give a
> hint of what is needed, I'd appreciate it!
I guess the declaration
Attached v15 also does call srandom if it is there, and fixes yet another
remaining random call.
I think that I have now removed all references to "random" from pg source.
However, the test still fails on windows, because the linker does not find
a global variable when compiling
Just FTR, I strongly object to your removal of process-startup srandom()
calls.
Ok. The point of the patch is to replace and unify the postgres underlying
PRNG, so there was some logic behind this removal.
FTR, this was triggered by your comment on Jul 1:
[...] I see that you probably
Hello Tom,
Just FTR, I strongly object to your removal of process-startup srandom()
calls.
Ok. The point of the patch is to replace and unify the postgres underlying
PRNG, so there was some logic behind this removal.
Those are not only setting the seed for our own use, but also ensuring
Fabien COELHO writes:
> Not enough. Here is a v14 which might improve things further more again.
> Sorry for this noise due to blind windows tests.
Just FTR, I strongly object to your removal of process-startup srandom()
calls. Those are not only setting the seed for our own use, but also
[1]: http://cfbot.cputube.org/
Indeed. I wish that these results would be available from the cf
interface.
Attached a v11 which might improve things.
Not enough. Here is a v12 which might improve things further.
Not enough. Here is a v13 which might improve things further more.
Not
[1]: http://cfbot.cputube.org/
Indeed. I wish that these results would be available from the cf interface.
Attached a v11 which might improve things.
Not enough. Here is a v12 which might improve things further.
Not enough. Here is a v13 which might improve things further more.
--
Hello again,
Just wanted to let you know that cfbot [1] doesn't seem to be happy with
the patch. Apparently, some tests are falling. To be honest, I didn't
invest too much time into investigating this. Hopefully, it's not a big
deal.
[1]: http://cfbot.cputube.org/
Indeed. I wish that these
Hello Aleksander,
Attached a v10 which is some kind of compromise where the interface uses
inclusive min and max bounds, so that all values can be reached.
Just wanted to let you know that cfbot [1] doesn't seem to be happy with
the patch. Apparently, some tests are falling. To be honest, I
Hi Fabien,
> Attached a v10 which is some kind of compromise where the interface uses
> inclusive min and max bounds, so that all values can be reached.
Just wanted to let you know that cfbot [1] doesn't seem to be happy with
the patch. Apparently, some tests are falling. To be honest, I didn't
Hello Yura,
Given 99.99% cases will be in the likely case, branch predictor should
eliminate decision cost.
Hmmm. ISTM that a branch predictor should predict that unknown < small
should probably be false, so a hint should be given that it is really
true.
Why? Branch predictor is history
Finally, I think it would be better to treat the upper bound of the
range as inclusive.
This bothered me as well, but the usual approach seems to use range as the
number of values, so I was hesitant to depart from that. I'm still
hesitant to go that way.
Yeah, that bothered me too.
For
On Thu, 8 Jul 2021 at 09:26, Fabien COELHO wrote:
>
> > Finally, I think it would be better to treat the upper bound of the
> > range as inclusive.
>
> This bothered me as well, but the usual approach seems to use range as the
> number of values, so I was hesitant to depart from that. I'm still
>
Hello Dean,
Whilst it has been interesting learning and discussing all these
different techniques, I think it's probably best to stick with the
bitmask method, rather than making the code too complex and difficult
to follow.
Yes.
The bitmask method has the advantage of being very simple,
On Wed, 7 Jul 2021 at 04:00, Yura Sokolov wrote:
>
> Anyway, excuse me for heating this discussion cause of such
> non-essential issue.
> I'll try to control myself and don't proceed it further.
>
Whilst it has been interesting learning and discussing all these
different techniques, I think it's
Fabien COELHO писал 2021-07-06 23:49:
Hello Yura,
However, I'm not enthousiastic at combining two methods depending on
the range, the function looks complex enough without that, so I would
suggest not to take this option. Also, the decision process adds to
the average cost, which is
Hello Yura,
However, I'm not enthousiastic at combining two methods depending on
the range, the function looks complex enough without that, so I would
suggest not to take this option. Also, the decision process adds to
the average cost, which is undesirable.
Given 99.99% cases will be in
Fabien COELHO писал 2021-07-06 09:13:
Hello Yura,
I believe most "range" values are small, much smaller than UINT32_MAX.
In this case, according to [1] fastest method is Lemire's one (I'd
take
original version from [2]) [...]
Yep.
I share your point that the range is more often 32 bits.
Hello Yura,
I believe most "range" values are small, much smaller than UINT32_MAX.
In this case, according to [1] fastest method is Lemire's one (I'd take
original version from [2]) [...]
Yep.
I share your point that the range is more often 32 bits.
However, I'm not enthousiastic at
Fabien COELHO писал 2021-07-04 23:29:
The important property of determinism is that if I set a seed, and
then make an identical set of calls to the random API, the results
will be identical every time, so that it's possible to write tests
with predictable/repeatable results.
Hmmm… I like my
The important property of determinism is that if I set a seed, and then
make an identical set of calls to the random API, the results will be
identical every time, so that it's possible to write tests with
predictable/repeatable results.
Hmmm… I like my stronger determinism definition more
On Sun, 4 Jul 2021 at 17:03, Fabien COELHO wrote:
>
> > As for determinism, the end result is still fully deterministic.
>
> The result is indeed deterministic of you call the function with the same
> range. However, if you change the range value in one place then sometimes
> the state can
Now suppose we want a random number in the range [0,6). This is what
happens with your algorithm for each of the possible prng() return
values:
prng() returns 0 -- OK
prng() returns 1 -- OK
prng() returns 2 -- OK
prng() returns 3 -- OK
prng() returns 4 -- OK
prng() returns 5 -- OK
On Sun, 4 Jul 2021 at 10:35, Fabien COELHO wrote:
>
> I did not understand why it is not correct.
>
Well, to make it easier to visualise, let's imagine our word size is
just 3 bits instead of 64 bits, and that the basic prng() function
generates numbers in the range [0,8). Similarly, imagine a
Hello Dean,
- moves the stuff to common and fully removes random/srandom (Tom)
- includes a range generation function based on the bitmask method (Dean)
but iterates with splitmix so that the state always advances once (Me)
At the risk of repeating myself: do *not* invent your own
On Sat, 3 Jul 2021 at 08:06, Fabien COELHO wrote:
>
> Here is a v4, which:
>
> - moves the stuff to common and fully removes random/srandom (Tom)
> - includes a range generation function based on the bitmask method (Dean)
> but iterates with splitmix so that the state always advances once
1. PostgreSQL source uses `uint64` and `uint32`, but not
`uint64_t`/`uint32_t`
Indeed you are right. Attached v6 does that as well.
--
Fabien.diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 2c2f149fb0..146b524076 100644
--- a/contrib/file_fdw/file_fdw.c
+++
Hello Yura,
1. PostgreSQL source uses `uint64` and `uint32`, but not
`uint64_t`/`uint32_t`
2. I don't see why pg_prng_state could not be `typedef uint64
pg_prng_state[2];`
It could, but I do not see that as desirable. From an API design point of
view we want something clean and abstract,
Yura Sokolov writes:
> 2. I don't see why pg_prng_state could not be `typedef uint64
> pg_prng_state[2];`
Please no. That sort of typedef behaves so weirdly that it's
a foot-gun.
regards, tom lane
Fabien COELHO wrote 2021-07-03 11:45:
And a v5 where an unused test file does also compile if we insist.
About patch:
1. PostgreSQL source uses `uint64` and `uint32`, but not
`uint64_t`/`uint32_t`
2. I don't see why pg_prng_state could not be `typedef uint64
pg_prng_state[2];`
3. Then
Here is a v4, which:
- moves the stuff to common and fully removes random/srandom (Tom)
- includes a range generation function based on the bitmask method (Dean)
but iterates with splitmix so that the state always advances once (Me)
And a v5 where an unused test file does also compile if
Hello Dean & Tom,
Here is a v4, which:
- moves the stuff to common and fully removes random/srandom (Tom)
- includes a range generation function based on the bitmask method (Dean)
but iterates with splitmix so that the state always advances once (Me)
--
Fabien.diff --git
Hello Dean,
It may be true that the bias is of the same magnitude as FP multiply,
but it is not of the same nature. With FP multiply, the
more-likely-to-be-chosen values are more-or-less evenly distributed
across the range, whereas modulo concentrates them all at one end,
making it more
On Thu, 1 Jul 2021 at 22:18, Fabien COELHO wrote:
>
> > However, this patch seems to be replacing that with a simple
> > modulo operation, which is perhaps the worst possible way to do it.
>
> The modulo operation is biased for large ranges close to the limit, sure.
> Also, the bias is somehow of
Hello Dean,
I haven't looked at the patch in detail, but one thing I object to is
the code to choose a random integer in an arbitrary range.
Thanks for bringing up this interesting question!
Currently, this is done in pgbench by getrand(), which has its
problems.
Yes. That is one of the
Hello Tom,
Indeed. Here is a blind attempt at fixing the build, I'll check later to
see whether it works. It would help me if the cfbot results were
integrated into the cf app.
Hmm, not there yet per cfbot, not sure why not.
I'll investigate.
Anyway, after taking a very quick look at
On Thu, 1 Jul 2021 at 19:41, Tom Lane wrote:
>
> Anyway, after taking a very quick look at the patch itself, I've
> got just one main objection: I don't approve of putting this in
> port.h or src/port/.
I haven't looked at the patch in detail, but one thing I object to is
the code to choose a
Fabien COELHO writes:
>> Although this patch is marked RFC, the cfbot shows it doesn't
>> even compile on Windows. I think you missed updating Mkvcbuild.pm.
> Indeed. Here is a blind attempt at fixing the build, I'll check later to
> see whether it works. It would help me if the cfbot results
Although this patch is marked RFC, the cfbot shows it doesn't
even compile on Windows. I think you missed updating Mkvcbuild.pm.
Indeed. Here is a blind attempt at fixing the build, I'll check later to
see whether it works. It would help me if the cfbot results were
integrated into the cf
Although this patch is marked RFC, the cfbot shows it doesn't
even compile on Windows. I think you missed updating Mkvcbuild.pm.
regards, tom lane
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation:tested, passed
Although the patch looks OK I would like to keep the
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation:tested, passed
Although the patch looks OK I would like to keep the status
Sorry for a duplicate entry on CF web application
Hello Aleksander,
- better software engineering
- similar speed (slightly slower)
- better statistical quality
- quite small state
- soundness
Personally, I think your patch is great.
Thanks for having a look!
Speaking of the speed I believe we should consider the performance
Hello Andrey,
- NOT to invent a new design!
Radical version of this argument would be to use de-facto standard and
ubiquitous MT19937.
Indeed, I started considering this one for this reason, obviously.
Though, I suspect, it's not optimal solution to the date.
"not optimal" does not
Hi Fabien,
> To summarize:
> - better software engineering
> - similar speed (slightly slower)
> - better statistical quality
> - quite small state
> - soundness
Personally, I think your patch is great. Speaking of the speed I
believe we should consider the performance of the entire
> 24 мая 2021 г., в 15:31, Fabien COELHO написал(а):
>
>
> - NOT to invent a new design!
Radical version of this argument would be to use de-facto standard and
ubiquitous MT19937.
Though, I suspect, it's not optimal solution to the date.
Best regards, Andrey Borodin.
Hello Tomas,
I have given a go at proposing a replacement for rand48.
So what is the motivation for replacing rand48? Speed, quality of produced
random numbers, features rand48 can't provide, or what?
Speed can only be near rand48, see below. Quality (eg no trivial cycles,
does not fail
Hi,
On 5/24/21 12:31 PM, Fabien COELHO wrote:
Hello pg-devs,
I have given a go at proposing a replacement for rand48.
So what is the motivation for replacing rand48? Speed, quality of
produced random numbers, features rand48 can't provide, or what?
POSIX 1988 (?) rand48 is a LCG PRNG
Hello pg-devs,
I have given a go at proposing a replacement for rand48.
POSIX 1988 (?) rand48 is a LCG PRNG designed to generate 32 bits integers
or floats based on a 48 bits state on 16 or 32 bits architectures. LCG
cycles on the low bits, which can be quite annoying. Given that we run on
59 matches
Mail list logo