Re: DSM segment handle generation in background workers
On Thu, Nov 15, 2018 at 4:25 AM Tom Lane wrote: > Thomas Munro writes: > > What do you think about the attached? > > I think you need to cast MyStartTimestamp to uint64 before shifting > to ensure portable behavior of the shifts. In principle it wouldn't > matter because the int64 sign bit is nowhere near the part we care > about, but I've heard some pretty wild claims about what compiler > writers are entitled to do with "undefined" cases. Thanks. Pushed. -- Thomas Munro http://www.enterprisedb.com
Re: DSM segment handle generation in background workers
On Wed, Nov 14, 2018 at 8:52 PM Noah Misch wrote: > On Wed, Nov 14, 2018 at 08:22:42PM +1300, Thomas Munro wrote: > > On Wed, Nov 14, 2018 at 6:34 PM Noah Misch wrote: > > > On Wed, Nov 14, 2018 at 05:50:26PM +1300, Thomas Munro wrote: > > > > On Wed, Nov 14, 2018 at 3:24 PM Noah Misch wrote: > > > > > What counts is the ease of predicting a complete seed. HEAD's > > > > > algorithm has > > > > > ~13 trivially-predictable bits, and the algorithm that stood in > > > > > BackendRun() > > > > > from 98c5065 until 197e4af had no such bits. You're right that the > > > > > other 19 > > > > > bits are harder to predict than any given 19 bits under the old > > > > > algorithm, but > > > > > the complete seed remains more predictable than it was before 197e4af. > > > > > > > > However we mix them, given that the source code is well known, isn't > > > > an attacker's job really to predict the time and pid, two not > > > > especially well guarded secrets? > > > > > > True. Better to frame the issue as uniform distribution of seed, not > > > unpredictability of seed selection. > > > > What do you think about the attached? > > You mentioned that you rewrote the algorithm because the new function had a > TimestampTz. But the BackendRun() code, which it replaced, also had a > TimestampTz. You can reuse the exact algorithm. Is there a reason to change? The old code used a "slightly hacky way to convert timestamptz into integers" because TimestampTz might have been floating point back in the day. Now that TimestampTz is always an integer, we might as well use it directly and shuffle its bits for the same general effect, no? The difference between x >> 20 and x / USECS_PER_SEC doesn't seem to be material. -- Thomas Munro http://www.enterprisedb.com
Re: DSM segment handle generation in background workers
On Wed, Nov 14, 2018 at 08:22:42PM +1300, Thomas Munro wrote: > On Wed, Nov 14, 2018 at 6:34 PM Noah Misch wrote: > > On Wed, Nov 14, 2018 at 05:50:26PM +1300, Thomas Munro wrote: > > > On Wed, Nov 14, 2018 at 3:24 PM Noah Misch wrote: > > > > What counts is the ease of predicting a complete seed. HEAD's > > > > algorithm has > > > > ~13 trivially-predictable bits, and the algorithm that stood in > > > > BackendRun() > > > > from 98c5065 until 197e4af had no such bits. You're right that the > > > > other 19 > > > > bits are harder to predict than any given 19 bits under the old > > > > algorithm, but > > > > the complete seed remains more predictable than it was before 197e4af. > > > > > > However we mix them, given that the source code is well known, isn't > > > an attacker's job really to predict the time and pid, two not > > > especially well guarded secrets? > > > > True. Better to frame the issue as uniform distribution of seed, not > > unpredictability of seed selection. > > What do you think about the attached? You mentioned that you rewrote the algorithm because the new function had a TimestampTz. But the BackendRun() code, which it replaced, also had a TimestampTz. You can reuse the exact algorithm. Is there a reason to change?
Re: DSM segment handle generation in background workers
On Wed, Nov 14, 2018 at 6:34 PM Noah Misch wrote: > On Wed, Nov 14, 2018 at 05:50:26PM +1300, Thomas Munro wrote: > > On Wed, Nov 14, 2018 at 3:24 PM Noah Misch wrote: > > > What counts is the ease of predicting a complete seed. HEAD's algorithm > > > has > > > ~13 trivially-predictable bits, and the algorithm that stood in > > > BackendRun() > > > from 98c5065 until 197e4af had no such bits. You're right that the other > > > 19 > > > bits are harder to predict than any given 19 bits under the old > > > algorithm, but > > > the complete seed remains more predictable than it was before 197e4af. > > > > However we mix them, given that the source code is well known, isn't > > an attacker's job really to predict the time and pid, two not > > especially well guarded secrets? > > True. Better to frame the issue as uniform distribution of seed, not > unpredictability of seed selection. What do you think about the attached? -- Thomas Munro http://www.enterprisedb.com 0001-Increase-the-number-of-possible-random-seeds-per-tim.patch Description: Binary data
Re: DSM segment handle generation in background workers
On Wed, Nov 14, 2018 at 05:50:26PM +1300, Thomas Munro wrote: > On Wed, Nov 14, 2018 at 3:24 PM Noah Misch wrote: > > On Mon, Nov 12, 2018 at 09:39:01AM -0500, Tom Lane wrote: > > > I doubt that's a good idea; to a first approximation, it would mean that > > > half the seed depends only on the PID and the other half only on the > > > timestamp. Maybe we could improve matters a little by left-shifting the > > > PID four bits or so, but I think we still want it to mix with some > > > rapidly-changing time bits. > > > > > > I'm not really sure that we need to do anything though. Basically, > > > what we've got here is a tradeoff between how many bits change over > > > a given timespan and how unpredictable those bits are. I don't see > > > that one of those is necessarily more important than the other. > > > > What counts is the ease of predicting a complete seed. HEAD's algorithm has > > ~13 trivially-predictable bits, and the algorithm that stood in BackendRun() > > from 98c5065 until 197e4af had no such bits. You're right that the other 19 > > bits are harder to predict than any given 19 bits under the old algorithm, > > but > > the complete seed remains more predictable than it was before 197e4af. > > However we mix them, given that the source code is well known, isn't > an attacker's job really to predict the time and pid, two not > especially well guarded secrets? True. Better to frame the issue as uniform distribution of seed, not unpredictability of seed selection. Incidentally, possible future work may be to use pg_strong_random() when available, like pgbench set_random_seed() does. That would achieve both unpredictability and uniform distribution. It would be mere defense in depth; if unpredictability matters, one still needs a CSPRNG (e.g. pgcrypto).
Re: DSM segment handle generation in background workers
On Wed, Nov 14, 2018 at 3:24 PM Noah Misch wrote: > On Mon, Nov 12, 2018 at 09:39:01AM -0500, Tom Lane wrote: > > Thomas Munro writes: > > > On Mon, Nov 12, 2018 at 9:34 PM Noah Misch wrote: > > >> Compared to the old code, the new code requires more wall time to visit > > >> every > > >> possible seed value. New code xor's the PID bits into the > > >> fastest-changing > > >> timestamp bits, so only about twenty bits can vary within any given > > >> one-second > > >> period. (That assumes a PID space of twenty or fewer bits; fifteen bits > > >> is > > >> the Linux default.) Is that aspect of the change justified? > > > > > Hmm, right. How about applying pg_bswap32() to one of the terms, as > > > an easy approximation of reversing the bits? > > That seems adequate, yes. But why not just restore the old algorithm in the > new function? (I'm not opposed to revising the algorithm, but I think a > revision should have explicit goals. The ease of pg_bswap32() is not itself a > worthy goal, because the old BackendRun() algorithm was also quite easy.) Ok. I rewrote it only because in the new coding I had a TimestampTz rather than the separate sec and usec components of the old code. I didn't intend to revise the algorithm fundamentally, but I missed the significance of the bit shifting from commit 98c50656c that moved the faster moving bits up. I'll change it back. > > I doubt that's a good idea; to a first approximation, it would mean that > > half the seed depends only on the PID and the other half only on the > > timestamp. Maybe we could improve matters a little by left-shifting the > > PID four bits or so, but I think we still want it to mix with some > > rapidly-changing time bits. > > > > I'm not really sure that we need to do anything though. Basically, > > what we've got here is a tradeoff between how many bits change over > > a given timespan and how unpredictable those bits are. I don't see > > that one of those is necessarily more important than the other. > > What counts is the ease of predicting a complete seed. HEAD's algorithm has > ~13 trivially-predictable bits, and the algorithm that stood in BackendRun() > from 98c5065 until 197e4af had no such bits. You're right that the other 19 > bits are harder to predict than any given 19 bits under the old algorithm, but > the complete seed remains more predictable than it was before 197e4af. However we mix them, given that the source code is well known, isn't an attacker's job really to predict the time and pid, two not especially well guarded secrets? I don't see how the mixing matters in that respect. (You can also just clobber the seed from SQL, but that'd be cheating.) > Goal I recommend: if connections arrive in a Poisson distribution of unknown > lambda, maximize the number of distinct seeds. Yeah. I think bit reversing one of them achieves the maximum number of distinct seeds by putting the busy ends far apart, but the original bit shifting is similar. -- Thomas Munro http://www.enterprisedb.com
Re: DSM segment handle generation in background workers
On Mon, Nov 12, 2018 at 09:39:01AM -0500, Tom Lane wrote: > Thomas Munro writes: > > On Mon, Nov 12, 2018 at 9:34 PM Noah Misch wrote: > >> Compared to the old code, the new code requires more wall time to visit > >> every > >> possible seed value. New code xor's the PID bits into the fastest-changing > >> timestamp bits, so only about twenty bits can vary within any given > >> one-second > >> period. (That assumes a PID space of twenty or fewer bits; fifteen bits is > >> the Linux default.) Is that aspect of the change justified? > > > Hmm, right. How about applying pg_bswap32() to one of the terms, as > > an easy approximation of reversing the bits? That seems adequate, yes. But why not just restore the old algorithm in the new function? (I'm not opposed to revising the algorithm, but I think a revision should have explicit goals. The ease of pg_bswap32() is not itself a worthy goal, because the old BackendRun() algorithm was also quite easy.) > I doubt that's a good idea; to a first approximation, it would mean that > half the seed depends only on the PID and the other half only on the > timestamp. Maybe we could improve matters a little by left-shifting the > PID four bits or so, but I think we still want it to mix with some > rapidly-changing time bits. > > I'm not really sure that we need to do anything though. Basically, > what we've got here is a tradeoff between how many bits change over > a given timespan and how unpredictable those bits are. I don't see > that one of those is necessarily more important than the other. What counts is the ease of predicting a complete seed. HEAD's algorithm has ~13 trivially-predictable bits, and the algorithm that stood in BackendRun() from 98c5065 until 197e4af had no such bits. You're right that the other 19 bits are harder to predict than any given 19 bits under the old algorithm, but the complete seed remains more predictable than it was before 197e4af. Goal I recommend: if connections arrive in a Poisson distribution of unknown lambda, maximize the number of distinct seeds. nm
Re: DSM segment handle generation in background workers
Thomas Munro writes: > On Mon, Nov 12, 2018 at 9:34 PM Noah Misch wrote: >> Compared to the old code, the new code requires more wall time to visit every >> possible seed value. New code xor's the PID bits into the fastest-changing >> timestamp bits, so only about twenty bits can vary within any given >> one-second >> period. (That assumes a PID space of twenty or fewer bits; fifteen bits is >> the Linux default.) Is that aspect of the change justified? > Hmm, right. How about applying pg_bswap32() to one of the terms, as > an easy approximation of reversing the bits? I doubt that's a good idea; to a first approximation, it would mean that half the seed depends only on the PID and the other half only on the timestamp. Maybe we could improve matters a little by left-shifting the PID four bits or so, but I think we still want it to mix with some rapidly-changing time bits. I'm not really sure that we need to do anything though. Basically, what we've got here is a tradeoff between how many bits change over a given timespan and how unpredictable those bits are. I don't see that one of those is necessarily more important than the other. regards, tom lane
Re: DSM segment handle generation in background workers
On Mon, Nov 12, 2018 at 9:34 PM Noah Misch wrote: > On Sat, Oct 13, 2018 at 11:45:17PM +1300, Thomas Munro wrote: > > + /* Set a different seed for random() in every backend. */ > > + srandom((unsigned int) MyProcPid ^ (unsigned int) MyStartTimestamp); > > > - TimestampDifference(0, port->SessionStartTime, , ); > > - srandom((unsigned int) (MyProcPid ^ (usecs << 12) ^ secs)); > > Compared to the old code, the new code requires more wall time to visit every > possible seed value. New code xor's the PID bits into the fastest-changing > timestamp bits, so only about twenty bits can vary within any given one-second > period. (That assumes a PID space of twenty or fewer bits; fifteen bits is > the Linux default.) Is that aspect of the change justified? Hmm, right. How about applying pg_bswap32() to one of the terms, as an easy approximation of reversing the bits? -- Thomas Munro http://www.enterprisedb.com
Re: DSM segment handle generation in background workers
On Sat, Oct 13, 2018 at 11:45 PM Thomas Munro wrote: > On Thu, Oct 11, 2018 at 5:51 PM Tom Lane wrote: > > The comment adjacent to the change in InitStandaloneProcess bothers me. > > In particular, it points out that what BackendRun() is currently doing > > creates more entropy in the process's random seed than what you have > > here: MyStartTime is only good to the nearest second, whereas the code > > you removed from BackendRun() factors in fractional seconds to whatever > > the precision of GetCurrentTimestamp is. This does not seem like an > > improvement. > > True. > > > I'm inclined to think we should refactor a bit more aggressively so > > that, rather than dumbing backends down to the standard of other > > processes, we make them all use the better method. A reasonable way > > to approach this would be to invent a global variable MyStartTimestamp > > beside MyStartTime, and initialize both of them in the places that > > currently initialize the latter, using code like that in > > BackendInitialize: > > > > /* save process start time */ > > port->SessionStartTime = GetCurrentTimestamp(); > > MyStartTime = timestamptz_to_time_t(port->SessionStartTime); > > > > and then this new InitRandomSeeds function could adopt BackendRun's > > seed initialization method instead of the stupider way. > > Ok, done. > > With MyStartTimestamp in the picture, port->SessionStartTime is > probably redundant... and in fact the comment in struct > Port says it shouldn't even be there. So... I removed it, and used > the new MyStartTimestamp in the couple of places that wanted it. > Thoughts? > > > Possibly it'd be sane to merge the MyStartTime* initializations > > and the srandom resets into one function, not sure. > > +1 > > The main difficulty was coming up with a name for the function that > does those things. I went with InitProcessGlobals(). Better > suggestions welcome. Pushed to master only. -- Thomas Munro http://www.enterprisedb.com
Re: DSM segment handle generation in background workers
On Thu, Oct 11, 2018 at 5:51 PM Tom Lane wrote: > The comment adjacent to the change in InitStandaloneProcess bothers me. > In particular, it points out that what BackendRun() is currently doing > creates more entropy in the process's random seed than what you have > here: MyStartTime is only good to the nearest second, whereas the code > you removed from BackendRun() factors in fractional seconds to whatever > the precision of GetCurrentTimestamp is. This does not seem like an > improvement. True. > I'm inclined to think we should refactor a bit more aggressively so > that, rather than dumbing backends down to the standard of other > processes, we make them all use the better method. A reasonable way > to approach this would be to invent a global variable MyStartTimestamp > beside MyStartTime, and initialize both of them in the places that > currently initialize the latter, using code like that in > BackendInitialize: > > /* save process start time */ > port->SessionStartTime = GetCurrentTimestamp(); > MyStartTime = timestamptz_to_time_t(port->SessionStartTime); > > and then this new InitRandomSeeds function could adopt BackendRun's > seed initialization method instead of the stupider way. Ok, done. With MyStartTimestamp in the picture, port->SessionStartTime is probably redundant... and in fact the comment in struct Port says it shouldn't even be there. So... I removed it, and used the new MyStartTimestamp in the couple of places that wanted it. Thoughts? > Possibly it'd be sane to merge the MyStartTime* initializations > and the srandom resets into one function, not sure. +1 The main difficulty was coming up with a name for the function that does those things. I went with InitProcessGlobals(). Better suggestions welcome. -- Thomas Munro http://www.enterprisedb.com 0001-Refactor-random-seed-and-start-time-initialization.patch Description: Binary data
Re: DSM segment handle generation in background workers
Thomas Munro writes: >> Ok, here is a sketch patch with a new function InitRandomSeeds() to do >> that. It is called from PostmasterMain(), InitPostmasterChild() and >> InitStandaloneProcess() for consistency. > Here's a version I propose to push to master and 11 tomorrow, if there > are no objections. The comment adjacent to the change in InitStandaloneProcess bothers me. In particular, it points out that what BackendRun() is currently doing creates more entropy in the process's random seed than what you have here: MyStartTime is only good to the nearest second, whereas the code you removed from BackendRun() factors in fractional seconds to whatever the precision of GetCurrentTimestamp is. This does not seem like an improvement. I'm inclined to think we should refactor a bit more aggressively so that, rather than dumbing backends down to the standard of other processes, we make them all use the better method. A reasonable way to approach this would be to invent a global variable MyStartTimestamp beside MyStartTime, and initialize both of them in the places that currently initialize the latter, using code like that in BackendInitialize: /* save process start time */ port->SessionStartTime = GetCurrentTimestamp(); MyStartTime = timestamptz_to_time_t(port->SessionStartTime); and then this new InitRandomSeeds function could adopt BackendRun's seed initialization method instead of the stupider way. Possibly it'd be sane to merge the MyStartTime* initializations and the srandom resets into one function, not sure. regards, tom lane
Re: DSM segment handle generation in background workers
On Tue, Oct 9, 2018 at 3:21 PM Thomas Munro wrote: > On Tue, Oct 9, 2018 at 1:53 PM Tom Lane wrote: > > Thomas Munro writes: > > > On Mon, Oct 8, 2018 at 1:17 AM Thomas Munro > > > wrote: > > >> That's because the bgworker startup path doesn't contain a call to > > >> srandom(...distinguishing stuff...), unlike BackendRun(). I suppose > > >> do_start_bgworker() could gain a similar call... or perhaps that call > > >> should be moved into InitPostmasterChild(). If we put it in there > > >> right after MyStartTime is assigned a new value, we could use the same > > >> incantation that PostmasterMain() uses. > > > > > Maybe something like this? > > > > I think the bit with > > > > #ifndef HAVE_STRONG_RANDOM > > random_seed = 0; > > random_start_time.tv_usec = 0; > > #endif > > > > should also be done in every postmaster child, no? If we want to hide the > > postmaster's state from child processes, we ought to hide it from all. > > Ok, here is a sketch patch with a new function InitRandomSeeds() to do > that. It is called from PostmasterMain(), InitPostmasterChild() and > InitStandaloneProcess() for consistency. Here's a version I propose to push to master and 11 tomorrow, if there are no objections. -- Thomas Munro http://www.enterprisedb.com 0001-Use-different-random-seeds-in-all-backends.patch Description: Binary data
Re: DSM segment handle generation in background workers
On Tue, Oct 9, 2018 at 1:53 PM Tom Lane wrote: > Thomas Munro writes: > > On Mon, Oct 8, 2018 at 1:17 AM Thomas Munro > > wrote: > >> That's because the bgworker startup path doesn't contain a call to > >> srandom(...distinguishing stuff...), unlike BackendRun(). I suppose > >> do_start_bgworker() could gain a similar call... or perhaps that call > >> should be moved into InitPostmasterChild(). If we put it in there > >> right after MyStartTime is assigned a new value, we could use the same > >> incantation that PostmasterMain() uses. > > > Maybe something like this? > > I think the bit with > > #ifndef HAVE_STRONG_RANDOM > random_seed = 0; > random_start_time.tv_usec = 0; > #endif > > should also be done in every postmaster child, no? If we want to hide the > postmaster's state from child processes, we ought to hide it from all. Ok, here is a sketch patch with a new function InitRandomSeeds() to do that. It is called from PostmasterMain(), InitPostmasterChild() and InitStandaloneProcess() for consistency. It seems a bit strange to me that internal infrastructure shares a random number generator with SQL-callable functions random() and setseed(), though I'm not saying it's harmful. While wondering if something like this should be back-patched, I noticed that SQL random() is declared as parallel-restricted, which is good: it means we aren't exposing a random() function that generates the same values in every parallel worker. So I think this is probably just a minor nuisance and should probably only be pushed to master, or at most to 11 (since Parallel Hash likes to create DSM segments in workers), unless someone can think of a more serious way this can hurt you. (Tangentially: I suppose it might be useful to have a different SQL random number function that is parallel safe, that isn't associated with a user-controllable seed, and whose seed is different in each backend.) -- Thomas Munro http://www.enterprisedb.com 0001-Make-sure-we-initialize-random-seeds-per-backend.patch Description: Binary data
Re: DSM segment handle generation in background workers
Thomas Munro writes: > On Mon, Oct 8, 2018 at 1:17 AM Thomas Munro > wrote: >> That's because the bgworker startup path doesn't contain a call to >> srandom(...distinguishing stuff...), unlike BackendRun(). I suppose >> do_start_bgworker() could gain a similar call... or perhaps that call >> should be moved into InitPostmasterChild(). If we put it in there >> right after MyStartTime is assigned a new value, we could use the same >> incantation that PostmasterMain() uses. > Maybe something like this? I think the bit with #ifndef HAVE_STRONG_RANDOM random_seed = 0; random_start_time.tv_usec = 0; #endif should also be done in every postmaster child, no? If we want to hide the postmaster's state from child processes, we ought to hide it from all. regards, tom lane
Re: DSM segment handle generation in background workers
On Mon, Oct 8, 2018 at 1:17 AM Thomas Munro wrote: > That's because the bgworker startup path doesn't contain a call to > srandom(...distinguishing stuff...), unlike BackendRun(). I suppose > do_start_bgworker() could gain a similar call... or perhaps that call > should be moved into InitPostmasterChild(). If we put it in there > right after MyStartTime is assigned a new value, we could use the same > incantation that PostmasterMain() uses. > > I noticed that the comment in PostmasterMain() refers to > PostmasterRandom(), which is gone. Maybe something like this? -- Thomas Munro http://www.enterprisedb.com move-srandom.patch Description: Binary data
DSM segment handle generation in background workers
Hello, I noticed that every parallel worker generates the same sequence of handles here: /* * Loop until we find an unused identifier for the new control segment. We * sometimes use 0 as a sentinel value indicating that no control segment * is known to exist, so avoid using that value for a real control * segment. */ for (;;) { Assert(dsm_control_address == NULL); Assert(dsm_control_mapped_size == 0); dsm_control_handle = random(); if (dsm_control_handle == DSM_HANDLE_INVALID) continue; if (dsm_impl_op(DSM_OP_CREATE, dsm_control_handle, segsize, _control_impl_private, _control_address, _control_mapped_size, ERROR)) break; } It's harmless AFAICS, but it produces sequences of syscalls like this when Parallel Hash is building the hash table: shm_open("/PostgreSQL.240477264",O_RDWR|O_CREAT|O_EXCL,0600) ERR#17 'File exists' shm_open("/PostgreSQL.638747851",O_RDWR|O_CREAT|O_EXCL,0600) ERR#17 'File exists' shm_open("/PostgreSQL.1551053007",O_RDWR|O_CREAT|O_EXCL,0600) = 5 (0x5) That's because the bgworker startup path doesn't contain a call to srandom(...distinguishing stuff...), unlike BackendRun(). I suppose do_start_bgworker() could gain a similar call... or perhaps that call should be moved into InitPostmasterChild(). If we put it in there right after MyStartTime is assigned a new value, we could use the same incantation that PostmasterMain() uses. I noticed that the comment in PostmasterMain() refers to PostmasterRandom(), which is gone. -- Thomas Munro http://www.enterprisedb.com