On Tue, Nov 8, 2016 at 2:58 PM, Craig Ringer <craig.rin...@2ndquadrant.com> wrote:
> 2ndQuadrant/bdr

That is similar. I'm not clear on the use of the OID for the sequence (`DirectFunctionCall1(nextval_oid, seqoid)`): does that imply a lock around sequence generation?

Another difference is that your sequence doesn't reset on a time basis; it ascends and wraps independently of the time.

Also, you appear to take the modulo against the maximum (2^n - 1) rather than the cardinality (2^n). Should that be an & instead? For example, with SEQUENCE_BITS of 1, MAX_SEQ_ID is ((1<<1)-1) = 1, so (seq % 1) = {0}, not {0,1} as expected, whereas (seq & 1) = {0,1} as expected.

We tried 64-bit values for ids (based on Twitter's Snowflake), but found that time replay would cause collisions. An admin corrected the time on one of our servers, moving it backwards, and duplicate ids were generated. That led to a fun day of debugging and a hard lesson about our assumption that time always increases. Using node_id doesn't protect against this, since the same node creates both the original ids and the colliding ones.

By extending the ids to include a significant amount of randomness, and by latching onto the last seen time so that the time value can only move backwards after a restart of the db, we narrow the window for collisions to close enough to zero that winning the lottery is far more likely (http://preshing.com/20110504/hash-collision-probabilities/ has the exact math). We also push the time scale for id wraparound long past the likely life expectancy of the software we're building today.

--
Clifford Hammerschmidt, P.Eng.
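
To make the two technical points above concrete, here is a minimal Python sketch. It is not the BDR code and not any production generator: the function names, the class name, and the id layout (millisecond timestamp in the high bits, 80 random bits in the low bits) are all assumptions chosen for illustration. The first part contrasts wrapping a counter with % against (2^n - 1) versus masking with &; the second part shows one possible way to latch onto the last seen time so that a backwards clock step does not lower the timestamp portion of new ids.

```python
import os
import time


def wrap_modulo(seq: int, bits: int) -> int:
    """Wrap using the remainder against the maximum value (2^bits - 1).

    This yields only 2^bits - 1 distinct values; the maximum itself
    is never produced.
    """
    max_seq_id = (1 << bits) - 1
    return seq % max_seq_id


def wrap_mask(seq: int, bits: int) -> int:
    """Wrap by masking with (2^bits - 1), which keeps all 2^bits values."""
    max_seq_id = (1 << bits) - 1
    return seq & max_seq_id


class LatchedIdGenerator:
    """Hypothetical id generator: latched millisecond timestamp + random bits.

    Not thread-safe; a real generator would need locking around next_id().
    """

    def __init__(self, clock=time.time, random_bits: int = 80):
        self._clock = clock            # injectable so tests can fake the clock
        self._random_bits = random_bits
        self._last_ms = 0              # latch is lost on restart, as in the text

    def next_id(self) -> int:
        now_ms = int(self._clock() * 1000)
        # Latch: never use a timestamp older than the newest one seen so far.
        self._last_ms = max(self._last_ms, now_ms)
        n_bytes = (self._random_bits + 7) // 8
        rand = int.from_bytes(os.urandom(n_bytes), "big")
        rand &= (1 << self._random_bits) - 1
        # Timestamp in the high bits, randomness in the low bits.
        return (self._last_ms << self._random_bits) | rand


if __name__ == "__main__":
    print(sorted({wrap_modulo(s, 1) for s in range(4)}))
    print(sorted({wrap_mask(s, 1) for s in range(4)}))
    print(hex(LatchedIdGenerator().next_id()))
```

With the latch in place, ids issued while the clock is behind the last seen time share that latched timestamp, so uniqueness during that window rests entirely on the random bits. That is why the width of the random field matters for the collision odds.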