Re: [GENERAL] Table with seemingly duplicated primary key values
Please. Always, ALWAYS, give the PostgreSQL version and O/S when reporting a problem.

First, WHAT IS THE POSTGRESQL VERSION?
       WHAT IS THE O/S?

Then try this:

select a.ctid, a.id, a.field1,
       b.ctid, b.id, b.field1
  from some_table a,
       some_table b
 WHERE a.ctid <> b.ctid
   AND a.id = b.id;

On Tue, Dec 22, 2015 at 4:03 AM, Aleksander Łukasz wrote:
> Hi,
>
> a table in our database with about 3 million rows ended up in a state
> where it seems to have duplicated entries (duplicated primary key values):
> ...
> Do you have any idea what could be happening and what measures should be
> undertaken to fix this issue? Thanks.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you wish to share my
fantasy is entirely up to you.
[GENERAL] Table with seemingly duplicated primary key values
Hi,

a table in our database with about 3 million rows ended up in a state where it seems to have duplicated entries (duplicated primary key values):

# \d some_table;
                                Table "public.some_table"
 Column |            Type             |                       Modifiers
--------+-----------------------------+---------------------------------------------------------
 id     | integer                     | not null default nextval('some_table_id_seq'::regclass)
 field1 | character varying(40)       |
 field2 | character varying(128)      |
 ts     | timestamp without time zone |
Indexes:
    "some_table_pkey" PRIMARY KEY, btree (id)
    "ix_some_table_field1" btree (field1)
    "ix_some_table_field2" btree (field2)
    "ix_some_table_ts" btree (ts)

# select id, field1, field2 from some_table where field1 is null and field2 is not null;
   id    | field1 |   field2
---------+--------+-------------
 2141750 |        | some_value2
(1 row)

# select id, field1, field2 from some_table where id = 2141750;
   id    |   field1    |   field2
---------+-------------+-------------
 2141750 | some_value1 | some_value2
(1 row)

Another way this manifests itself is that running this:

# update some_table
     set field2 = field1
   where
     id = 2141750;

works perfectly fine (but doesn't change the result of the first two queries above), but this results in an error:

# update some_table
     set field2 = field1
   where
     field1 is not null
     and field2 is null
     and ts between '2015-12-01' and '2015-12-02';

ERROR:  duplicate key value violates unique constraint "some_table_pkey"
DETAIL:  Key (id)=(2141750) already exists.

Do you have any idea what could be happening and what measures should be undertaken to fix this issue? Thanks.
Re: [GENERAL] Table with seemingly duplicated primary key values
Aleksander Łukasz writes:
> a table in our database with about 3 million rows ended up in a state
> where it seems to have duplicated entries (duplicated primary key values):
> ...
> Do you have any idea what could be happening and what measures should be
> undertaken to fix this issue? Thanks.

Looks like corruption of the primary key index. Try REINDEXing that index. This might fail if there actually are duplicate rows in the table and not just incorrect pointers to them in the index, in which case you'll have to clean up the duplicates somehow and try again till the REINDEX succeeds.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Session Identifiers
Thanks Michael, you are right, that is a very good alternative solution.

Oleg

On Tue, Dec 22, 2015 at 6:27 AM, Michael Paquier wrote:
> On Tue, Dec 22, 2015 at 1:42 AM, Stephen Frost wrote:
> > Oleg,
> >
> > * oleg yusim (olegyu...@gmail.com) wrote:
> >> tcp_keepalives_idle = 900
> >> tcp_keepalives_interval=0
> >> tcp_keepalives_count=0
> >>
> >> Doesn't terminate connection to database in 15 minutes of inactivity of
> >> psql prompt. So, it looks like that would work only for the case where the
> >> network connection is broken and the session is left hanging. For the psql
> >> prompt case it looks like pg_terminate_backend() would be the only solution.
> >
> > Those settings aren't for controlling idle timeout of a connection.
> >
> > pg_terminate_backend() will work and could be run out of a cronjob.
>
> Or a background worker if you are using PG >= 9.3:
> https://github.com/michaelpq/pg_plugins/tree/master/kill_idle
> This has the advantage of not having the cronjob error out should the
> server be stopped. That's less error handling to take care of at the
> frontend level.
> --
> Michael
Re: [GENERAL] Table with seemingly duplicated primary key values
Hi, thanks for your reply.

2015-12-22 16:34 GMT+01:00 Melvin Davidson:
> Please. Always, ALWAYS, give the PostgreSQL version and O/S when reporting
> a problem.
>
> First, WHAT IS THE POSTGRESQL VERSION?
>        WHAT IS THE O/S?

# select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
(1 row)

> Then try this:
> select a.ctid, a.id, a.field1,
>        b.ctid, b.id, b.field1
> from some_table a,
>      some_table b
> WHERE a.ctid <> b.ctid
>   AND a.id = b.id;

    ctid    |   id    |   field1    |    ctid    |   id    |   field1
------------+---------+-------------+------------+---------+-------------
 (79664,59) | 2141750 |             | (79772,23) | 2141750 | some_value1
 (79772,23) | 2141750 | some_value1 | (79664,59) | 2141750 |
(2 rows)
[GENERAL] Shared system resources
Greetings,

I'm looking at the following security control right now:

*The DBMS must prevent unauthorized and unintended information transfer via shared system resources.*

The purpose of this control is to prevent information, including encrypted representations of information, produced by the actions of a prior user/role (or the actions of a process acting on behalf of a prior user/role) from being available to any current user/role (or current process) that obtains access to a shared system resource (e.g., registers, main memory, secondary storage) after the resource has been released back to the information system. Control of information in shared resources is also referred to as object reuse.

From previous discussions, I understood that a session in PostgreSQL is closely associated with a process, and that essentially a new process is created for every user connection. In regards to that, my question would be:

When a user disconnects, the process is terminated and all resources are released. Does the memory where session/process information was stored get "sanitized", or is it just released as is?

Thanks,

Oleg
Re: [GENERAL] Table with seemingly duplicated primary key values
On 12/22/15 1:31 PM, Melvin Davidson wrote:
> The fact that you have rows with an identical id of 2141750 verifies a
> corrupted primary index. To correct it, you need to decide which row to keep.
>
> So review the results of
> SELECT * FROM some_table WHERE ctid = '(79664,59)' OR ctid = '(79772,23)';
>
> DELETE FROM some_table
> WHERE id = 2141750
>   AND field1 = {value for field1 of row to delete}
>   AND field2 = {value for field2 of row to delete}
>   ...
>   AND fieldn = {value for fieldn of row to delete};
>
> Then
> REINDEX TABLE some_table;

And most importantly: review your logs for hardware errors and your Postgres and filesystem settings for anything dangerous. Index corruption is not normal and indicates the underlying hardware or OS is faulty (or maybe a bug in Postgres, but that's very unlikely).

You should also consider turning on page checksums if you haven't already.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Shared system resources
On 12/22/15 6:03 PM, oleg yusim wrote:
> Absolutely. But we are not talking about that type of data leakage here.
> We are talking about a potential situation when a user who doesn't have
> access to the database, but has (or gained) access to the Linux box the DB
> is installed on, gets his hands on data the database processes stored in
> memory (memory would be the common resource here).

Of far larger concern at that point is unauthorized access to the database files.

Basically, if someone gains access to the OS user that Postgres is running as, or to root, it's game over.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Shared system resources
Jim,

Yes, you are right. Generally the security control here is encryption of data at rest (TDE), but PostgreSQL doesn't support it, to my knowledge. I know about that vulnerability, but here I posed the question about a different one. I agree it is a smaller one compared to the absence of TDE, but I would like to find out if these gates are open too or not.

Thanks,

Oleg

On Tue, Dec 22, 2015 at 8:48 PM, Jim Nasby wrote:
> On 12/22/15 6:03 PM, oleg yusim wrote:
>> Absolutely. But we are not talking about that type of data leakage here.
>> We are talking about a potential situation when a user who doesn't have
>> access to the database, but has (or gained) access to the Linux box the DB
>> is installed on, gets his hands on data the database processes stored in
>> memory (memory would be the common resource here).
>
> Of far larger concern at that point is unauthorized access to the database
> files.
>
> Basically, if someone gains access to the OS user that Postgres is running
> as, or to root, it's game over.
Re: [GENERAL] Test disk reliability (or HGST HTS721010A9E630 surprisingly reliable)
On 12/21/15 8:22 AM, Bill Moran wrote:
>> I was thinking that since the disk has a 32M write-cache (with no
>> battery) it would lie to the OS (and postgres) about when data are really
>> on disk (not in the disk write cache). But maybe that thinking was wrong.
>
> Why? Just because a disk isn't enterprise-grade doesn't mean it has to lie
> about fsync, which is the only thing diskchecker.pl tests for.

There are ways to make on-disk write caches safe without a battery. IIRC some hard drives would use the inertia of the platter (turning the motor into a generator) to write contents out on power-off. You could also use a "super cap". It varies by vendor and product, which is why diskchecker.pl exists.

It's even possible that the behavior is configurable... check to see if the vendor provides a utility for configuring it. Your OS might let you control it too; I know FreeBSD has support for this. (Whether the drive obeys or not is a different matter...)
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Shared system resources
John,

Absolutely. But we are not talking about that type of data leakage here. We are talking about a potential situation when a user who doesn't have access to the database, but has (or gained) access to the Linux box the DB is installed on, gets his hands on data the database processes stored in memory (memory would be the common resource here).

Thanks,

Oleg

On Tue, Dec 22, 2015 at 5:28 PM, John R Pierce wrote:
> On 12/22/2015 2:52 PM, oleg yusim wrote:
>> *The DBMS must prevent unauthorized and unintended information transfer
>> via shared system resources.*
>
> you realize the database *itself* is a shared system resource, and of
> and by itself has no idea about unauthorized/unintended information
> transfer, e.g., any user with the proper credentials to connect to the
> database can query any tables those credentials are allowed to?
>
> --
> john r pierce, recycling bits in santa cruz
Re: [GENERAL] Table with seemingly duplicated primary key values
BTW, Jim is referring to the O/S logs for hardware errors, not the PostgreSQL logs.

Also, another way of deleting the bad row would be

DELETE FROM some_table WHERE ctid = '(79664,59)';
or
DELETE FROM some_table WHERE ctid = '(79772,23)';

On Tue, Dec 22, 2015 at 9:44 PM, Jim Nasby wrote:
> On 12/22/15 1:31 PM, Melvin Davidson wrote:
>> The fact that you have rows with an identical id of 2141750 verifies a
>> corrupted primary index. To correct it, you need to decide which row to keep.
>>
>> So review the results of
>> SELECT * FROM some_table WHERE ctid = '(79664,59)' OR ctid = '(79772,23)';
>>
>> DELETE FROM some_table
>> WHERE id = 2141750
>>   AND field1 = {value for field1 of row to delete}
>>   AND field2 = {value for field2 of row to delete}
>>   ...
>>   AND fieldn = {value for fieldn of row to delete};
>>
>> Then
>> REINDEX TABLE some_table;
>
> And most importantly: review your logs for hardware errors and your
> Postgres and filesystem settings for anything dangerous. Index corruption
> is not normal and indicates the underlying hardware or OS is faulty (or
> maybe a bug in Postgres, but that's very unlikely).
>
> You should also consider turning on page checksums if you haven't already.

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you wish to share my
fantasy is entirely up to you.
Re: [GENERAL] Table with seemingly duplicated primary key values
The fact that you have rows with an identical id of 2141750 verifies a corrupted primary index. To correct it, you need to decide which row to keep.

So review the results of

SELECT * FROM some_table WHERE ctid = '(79664,59)' OR ctid = '(79772,23)';

DELETE FROM some_table
 WHERE id = 2141750
   AND field1 = {value for field1 of row to delete}
   AND field2 = {value for field2 of row to delete}
   ...
   AND fieldn = {value for fieldn of row to delete};

Then

REINDEX TABLE some_table;
Re: [GENERAL] Shared system resources
On Tue, Dec 22, 2015 at 04:52:23PM -0600, oleg yusim wrote:
> Greetings,
>
> I'm looking at the following security control right now:
>
> The DBMS must prevent unauthorized and unintended information transfer via
> shared system resources.
>
> ...
>
> When a user disconnects, the process is terminated and all resources are
> released. Does the memory where session/process information was stored get
> "sanitized", or is it just released as is?

On Linux the memory pages of an exiting process aren't sanitized at exit; however, it is impossible(?) for userspace to reallocate them without the kernel first zeroing their contents. It might be possible for root to use some debugging mechanisms to access the freed memory, but certainly there is no mechanism for a non-root user to do so.

Regarding PG in particular, I can't speak for any shared internal state that might be maintained after a session has exited (e.g. in the SHM regions PG maintains). Since that state lives longer than an individual process, it's possible some information leakage could occur that way, but as "object reuse" in the sense of this control, it seems doubtful.
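David's point about the kernel zeroing pages before userspace can reuse them can be illustrated directly: anonymous memory handed out by the kernel always arrives zero-filled, no matter what an earlier process left behind. A minimal sketch using Python's standard-library `mmap` module (illustrative only, not part of the original exchange):

```python
import mmap

# Map one page of anonymous memory straight from the kernel (-1 = no file).
# Freshly allocated anonymous pages are guaranteed to be zero-filled, so
# nothing written by a previously exited process can leak through here.
PAGE = 4096
m = mmap.mmap(-1, PAGE)
assert m[:PAGE] == b"\x00" * PAGE

# Dirty the page, release it, then map a new one: it too arrives zeroed.
m[:5] = b"hello"
m.close()
m2 = mmap.mmap(-1, PAGE)
assert m2[:PAGE] == b"\x00" * PAGE
m2.close()
print("anonymous pages arrive zero-filled")
```

Of course this only demonstrates the userspace-visible guarantee; it says nothing about what root could recover with debugging interfaces, which is David's caveat.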
David
Re: [GENERAL] Shared system resources
On 12/22/2015 2:52 PM, oleg yusim wrote:
> The DBMS must prevent unauthorized and unintended information transfer
> via shared system resources.

you realize the database *itself* is a shared system resource, and of and by itself has no idea about unauthorized/unintended information transfer, e.g., any user with the proper credentials to connect to the database can query any tables those credentials are allowed to?

--
john r pierce, recycling bits in santa cruz
Re: [GENERAL] Secret Santa List
On 23 December 2015 at 16:49, Lou Duchez wrote:
> I have a company with four employees who participate in a Secret Santa
> program, where each buys a gift for an employee chosen at random. (For
> now, I do not mind if an employee ends up buying a gift for himself.) How
> can I make this work with an SQL statement?
>
> Here is my Secret Santa table:
>
> create table secretsanta
> (giver text,
>  recipient text,
>  primary key (giver));
>
> insert into secretsanta (giver) values ('Frank'), ('Joe'), ('Steve'), ('Earl');
>
> Here is the SQL statement I am using to populate the "recipient" column:
>
> update secretsanta set recipient =
>   ( select giver from secretsanta s2 where not exists (select * from
>     secretsanta s3 where s3.recipient = s2.giver) order by random() limit 1 );
>
> The problem: every time I run this, a single name is chosen at random and
> used to populate all the rows. So all four rows will get a recipient of
> "Steve" or "Earl" or whatever single name is chosen at random.

Why not generate the required results in a SELECT, then update from that? row_number() could be used to assign a random number to each giver; then we can generate another random number and join to each random number. That'll give you a giver and recipient combination.
e.g.:

select giver, recipient
from (select row_number() over (order by random()) rn, giver
      from secretsanta) g
inner join (select row_number() over (order by random()) rn, giver recipient
            from secretsanta) r on g.rn = r.rn;

You can then wrap that up in a CTE, something along the lines of:

with cte (giver, recipient) as (
  select giver, recipient
  from (select row_number() over (order by random()) rn, giver
        from secretsanta) g
  inner join (select row_number() over (order by random()) rn, giver recipient
              from secretsanta) r on g.rn = r.rn
)
update secretsanta
set recipient = cte.recipient
from cte
WHERE cte.giver = secretsanta.giver;

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
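The idea behind the row_number()/random() join is simply to pair the list of givers with a shuffled copy of itself. A minimal Python sketch of the same logic (illustrative only, not part of the thread):

```python
import random

givers = ['Frank', 'Joe', 'Steve', 'Earl']

# Shuffle a copy of the giver list; zipping the two lists together is the
# moral equivalent of joining two row_number()-over-random() subqueries.
recipients = givers[:]
random.shuffle(recipients)
assignment = dict(zip(givers, recipients))

# Every giver gets exactly one recipient, and every participant is a
# recipient exactly once. Self-gifting is still possible, which matches
# the original poster's "for now, I do not mind" caveat.
assert set(assignment) == set(givers)
assert sorted(assignment.values()) == sorted(givers)
print(assignment)
```

If self-gifting ever needs to be excluded, the shuffle would have to be rejected and retried (or replaced with a proper derangement), in SQL and Python alike.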
[GENERAL] Secret Santa List
I have a company with four employees who participate in a Secret Santa program, where each buys a gift for an employee chosen at random. (For now, I do not mind if an employee ends up buying a gift for himself.) How can I make this work with an SQL statement?

Here is my Secret Santa table:

--
create table secretsanta
(giver text,
 recipient text,
 primary key (giver));

insert into secretsanta (giver) values ('Frank'), ('Joe'), ('Steve'), ('Earl');
--

Here is the SQL statement I am using to populate the "recipient" column:

--
update secretsanta set recipient =
  ( select giver from secretsanta s2 where not exists (select * from
    secretsanta s3 where s3.recipient = s2.giver) order by random() limit 1 );
--

The problem: every time I run this, a single name is chosen at random and used to populate all the rows. So all four rows will get a recipient of "Steve" or "Earl" or whatever single name is chosen at random.

I suppose the problem is that the "exists" subquery does not re-evaluate for each record. How do I prevent this from happening? Can I use a "lateral" join of some kind, or somehow tell PostgreSQL to not be so optimized?
Re: [GENERAL] BDR error while adding 3rd node to cluster
Hi Craig,

I removed all the bdr packages, reinstalled them, and set up the BDR cluster again; I'm still facing the same issue on the 3rd node.

On Tue, Dec 22, 2015 at 10:58 AM, Amit Bondwal wrote:
> On Tue, Dec 22, 2015 at 10:05 AM, Craig Ringer wrote:
>> select * from bdr.bdr_connections;
>
> Hi Craig, thanks for your reply. These commands show nothing on the 3rd node.
>
> *on node3:*
> hakuna=# select * from bdr.bdr_nodes;
>  node_sysid | node_timeline | node_dboid | node_status | node_name | node_local_dsn | node_init_from_dsn
> ------------+---------------+------------+-------------+-----------+----------------+--------------------
> (0 rows)
>
>  conn_sysid | conn_timeline | conn_dboid | conn_origin_sysid | conn_origin_timeline | conn_origin_dboid | conn_is_unidirectional | conn_dsn | conn_apply_delay | conn_replication_sets
> ------------+---------------+------------+-------------------+----------------------+-------------------+------------------------+----------+------------------+-----------------------
> (0 rows)
>
> *on node1:*
>
> hakuna=# select * from bdr.bdr_nodes;
>       node_sysid       | node_timeline | node_dboid | node_status | node_name |          node_local_dsn          |        node_init_from_dsn
> -----------------------+---------------+------------+-------------+-----------+----------------------------------+----------------------------------
>  6229651184159874988   |             1 |      18719 | r           | node1     | host=pg1 port=5432 dbname=hakuna |
>  6229651217067355961   |             1 |      17161 | r           | node2     | host=pg2 port=5432 dbname=hakuna | host=pg1 port=5432 dbname=hakuna
> (2 rows)
>
> It is conflicting with the ID of node 2, as shown in the output of the above command on node 1.
>
> My OS is Debian Jessie and I installed it from the bdr repo. The current version of bdr is 0.9.3.
Re: [GENERAL] Session Identifiers
On Tue, Dec 22, 2015 at 1:42 AM, Stephen Frost wrote:
> Oleg,
>
> * oleg yusim (olegyu...@gmail.com) wrote:
>> tcp_keepalives_idle = 900
>> tcp_keepalives_interval=0
>> tcp_keepalives_count=0
>>
>> Doesn't terminate connection to database in 15 minutes of inactivity of
>> psql prompt. So, it looks like that would work only for the case where the
>> network connection is broken and the session is left hanging. For the psql
>> prompt case it looks like pg_terminate_backend() would be the only solution.
>
> Those settings aren't for controlling idle timeout of a connection.
>
> pg_terminate_backend() will work and could be run out of a cronjob.

Or a background worker if you are using PG >= 9.3:
https://github.com/michaelpq/pg_plugins/tree/master/kill_idle
This has the advantage of not having the cronjob error out should the server be stopped. That's less error handling to take care of at the frontend level.
--
Michael
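For the cronjob route, a sketch of the kind of termination query that could be run periodically (an assumption on my part, not a query from the thread; it presumes PostgreSQL 9.2 or later, where pg_stat_activity exposes the `pid`, `state`, and `state_change` columns, and uses the 15-minute window discussed above):

```python
# The SQL below is what a cron entry could feed to `psql -c` every few
# minutes (connection details omitted; they depend on your setup).
TERMINATE_IDLE_SQL = """
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND state_change < now() - interval '15 minutes'
  AND pid <> pg_backend_pid();  -- don't kill the session running this
"""

print(TERMINATE_IDLE_SQL.strip())
```

Note this only covers sessions that are idle at the top level; `idle in transaction` sessions would need their own policy.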
Re: [GENERAL] uuid-ossp: Performance considerations for different UUID approaches?
On Tue, 22 Dec 2015 11:07:30 -0600, Brendan McCollam wrote:
> (previously posted to the pgsql-performance list)
>
> Hello,
>
> We're in the process of designing the database for a new service, and
> some of our tables are going to be using UUID primary key columns.
>
> We're trying to decide between:
>
> * UUIDv1 (timestamp/MAC uuid) and
> * UUIDv4 (random uuid)
>
> And the separate but related choice between:
>
> * Generating the UUIDs client-side with the Python uuid library
>   (https://docs.python.org/2/library/uuid.html) or
> * Letting PostgreSQL handle uuid creation with the uuid-ossp extension
>   (http://www.postgresql.org/docs/9.4/static/uuid-ossp.html)
>
> In terms of insert and indexing/retrieval performance, is there one
> clearly superior approach? If not, could somebody speak to the
> performance tradeoffs of different approaches?
>
> There seem to be sources online (e.g.
> https://blog.starkandwayne.com/2015/05/23/uuid-primary-keys-in-postgresql/
> http://rob.conery.io/2014/05/29/a-better-id-generator-for-postgresql/)
> that claim that UUIDv4 (random) will lead to damaging keyspace
> fragmentation and using UUIDv1 will avoid this.

There's no substance to these claims. Chasing the links around we finally find this article:

http://www.sqlskills.com/blogs/kimberly/guids-as-primary-keys-andor-the-clustering-key/

which makes the reasonable argument that random primary keys can cause performance-robbing fragmentation on clustered indexes. But Postgres doesn't _have_ clustered indexes, so that article doesn't apply at all. The other authors appear to have missed this important point.

One could make the argument that the index itself becoming fragmented could cause some performance degradation, but I've yet to see any convincing evidence that index fragmentation produces any measurable performance issues (my own experiments have been inconclusive).
Looking at it another way, a quick experiment shows that PG can fit about 180 UUID primary keys per database page, which means a million-row table will use about 5600 pages, to the tune of about 46 MB. On modern hardware, that index is likely to be wholly in memory all the time.

If your performance requirements are really so dire, then you should probably consider ditching UUIDs as keys. Taking the same million-row table I postulated in the previous paragraph, but using ints instead of UUIDs for the primary key, the primary key index would be about 3200 pages (~26 MB) ... or almost half the size, making it more likely to all be in memory at any point in time.

I seriously doubt that trying to make your UUIDs generate in a predictable fashion will produce any measurable improvement, and I see no evidence in the articles you cited that claims otherwise have any real basis or were made by anyone knowledgeable enough to know.

--
Bill Moran
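Bill's back-of-the-envelope index sizing can be checked with a few lines of arithmetic, assuming the stock 8 kB PostgreSQL page and the densities quoted in his post (a sketch of his reasoning, not a measurement):

```python
# Rough index-size arithmetic from the post above (default 8 kB page size).
PAGE_BYTES = 8192
ROWS = 1_000_000

uuid_keys_per_page = 180                       # density observed in the post
uuid_pages = -(-ROWS // uuid_keys_per_page)    # ceiling division
uuid_mb = uuid_pages * PAGE_BYTES / 1e6

int_pages = 3200                               # figure quoted for an int4 key
int_mb = int_pages * PAGE_BYTES / 1e6

print(f"uuid pk index: ~{uuid_pages} pages, ~{uuid_mb:.0f} MB")
print(f"int  pk index: ~{int_pages} pages, ~{int_mb:.0f} MB")
```

This reproduces the roughly 46 MB vs. 26 MB comparison: the int index comes out close to half the size, which is the basis for the "more likely to stay in memory" argument.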
[GENERAL] uuid-ossp: Performance considerations for different UUID approaches?
(previously posted to the pgsql-performance list)

Hello,

We're in the process of designing the database for a new service, and some of our tables are going to be using UUID primary key columns.

We're trying to decide between:

* UUIDv1 (timestamp/MAC uuid) and
* UUIDv4 (random uuid)

And the separate but related choice between:

* Generating the UUIDs client-side with the Python uuid library
  (https://docs.python.org/2/library/uuid.html) or
* Letting PostgreSQL handle uuid creation with the uuid-ossp extension
  (http://www.postgresql.org/docs/9.4/static/uuid-ossp.html)

In terms of insert and indexing/retrieval performance, is there one clearly superior approach? If not, could somebody speak to the performance tradeoffs of different approaches?

There seem to be sources online (e.g.
https://blog.starkandwayne.com/2015/05/23/uuid-primary-keys-in-postgresql/
http://rob.conery.io/2014/05/29/a-better-id-generator-for-postgresql/)
that claim that UUIDv4 (random) will lead to damaging keyspace fragmentation and using UUIDv1 will avoid this. Is that true? If so, does that mean UUIDv1 must always be generated on the same machine (with the same MAC address) in order to benefit from the better clustering of UUIDs?

What about uuid_generate_v1mc() in uuid-ossp? Not exposing a server's real MAC address seems like a good security feature, but does it compromise the clustering properties of UUIDv1?

Thanks in advance,
Brendan McCollam
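For reference, the client-side generation mentioned in the question is a one-liner with the standard-library uuid module. A quick sketch (illustrative, not from the thread) showing the practical difference between the two versions — v1 embeds a 60-bit timestamp (100 ns ticks since 1582-10-15) plus a node identifier, while v4 is random apart from its version/variant bits:

```python
import uuid

v1s = [uuid.uuid1() for _ in range(100)]
v4s = [uuid.uuid4() for _ in range(100)]

# The version field identifies the scheme.
assert all(u.version == 1 for u in v1s)
assert all(u.version == 4 for u in v4s)

# v1 UUIDs carry an extractable timestamp; generated in sequence on one
# machine, those timestamps are non-decreasing.
assert [u.time for u in v1s] == sorted(u.time for u in v1s)

# Caveat relevant to the "clustering" question: the low 32 bits of the
# timestamp come *first* in the canonical text/byte form, so even v1 UUIDs
# do not sort lexicographically by creation time across the roughly
# 7-minute rollover of that field.
print(v1s[0], v4s[0])
```

Note uuid1() will expose the host MAC address unless a node argument is supplied; uuid-ossp's uuid_generate_v1mc() avoids that by using a random multicast node, which does not change the timestamp layout at all.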