Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread David Kerr
On Tue, Aug 25, 2015 at 10:16:37AM PDT, Andomar wrote:
 However, I know from experience that's not entirely true (although it's not 
 always easy to measure all aspects of your I/O bandwidth).
 
 Am I missing something?
 
 Two things I can think of:
 
 Transaction writes are entirely sequential.  If you have disks
 assigned for just this purpose, then the heads will always be in the
 right spot, and the writes go through more quickly.
 
 A database server process waits until the transaction logs are
 written and then returns control to the client. The data writes can
 be done in the background while the client goes on to do other
 things.  Splitting up data and logs means that there is less chance
 the disk controller will cause data writes to interfere with log
 files.
 
 Kind regards,
 Andomar
 

hmm, yeah those are both what I'd lump into I/O bandwidth. 
If your disk subsystem is fast enough, or you're on a RAIDed SAN 
or EBS, you'd either overcome that, or not necessarily be able to.





Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread Bill Moran
On Tue, 25 Aug 2015 10:08:48 -0700
David Kerr d...@mr-paradox.net wrote:

 Howdy All,
 
 For a very long time I've held the belief that splitting PGDATA and xlog on 
 linux systems fairly universally gives a decent performance benefit for many 
 common workloads.
 (I've seen up to 20% personally).
 
 I was under the impression that this had to do with regular fsync()'s from 
 the WAL 
 interfering with and over-reaching the writing out of the filesystem buffers. 
 
 Basically, I think I was conflating fsync() with sync(). 
 
 So if it's not that, then that just leaves bandwidth (ignoring all of the 
 other best practice reasons for reliability, etc.). So, in theory if you're 
 not swamping your disk I/O then you won't really benefit from relocating your 
 XLOGs.

Disk performance can be a bit more complicated than just swamping. Even if
you're not maxing out the IO bandwidth, you could be getting enough that some
writes are waiting on other writes before they can be processed. Consider the
fact that old-style ethernet was only able to hit ~80% of its theoretical
capacity in the real world, because the chance of collisions increased with
the amount of data, and each collision slowed down the overall transfer speed.
Contrasted with modern ethernet that doesn't do collisions, you can get much
closer to 100% of the rated bandwidth because the communications are effectively
partitioned from each other.

In the worst-case scenario, if two processes (due to horrible luck) _always_
try to write at the same time, the overall responsiveness will be lousy, even
if the bandwidth usage is only a small percentage of what's available. Of course,
that worst case doesn't happen in actual practice, but as the usage goes up,
the chance of hitting that interference increases, and the effective response
goes down, even when there's bandwidth still available.

Separate the competing processes, and the chance of conflict is 0. So your
responsiveness is pretty much at best-case all the time.

 However, I know from experience that's not entirely true (although it's not 
 always easy to measure all aspects of your I/O bandwidth).
 
 Am I missing something?

-- 
Bill Moran




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Berend Tober

Melvin Davidson wrote:

9.
I would not consider the general use of natural primary keys to be
best practice.
Let's assume your account_id field is used as a foreign key in a
dozen other tables.
1) What happens if someone mis-types the account-id?
  To correct that, you also need to correct the FK field in the
other dozen tables.
2) What happens when your company starts a new project (or buys a
competitor) and all the new account numbers are alpha-numeric?



... ON UPDATE CASCADE ?
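
A minimal sketch of that approach (accounts/invoices are hypothetical tables):
with ON UPDATE CASCADE on the foreign key, correcting a mis-typed natural key
in the parent rewrites every referencing row in one statement.

CREATE TABLE accounts (
    account_no varchar(20) PRIMARY KEY
);

CREATE TABLE invoices (
    invoice_no serial PRIMARY KEY,
    account_no varchar(20) NOT NULL
        REFERENCES accounts (account_no)
        ON UPDATE CASCADE
);

-- Fixing the typo once in accounts propagates to invoices.account_no.
UPDATE accounts SET account_no = 'ACME-001' WHERE account_no = 'AMCE-001';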





Re: [GENERAL] [pgsql-general] Daily digest v1.13732 (15 messages)

2015-08-25 Thread Marc Munro
On Tue, 2015-08-25 at 15:41 +,  Neil Tiffin ne...@neiltiffin.com
wrote:

 I really like the standardization that PostgreSQL uses in auto
 generating default names.  The rule I use is to always use the auto
 generated names unless the object is referenced routinely in code. In
 most cases developers don’t care about index, unique, foreign key, or
 primary key names (from a coding standpoint) so why should they be
 creating the names. Since the postgresql standard uses auto generated
 names with ‘_pkey’ for PRIMARY KEY  ‘_fkey’ for FOREIGN KEY, and
 ‘_key’ for UNIQUE, why not use the same rules for consistency?  So I
 disagree with 6 and would extend 10 to include these other names if
 they are manually generated.

I prefer to take control of names in order to be certain that on
multiple database instances, the names will *always* be the same.  This
allows schema-diff tools (like my own skit) to provide more useful
results.  Although, as you point out, Postgres does a pretty good job of
naming things, there are (or at least have been in the past) cases where
names have not been predictable.

Furthermore, a policy of explicit naming seems to me a relatively light
burden on a developer or DBA, and one that may even lead to more thought
being applied during database object design.  If the developer has to
think of a name, they may be more inclined to think more deeply about
the purpose of that object.

For the record, I favour using a double underscore to separate the
table_name part of constraints, etc from any other parts of the name.
So:

account__name_idx would be an index on the name field of the
account table;

account_name__pk would be a primary key on the account_name
table.
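
For illustration, a minimal sketch (hypothetical account table) of creating
those names explicitly instead of accepting the generated defaults:

CREATE TABLE account (
    account_id bigint NOT NULL,
    name       text   NOT NULL,
    -- explicit constraint name, following the table__suffix convention above
    CONSTRAINT account__pk PRIMARY KEY (account_id)
);

-- explicit index name rather than the default account_name_idx
CREATE INDEX account__name_idx ON account (name);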

It's a personal preference and works for me, your mileage may vary.

__
Marc





Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Melvin Davidson
Adrian,

Stop being so technical. When we/I speak of natural keys, we are talking
about the column
that would NATURALly lend itself as the primary key.
No one ever said a number is not natural, just that there is no need to
duplicate uniqueness
with a separate number.

IOW: If we have an account table, then the account_id or account_no
 would be the primary key. There is no need to have a separate serial
id as the primary key.

 Likewise, if we have a car table, then registration (or vehicle_id) is
preferred.

 EG: Good
 CREATE TABLE car
 (
  registration_no varchar(30) not null,
  car_make   varchar(25) not null,
  model  varchar(15) not null,
  build_year date not null,
  owner  varchar(50),
  CONSTRAINT car_pk PRIMARY KEY (registration_no)
  );

 bad
 CREATE TABLE car
 (
  id  serial not null,
  registration_no varchar(30) not null,
  car_make   varchar(25) not null,
  model  varchar(15) not null,
  build_year date not null,
  owner  varchar(50),
  CONSTRAINT car_pk PRIMARY KEY (id)
  );

The benefit in avoiding arbitrary and simple values for the key is that it
makes
the database design much more logical.

Consider:
SELECT c.registration_no,
   c.car_make,
   p.part_no
   FROM car c
   JOIN parts p ON ( p.registration_no = c.registration_no)
 WHERE registration_no = some_var;

 versus:
 SELECT c.registration_no,
   c.car_make,
   p.part_no
   FROM car c
   JOIN parts p ON ( p.id = c.id)
 WHERE registration_no = some_var;

 Why join on id when registration_no is better?


On Tue, Aug 25, 2015 at 10:17 AM, Adrian Klaver adrian.kla...@aklaver.com
wrote:

 On 08/24/2015 08:44 PM, Rob Sargent wrote:


 On Aug 24, 2015, at 6:53 PM, Melvin Davidson melvin6...@gmail.com
 wrote:

 You are right, he was probably talking about FK's. I was just so
 frustrated about people insisting that using ID as the primary key in
 every table is a good idea,
 I didn't bother to reply previously. I stand firm on my belief that the
 primary key should be something meaningful and NOT id just for the sake
 of having a unique numeric key.

 What, pray tell, is the unique natural key of person in any meaningfully
 large domain such as state? Certainly not name + birthdate.  Current
 address isn’t guaranteed. Social isn’t reliable and actually not truly
 unique.


 To add:

 1) Who determined that a number is not natural?

 2) One of the older unique natural keys (genus, species) is not so unique.
 I am a fisheries biologist by training and in my time the 'unique'
 identifier for various fishes has changed. Now that ichthyologists have
 discovered DNA testing, it can be expected there will be even more changes.
 This is even more apparent when you go back in history. As an example:

 https://en.wikipedia.org/wiki/Rainbow_trout

 Rainbow trout

 Current

 Oncorhynchus mykiss

 Past

 Salmo mykiss Walbaum, 1792
 Parasalmo mykiss (Walbaum, 1792)
 Salmo purpuratus Pallas, 1814
 Salmo penshinensis Pallas, 1814
 Parasalmo penshinensis (Pallas, 1814)
 Salmo gairdnerii Richardson, 1836  --The one I learned.
 Fario gairdneri (Richardson, 1836)
 Oncorhynchus gairdnerii (Richardson, 1836)
 Salmo gairdnerii gairdnerii Richardson, 1836
 Salmo rivularis Ayres, 1855
 Salmo iridea Gibbons, 1855
 Salmo gairdnerii irideus Gibbons, 1855
 Salmo irideus Gibbons, 1855
 Trutta iridea (Gibbons, 1855)
 Salmo truncatus Suckley, 1859
 Salmo masoni Suckley, 1860
 Oncorhynchus kamloops Jordan, 1892
 Salmo kamloops (Jordan, 1892)
 Salmo rivularis kamloops (Jordan, 1892)
 Salmo gairdneri shasta Jordan, 1894
 Salmo gilberti Jordan, 1894
 Salmo nelsoni Evermann, 1908


 All the above point to the same fish and have appeared and appear in
  articles and reports about said fish. Let's not even get into the common
  name situation. :)


 Even given that there are models which are made of entities with
  legitimate attributes which perforce define a unique instance, I see no
 benefit in avoiding the convenience of an arbitrary and simple value for
 the key.  Is it the overhead of generating and storing one more value per
 tuple that you can’t abide?






 --
 Adrian Klaver
 adrian.kla...@aklaver.com




-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread David G. Johnston
On Tue, Aug 25, 2015 at 11:40 AM, Melvin Davidson melvin6...@gmail.com
wrote:

 Consider:
 SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.registration_no = c.registration_no)
  WHERE registration_no = some_var;

  versus:
  SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.id = c.id)
  WHERE registration_no = some_var;

  Why join on id when registration_no is better?


​I believe you are mistaken if you think there are absolute rules you can
cling to here.  But even then I would lean toward calling primary keys an
internal implementation detail that should be under the full control of the
database in which they are directly used.  Artificial natural keys I would
lean toward turning into, possibly unique, attributes.  Inherent natural
keys get some consideration for using directly.

The issue arises more, say, in a many-to-many situation.  Do you define the
PK of the linking table as a two-column composite key or do you introduce
a third, serial, field to stand in for the pair?

David J.


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Joshua D. Drake

On 08/25/2015 09:09 AM, Rob Sargent wrote:

On 08/25/2015 09:40 AM, Melvin Davidson wrote:

Adrian,

Stop being so technical. When we/I speak of natural keys, we are
talking about the column
that would NATURALly lend itself as the primary key.
No one ever said a number is not natural, just that there is no need
to duplicate uniqueness
with a separate number.

IOW: If we have an account table, then the account_id or account_no
 would be the primary key. There is no need to have a separate
serial id as the primary key.

If I'm following correctly, you're saying that if the definition of the
entity contains an arbitrary unique value then use that. Fine.  I guess
I quibble with the notion of VIN as a natural attribute of car.  (I
have no firsthand experience with VINs but I would bet there's
information tucked inside them, which would make me sceptical of using
them :) )



But a VIN is, in fact, UNIQUE, so it is useful as a PK.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing I'm offended is basically telling the world you can't
control your own emotions, so everyone else should do it for you.




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread David G. Johnston
On Tue, Aug 25, 2015 at 12:09 PM, Rob Sargent robjsarg...@gmail.com wrote:

 On 08/25/2015 09:40 AM, Melvin Davidson wrote:

 Adrian,

 Stop being so technical. When we/I speak of natural keys, we are talking
 about the column
 that would NATURALly lend itself as the primary key.
 No one ever said a number is not natural, just that there is no need to
 duplicate uniqueness
 with a separate number.

 IOW: If we have an account table, then the account_id or account_no
  would be the primary key. There is no need to have a separate serial
 id as the primary key.

 If I'm following correctly, you're saying that if the definition of the
 entity contains an arbitrary unique value then use that. Fine.  I guess I
 quibble with the notion of VIN as a natural attribute of car.  (I have no
 firsthand experience with VINs but I would bet there's information tucked
 inside them, which would make me sceptical of using them :) )


​Yes, the VIN is an encoding in a similar fashion to how Object IDs
function in the computer world:
http://www.zytrax.com/books/ldap/apa/oid.html

The problem with using a VIN is a combination of usability during manual
entry - even with the checksum feature - and the fact that only physically
produced vehicles are assigned one but both manufacturers and their dealers
end up dealing with the concept of a vehicle before one is ever produced.
Neither are overly problematic but they are annoying enough that usually
additional identifiers are constructed and used by the business in order to
avoid having to see the VIN as anything other than an attribute.  The
length itself is also problematic - 17 characters typically is a bit much
when the user likely only cares about thousands or tens of thousands of
entities at any given time.

David J.


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Karsten Hilbert
 No one ever said a number is not natural, just that there is no need to 
 duplicate uniqueness
 with a separate number.

The whole point is that people are telling you that surrogate keys do not 
_duplicate_ uniqueness but
rather _generate_ it, artificially, and therefore reliably.

Today's external uniqueness is ambiguous tomorrow.

Karsten




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adrian Klaver

On 08/24/2015 08:44 PM, Rob Sargent wrote:



On Aug 24, 2015, at 6:53 PM, Melvin Davidson melvin6...@gmail.com wrote:

You are right, he was probably talking about FK's. I was just so frustrated about people insisting 
that using ID as the primary key in every table is a good idea,
I didn't bother to reply previously. I stand firm on my belief that the primary key 
should be something meaningful and NOT id just for the sake of having a 
unique numeric key.


What, pray tell, is the unique natural key of person in any meaningfully large 
domain such as state? Certainly not name + birthdate.  Current address isn’t 
guaranteed. Social isn’t reliable and actually not truly unique.


To add:

1) Who determined that a number is not natural?

2) One of the older unique natural keys (genus, species) is not so 
unique. I am a fisheries biologist by training and in my time the 
'unique' identifier for various fishes has changed. Now that 
ichthyologists have discovered DNA testing, it can be expected there 
will be even more changes. This is even more apparent when you go back 
in history. As an example:


https://en.wikipedia.org/wiki/Rainbow_trout

Rainbow trout

Current

Oncorhynchus mykiss

Past

Salmo mykiss Walbaum, 1792
Parasalmo mykiss (Walbaum, 1792)
Salmo purpuratus Pallas, 1814
Salmo penshinensis Pallas, 1814
Parasalmo penshinensis (Pallas, 1814)
Salmo gairdnerii Richardson, 1836  --The one I learned.
Fario gairdneri (Richardson, 1836)
Oncorhynchus gairdnerii (Richardson, 1836)
Salmo gairdnerii gairdnerii Richardson, 1836
Salmo rivularis Ayres, 1855
Salmo iridea Gibbons, 1855
Salmo gairdnerii irideus Gibbons, 1855
Salmo irideus Gibbons, 1855
Trutta iridea (Gibbons, 1855)
Salmo truncatus Suckley, 1859
Salmo masoni Suckley, 1860
Oncorhynchus kamloops Jordan, 1892
Salmo kamloops (Jordan, 1892)
Salmo rivularis kamloops (Jordan, 1892)
Salmo gairdneri shasta Jordan, 1894
Salmo gilberti Jordan, 1894
Salmo nelsoni Evermann, 1908


All the above point to the same fish and have appeared and appear in 
articles and reports about said fish. Let's not even get into the common 
name situation. :)




Even given that there are models which are made of entities with legitimate 
attributes which perforce define a unique instance, I see no benefit in 
avoiding the convenience of an arbitrary and simple value for the key.  Is it 
the overhead of generating and storing one more value per tuple that you can’t 
abide?







--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Why this lock?

2015-08-25 Thread Johann Spies
On 25 August 2015 at 15:52, Merlin Moncure mmonc...@gmail.com wrote:

 h...

 creating an index requires exclusive access.  did you try the
 concurrent variant?


Yes. The one which I stopped after 5 days was running concurrently.  There
was a similar lock involved.

Regards
Johann

-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [GENERAL] Why this lock?

2015-08-25 Thread Tom Lane
Johann Spies johann.sp...@gmail.com writes:
 On 25 August 2015 at 15:52, Merlin Moncure mmonc...@gmail.com wrote:
 creating an index requires exclusive access.  did you try the
 concurrent variant?

 Yes. The one which I stopped after 5 days was running concurrently.  There
 was a similar lock involved.

That lock type is used by CREATE INDEX CONCURRENTLY when it has to wait
out another transaction.  There is no way around this, it's an inherent
part of that algorithm.

http://www.postgresql.org/docs/9.4/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY
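
A rough sketch of one way to see which transaction it is waiting out, using
only the standard pg_locks view:

-- The ungranted ShareLock on a virtualxid belongs to the CREATE INDEX
-- CONCURRENTLY backend; the granted ExclusiveLock on the same virtualxid
-- identifies the transaction it is waiting for.
SELECT waiting.pid AS waiting_pid,
       holder.pid  AS blocking_pid,
       holder.mode AS blocking_mode
  FROM pg_locks waiting
  JOIN pg_locks holder
    ON holder.virtualxid = waiting.virtualxid
   AND holder.granted
 WHERE waiting.locktype = 'virtualxid'
   AND NOT waiting.granted;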

regards, tom lane




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adam Brusselback
Personally I always set the natural key with a not null and unique
constraint, but create an artificial key for it as well.  As an example, if
we had a product table, the product_sku is defined as not null with a
unique constraint on it, while product_id is the primary key which all
other tables reference as a foreign key.

In the case of a many-to-many situation, I prefer to use a two-column
composite key.  In the case of a many-to-many, I've never run into a case
where I needed to reference a single row in that table without knowing
about both sides of that relation.
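
A rough sketch of both patterns described above (the supplier side and the
column types are made up for the example):

-- Surrogate primary key plus the natural key kept NOT NULL and UNIQUE.
CREATE TABLE product (
    product_id  serial PRIMARY KEY,
    product_sku text   NOT NULL UNIQUE,
    description text
);

CREATE TABLE supplier (
    supplier_id serial PRIMARY KEY,
    name        text   NOT NULL
);

-- Many-to-many link table keyed on the two-column composite.
CREATE TABLE product_supplier (
    product_id  integer NOT NULL REFERENCES product (product_id),
    supplier_id integer NOT NULL REFERENCES supplier (supplier_id),
    PRIMARY KEY (product_id, supplier_id)
);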

Just my $0.02
-Adam

On Tue, Aug 25, 2015 at 12:15 PM, David G. Johnston 
david.g.johns...@gmail.com wrote:

 On Tue, Aug 25, 2015 at 11:40 AM, Melvin Davidson melvin6...@gmail.com
 wrote:

 Consider:
 SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.registration_no = c.registration_no)
  WHERE registration_no = some_var;

  versus:
  SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.id = c.id)
  WHERE registration_no = some_var;

  Why join on id when registration_no is better?


 ​I believe you are mistaken if you think there are absolute rules you can
 cling to here.  But even then I would lean toward calling primary keys an
 internal implementation detail that should be under the full control of the
 database in which they are directly used.  Artificial natural keys I would
 lean toward turning into, possibly unique, attributes.  Inherent natural
 keys get some consideration for using directly.

 The issue arises more, say, in a many-to-many situation.  Do you define the
 PK of the linking table as a two-column composite key or do you introduce
 a third, serial, field to stand in for the pair?

 David J.




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Melvin Davidson
Consistency in naming convention.

Good suggestion!

On Tue, Aug 25, 2015 at 12:33 PM, Marc Munro marc.mu...@gmail.com wrote:

 On Sat, 2015-08-22 at 15:15 +, Melvin Davidson wrote:

  I've been searching for a PostgreSQL Developer Best Practices with not
  much luck,
  so I've started my own. At the risk of stirring up a storm of
 controversy,
  I would appreciate additional suggestions and feedback.
 

 You might add: Create all relation names as plurals.

 Or, if your site uses predominantly singular names, make that the
 standard.  Consistency within the site is more important than any
 dogmatic belief about whether singular or plural forms are better.  If
 you don't put it in the standard, someone will eventually create tables
 with names that don't gel with everything else.

 __
 Marc






-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Rob Sargent

On 08/25/2015 09:40 AM, Melvin Davidson wrote:

Adrian,

Stop being so technical. When we/I speak of natural keys, we are 
talking about the column

that would NATURALly lend itself as the primary key.
No one ever said a number is not natural, just that there is no need 
to duplicate uniqueness

with a separate number.

IOW: If we have an account table, then the account_id or account_no
 would be the primary key. There is no need to have a separate 
serial id as the primary key.
If I'm following correctly, you're saying that if the definition of the 
entity contains an arbitrary unique value then use that. Fine.  I guess 
I quibble with the notion of VIN as a natural attribute of car.  (I 
have no firsthand experience with VINs but I would bet there's 
information tucked inside them, which would make me sceptical of using 
them :) )





[GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread David Kerr
Howdy All,

For a very long time I've held the belief that splitting PGDATA and xlog on 
linux systems fairly universally gives a decent performance benefit for many 
common workloads.
(I've seen up to 20% personally).

I was under the impression that this had to do with regular fsync()'s from the 
WAL 
interfering with and over-reaching the writing out of the filesystem buffers. 

Basically, I think I was conflating fsync() with sync(). 

So if it's not that, then that just leaves bandwidth (ignoring all of the other 
best practice reasons for reliability, etc.). So, in theory if you're not 
swamping your disk I/O then you won't really benefit from relocating your XLOGs.

However, I know from experience that's not entirely true (although it's not 
always easy to measure all aspects of your I/O bandwidth).

Am I missing something?

Thanks




Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread Andomar

However, I know from experience that's not entirely true (although it's not 
always easy to measure all aspects of your I/O bandwidth).

Am I missing something?


Two things I can think of:

Transaction writes are entirely sequential.  If you have disks assigned 
for just this purpose, then the heads will always be in the right spot, 
and the writes go through more quickly.


A database server process waits until the transaction logs are written 
and then returns control to the client. The data writes can be done in 
the background while the client goes on to do other things.  Splitting 
up data and logs means that there is less chance the disk controller will 
cause data writes to interfere with log files.


Kind regards,
Andomar




Re: [GENERAL] Why this lock?

2015-08-25 Thread Merlin Moncure
On Tue, Aug 25, 2015 at 8:33 AM, Johann Spies johann.sp...@gmail.com wrote:
 I have a long-running query, running now for more than 6 days already
 (process 17434).  It involves three tables, of which one contains XML-data.

 On another, unrelated table with 30718567 records, I ran a query to create
 an index on a field.  This morning I cancelled this process because it did
 not finish after 5 days.

 I then did a vacuum analyze on that table and reran the query (process
 9732) to create the index.  It soon stalled again and the following result
 shows that process 17434 is blocking it:

 locktype: virtualxid    virtualxid: 6/24891
   waiting side: virtualtransaction 7/27906,  pid 9732,   mode ShareLock,     granted f, fastpath f
   holding side: virtualtransaction 6/24891,  pid 17434,  mode ExclusiveLock, granted t


 Now my questions:

 What would cause such a lock?
 What can I do to remove the lock without stopping the long-running process
 which started before the query to create the index?  I suppose I just have
 to wait for the first process to finish...

creating an index requires exclusive access.  did you try the
concurrent variant?
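
For reference, the two forms side by side (table and column names are
hypothetical):

-- Plain CREATE INDEX blocks writes (though not reads) on the table for the
-- duration of the build.
CREATE INDEX big_table_field_idx ON big_table (field);

-- The concurrent variant lets writes continue, at the cost of a slower,
-- multi-phase build that has to wait out already-running transactions.
CREATE INDEX CONCURRENTLY big_table_field_idx ON big_table (field);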

merlin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread rob stone
On Tue, 2015-08-25 at 20:17 -0400, Melvin Davidson wrote:
 I think a lot of people here are missing the point. I was trying to 
 give examples of natural keys, but a lot of people are taking great 
 delight 
 in pointing out exceptions to examples, rather than understanding the 
 point.
 So for the sake of argument, a natural key is something that in 
 itself is unique and the possibility of a duplicate does not exist.
 Before ANYONE continues to insist that a serial id column is good, 
 consider the case where the number of tuples will exceed a bigint.
 Don't say it cannot happen, because it can.
 However, if you have an alphanumeric field, let's say varchar 50, and 
 it's guaranteed that it will never have a duplicate, then THAT is a 
 natural primary key and beats the hell out of a generic id field.
 
 Further to the point, since I started this thread, I am holding to it 
 and will not discuss natural primary keys any further.
 
 Other suggestions for good PostgreSQL Developer database (not web 
 app) guidelines are still welcome.
 

Funny how Melvin's attempt to bring order to the chaos ended up as a
discussion about primary keys.

We once hired a genius to design an application to handle fixed
assets. Every table had a primary key named id. Some were integer and
some were character. So the foreign key columns in child tables had to
be named differently. Writing the joins was complex.

I also know of an airline reservation system where you are unable to
alter your e-mail address. It apparently needs a DBA type person to
make the change. I can only guess that your e-mail address is used as a
foreign key in one or more tables. As well as assigning you a frequent
flyer number they also assign another integer identifier. A bit of
common sense goes a long way when designing an application.

Cheers,
rob





Re: [GENERAL] Grouping sets, cube and rollup

2015-08-25 Thread Guillaume Lelarge
On 26 August 2015 at 2:06 AM, Edson Richter rich...@simkorp.com.br wrote:

 Any chance to get those amazing wonderful features backported to 9.4?


No. Only bug fixes are backported.
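
For anyone finding this later, a minimal sketch of the 9.5 syntax against a
hypothetical sales table:

-- One pass over the data producing per-region totals, per-product totals,
-- and a grand total.
SELECT region, product, sum(amount)
  FROM sales
 GROUP BY GROUPING SETS ((region), (product), ());

-- ROLLUP (and CUBE) are shorthands for common combinations of grouping sets.
SELECT region, product, sum(amount)
  FROM sales
 GROUP BY ROLLUP (region, product);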

-- 
Guillaume.


Re: [GENERAL] Problem with pl/python procedure connecting to the internet

2015-08-25 Thread Igor Sosa Mayor
Igor Sosa Mayor joseleopoldo1...@gmail.com writes:

 Igor Sosa Mayor joseleopoldo1...@gmail.com writes:

 My question is therefore:
  1. is there a way to permit the pl/python procedure to connect to the
 internet all the time and with a better configuration?
 2. or should I forget the procedure and write a python script outside
the database?

I'm back again with an update. The problem seems to be that for some
reason PG does not realize that there is a network connection.

More precisely:
1. I start the computer
2. PG starts and I can use it normally, EXCEPT for the plpython
   procedure which does not connect
3. then I restart PG (with systemd) and it works. PG seems to see that
   there is a connection.

Maybe it has something to do with systemd? I'm trying to get all debug
information, but I can't see anything strange in the log. In order not
to flood the list with an attachment, I post the log here:
http://pastie.org/10373991

Any ideas?

Thanks in advance!





Re: [GENERAL] Problem with pl/python procedure connecting to the internet

2015-08-25 Thread Igor Sosa Mayor
Adrian Klaver adrian.kla...@aklaver.com writes:

 But I see that the log options in PG are really rich. Could maybe
 someone tell me which could be the best options to find the problem? I
 will be offline now for 24 hours, but I will try to make some experiments
 in the meantime.

 From here:
 http://www.postgresql.org/docs/9.4/interactive/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

thanks. I answered you indirectly in the other email.





Re: [GENERAL] Problem with pl/python procedure connecting to the internet

2015-08-25 Thread Igor Sosa Mayor
Igor Sosa Mayor joseleopoldo1...@gmail.com writes:

 Maybe it has something to do with systemd? I'm trying to get all debug

A little more information: the postgresql unit in my systemd setup looks
like this[1]. That means it is, of course, started after the network
target (but maybe there is no connection available yet?)

[1]
[Unit]
Description=PostgreSQL database server
After=network.target

[Service]
Type=forking
TimeoutSec=120
User=postgres
Group=postgres

Environment=PGROOT=/var/lib/postgres

SyslogIdentifier=postgres
PIDFile=/var/lib/postgres/data/postmaster.pid

ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGROOT}/data
ExecStart= /usr/bin/pg_ctl -s -D ${PGROOT}/data start -w -t 120
ExecReload=/usr/bin/pg_ctl -s -D ${PGROOT}/data reload
ExecStop=  /usr/bin/pg_ctl -s -D ${PGROOT}/data stop -m fast

# Due to PostgreSQL's use of shared memory, OOM killer is often overzealous in
# killing Postgres, so adjust it downward
OOMScoreAdjust=-200

# Additional security-related features
PrivateTmp=true
ProtectSystem=full
NoNewPrivileges=true

[Install]
WantedBy=multi-user.target
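
If that is the cause, one commonly suggested tweak (an untested assumption
here) is to order the unit after network-online.target via a drop-in, since
network.target alone does not guarantee that a connection is actually up:

# /etc/systemd/system/postgresql.service.d/network-online.conf (hypothetical path)
[Unit]
Wants=network-online.target
After=network-online.target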



Re: [GENERAL] Problem with pl/python procedure connecting to the internet

2015-08-25 Thread Igor Sosa Mayor
Dave Potts dave.po...@pinan.co.uk writes:

 In cases like this I normally restart the postgresql server under strace/truss
 etc. and then wade through the output; it will normally tell me which
 process was invoked.

Thanks for the hint. I answered you indirectly in other email.





Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Karsten Hilbert
On Tue, Aug 25, 2015 at 02:02:17PM +1200, Gavin Flower wrote:

 On 25/08/15 01:15, Ray Cote wrote:
  On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert karsten.hilb...@gmx.net wrote:
 
 [...]
 
 9. Do NOT arbitrarily assign an id column to a table as a
 primary key when other columns
 are perfectly suited as a unique primary key.
 
 ...
 
   Good example:
 CREATE TABLE accounts
  ( account_id bigint NOT NULL ,
 
 
 I would not consider the general use of natural primary keys to be best
 practice.

Gavin, Ray,

I certainly didn't write any of the above.

Karsten
-- 
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346




Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread David Kerr

 On Aug 25, 2015, at 10:45 AM, Bill Moran wmo...@potentialtech.com wrote:
 
 On Tue, 25 Aug 2015 10:08:48 -0700
 David Kerr d...@mr-paradox.net wrote:
 
 Howdy All,
 
 For a very long time I've held the belief that splitting PGDATA and xlog on 
 linux systems fairly universally gives a decent performance benefit for many 
 common workloads.
  (I've seen up to 20% personally).
  
  I was under the impression that this had to do with regular fsync()'s from 
  the WAL 
  interfering with and over-reaching the writing out of the filesystem buffers. 
  
  Basically, I think I was conflating fsync() with sync(). 
  
  So if it's not that, then that just leaves bandwidth (ignoring all of the 
  other best practice reasons for reliability, etc.). So, in theory if you're 
  not swamping your disk I/O then you won't really benefit from relocating 
  your XLOGs.
 
 Disk performance can be a bit more complicated than just swamping. Even if

Funny, on revision of my question, I left out basically that exact line for 
simplicity's sake. =)

 you're not maxing out the IO bandwidth, you could be getting enough that some
 writes are waiting on other writes before they can be processed. Consider the
 fact that old-style ethernet was only able to hit ~80% of its theoretical
 capacity in the real world, because the chance of collisions increased with
 the amount of data, and each collision slowed down the overall transfer speed.
 Contrasted with modern ethernet that doesn't do collisions, you can get much
 closer to 100% of the rated bandwidth because the communications are 
 effectively
 partitioned from each other.
 
 In the worst-case scenario, if two processes (due to horrible luck) _always_
 try to write at the same time, the overall responsiveness will be lousy, even
 if the bandwidth usage is only a small percentage of what's available. Of course,
 that worst case doesn't happen in actual practice, but as the usage goes up,
 the chance of hitting that interference increases, and the effective response
 goes down, even when there's bandwidth still available.
 
 Separate the competing processes, and the chance of conflict is 0. So your
 responsiveness is pretty much at best-case all the time.

Understood. Now in my previous delve into this issue, I showed minimal/no disk 
queuing, and the SAN showed nothing on its queues and no retries (of course, 
#NeverTrustTheSANGuy), but I still saw a 20% performance increase by 
splitting the WAL and $PGDATA.

But that's beside the point, and my data on that environment is long gone.

I'm content to leave this at "I/O is complicated"; I just wanted to make sure 
that I hadn't been correct for a slightly wrong reason.

Thanks!



Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Karsten Hilbert
 In most cases developers don’t care about index, unique, foreign key, or 
 primary key names (from a coding standpoint)

Until the day they'd like to write a reliable database change script.

(PG's internal conventions for object names _have_ changed over the years)
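
E.g., a change script that names what it creates, so the matching downgrade is
deterministic across installations (hypothetical account table):

ALTER TABLE account
    ADD CONSTRAINT account_email_key UNIQUE (email);

-- ...and the downgrade script does not have to guess a generated name:
ALTER TABLE account
    DROP CONSTRAINT account_email_key;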

Karsten




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 25/08/15 19:04, Karsten Hilbert wrote:

On Tue, Aug 25, 2015 at 02:02:17PM +1200, Gavin Flower wrote:


On 25/08/15 01:15, Ray Cote wrote:

On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert karsten.hilb...@gmx.net wrote:


[...]

9. Do NOT arbitrarily assign an id column to a table as a
primary key when other columns
are perfectly suited as a unique primary key.

...

  Good example:
CREATE TABLE accounts
( account_id bigint NOT NULL ,


I would not consider the general use of natural primary keys to be best
practice.

Gavin, Ray,

I certainly didn't write any of the above.

Karsten

Hi Karsten,

It took me a couple of minutes, but I traced the 9. ... item back to 
melvin6...@gmail.com, who opened the thread.


Looks like Ray misquoted back in the entry that can be identified by
(using the 'source' option on my mail client)

   From: Ray Cote rgac...@appropriatesolutions.com
   Date: Mon, 24 Aug 2015 09:15:27 -0400
   Message-ID:
   CAG5tnzqTausEhFtRpfWCunx4YNFuGTFyUZyTkn5f2E7RaYKE=g...@mail.gmail.com

which was

   On Sat, Aug 22, 2015 at 11:46 AM, Karsten Hilbert karsten.hilb...@gmx.net
   wrote:

 1. Prefix ALL literals with an Escape
EG:  SELECT E'This is a \'quoted literal \'';
 SELECT E'This is an unquoted literal';

Doing so will prevent the annoying WARNING:  nonstandard use of
escape in a string literal
   

   I'd be concerned that what is missing here is the bigger issue of  Best
   Practice #0: Use Bound Variables.
   The only way I've seen invalid literals show up in SQL queries is through
   the dynamic generation of SQL Statements vs. using bound variables.
   Not using bound variables is your doorway to SQL injection exploits.


   9. Do NOT arbitrarily assign an id column to a table as a primary key
when other columns
are perfectly suited as a unique primary key.

   ...

Good example:
CREATE TABLE accounts
( account_id bigint NOT NULL ,


   I would not consider the general use of natural primary keys to be best
   practice.
   Let's assume your account_id field is used as a foreign key in a dozen
   other tables.
   1) What happens if someone mis-types the account-id?
 To correct that, you also need to correct the FK field in the other
   dozen tables.
   2) What happens when your company starts a new project (or buys a
   competitor) and all the new account numbers are alpha-numeric?
   3) Your example shows the id as a bigint, but your rule is not limited to
   integers.
   What if your table is country populations and the primary key is country
   name?
   Now, you have quite large foreign keys (and a country changing its name is
   not unheard of).
   (and let's not even get started on case-sensitivity or character encodings).
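
To illustrate the bound-variables point quoted above at the SQL level (client
drivers expose the same idea through placeholders), a sketch against a
hypothetical accounts table:

-- The value arrives as a parameter, so input like
--   '; DROP TABLE accounts; --
-- is treated as data, never as executable SQL.
PREPARE find_account(text) AS
    SELECT account_no, owner
      FROM accounts
     WHERE account_no = $1;

EXECUTE find_account('ABC-1234');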


Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adrian Klaver

On 08/25/2015 02:44 PM, Gavin Flower wrote:

On 26/08/15 02:17, Adrian Klaver wrote:
[...]


2) One of the older unique natural keys (genus, species) is not so
unique. I am a fisheries biologist by training and in my time the
'unique' identifier for various fishes has changed. Now that
ichthyologists have discovered DNA testing, it can be expected there
will be even more changes. This is even more apparent when you go back
in history. As an example:

https://en.wikipedia.org/wiki/Rainbow_trout

Rainbow trout

Current

Oncorhynchus mykiss

Past

Salmo mykiss Walbaum, 1792
Parasalmo mykiss (Walbaum, 1792)

[...]


Salmo gilberti Jordan, 1894
Salmo nelsoni Evermann, 1908


So you probably need a date stamp so you could record things relating to
the correct name for a given period in a mapping table, and still relate
to the same surrogate key for referencing other tables.

Maybe even worse, is when a species is suddenly found to be 2 or more
distinct species!


Funny you should say that. Furry critters instead of slimy:

http://news.nationalgeographic.com/2015/07/150730-jackals-wolves-evolution-new-species-animals-africa/



Something similar could happen with account numbers: 2 companies with
similar names might be assigned to the same account number, and lots of
transactions recorded before the mistake is discovered. Obviously
a surrogate key would not give you complete protection from a lot of
work sorting the mess out, but it would probably help!


Or if you have a mortgage with Wells Fargo and find your account number 
is being used in their agent training program, which explains why you 
have been receiving all sorts of correspondence saying your account is 
in arrears and facing foreclosure (personal experience).


Bottom line is databases are great and theory is useful, but it all goes 
out the window when people start meddling.




I read on post a year or 2 back, a guy in Europe had at least 4
different variations on his name depending on the country he was in and
the local language and cultural norms.


I am familiar with that issue.



When I worked at a freezing works in the 1970s in Auckland, I heard
that the payroll allowed for over 52 different names per employee (per
year?).  Though I was never told the maximum number of name changes ever used.
Essentially management might fire someone, but the union would complain,
and they would be rehired under a different name - so I was told!  So
the correct holiday pay & PAYE tax deductions would still relate to the
same individual no matter how many name changes they had.


Or a system I took over where someone had made a natural primary key of 
first name, last name and that was all. So you had John Smith, John 
Smith2, etc. Poor design obviously, but that stuff is out there.





Cheers,
Gavin




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower
NB the attribution colours seem to be mixed up a bit here, but this is all 
dialogue between me & Adrian.



On 26/08/15 09:48, Adrian Klaver wrote:

On 08/25/2015 02:23 PM, Gavin Flower wrote:

On 26/08/15 08:56, Adrian Klaver wrote:

[...]

have all gone to the same seminar on how to be Walmart and decided
they did not want unique numbers, but UPCs tied to price groups that
covered a variety of plants. Luckily, I was too stupid to

Natural Stupidity??? :-)


Oh yeah and a long history too, but that needs at least a pitcher of 
beer to recount.

Well if you're ever in Auckland, I'll shout you a beer!
(We might even put you up for a night or two.)




(Sorry, couldn't resist!)


know surrogate keys were bad and had a sequence attached to the tag
table. This then became the tag id and made life a lot easier during
the transition. It still remains there, because people are people and
'natural' tends to be artificial and transient.

Extremely good examples, I'll bear them in mind - makes me even more
keen on surrogate primary keys.  I'm always very wary when people tell
me some numbering scheme will NEVER change!!!


To add a recent one. My partner Nancy signed up for Medicare last year 
to avoid the premium penalty. This year in July she signed up for 
Social Security. Turns out, for reasons I do not understand, 
CMS (https://www.cms.gov/) changes the Medicare account number at that 
point. The reason we even cared is that the billing system thinks she 
has two accounts and is double billing. Time on the phone with someone 
at CMS was not enlightening. We were told to trust the system and that 
eventually it would work itself out. Still waiting :(


STOP IT!!!  You're making me even more cynical and paranoid!  :-)





Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 04:33, Marc Munro wrote:

On Sat, 2015-08-22 at 15:15 +, Melvin Davidson wrote:


I've been searching for a PostgreSQL Developer Best Practices with not
much luck,
so I've started my own. At the risk of stirring up a storm of controversy,
I would appreciate additional suggestions and feedback.


You might add: Create all relation names as plurals.

Or, if your site uses predominantly singular names, make that the
standard.  Consistency within the site is more important than any
dogmatic belief about whether singular or plural forms are better.  If
you don't put it in the standard, someone will eventually create tables
with names that don't gel with everything else.

__
Marc





Actually I would suggest standardising on singular names, not JUST 
because this is the standard I prefer!  :-)


But (also) because:

1. Singular words tend to be shorter

2. plurals are more ambiguous wrt spelling

3. there are other good reasons, that I've forgotten for now :-(
   (but I remember having them!!!)


Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread John R Pierce

On 8/25/2015 1:42 PM, Gavin Flower wrote:
Account numbers are externally generated, and may potentially change.  
Management might suddenly decide that they want to start using the 
year the account started as the first 4 digits, or that the branch 
code should be reflected in it, or something else.  The database 
should be protected from these arbitrary changes.  Hence the 
account_no is not a good candidate for a primary key.



such practices would raise total havoc on a traditional paper ledger 
accounting system as well as things like pending AR/AP where external 
companies will be referencing your account numbers.





--
john r pierce, recycling bits in santa cruz





Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread Gavin Flower

On 26/08/15 05:54, David Kerr wrote:

On Tue, Aug 25, 2015 at 10:16:37AM PDT, Andomar wrote:

However, I know from experience that's not entirely true (although it's not 
always easy to measure all aspects of your I/O bandwidth).

Am I missing something?


Two things I can think of:

Transaction writes are entirely sequential.  If you have disks
assigned for just this purpose, then the heads will always be in the
right spot, and the writes go through more quickly.

A database server process waits until the transaction logs are
written and then returns control to the client. The data writes can
be done in the background while the client goes on to do other
things.  Splitting up data and logs means that there is less chance
the disk controller will cause data writes to interfere with log
files.

Kind regards,
Andomar


hmm, yeah those are both what I'd lump into I/O bandwidth.
If your disk subsystem is fast enough, or you're on a RAIDed SAN
or EBS, you'd either overcome that, or not necessarily be able to.



Back when I actually understood the various timings of disc accessing on 
a MainFrame system, back in the 1980s (disc layout & accessing is way 
more complicated now!), I found that there was a considerable difference 
between mainly sequential & mostly random access - easily greater than a 
factor of 5 (from memory) in terms of throughput.


Consider the time to move heads between tracks and the rotational latency 
(caused by not reading sequential blocks on the same track).  There are 
other complications, which I have glossed over!



Cheers,
Gavin




Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adrian Klaver

On 08/25/2015 08:40 AM, Melvin Davidson wrote:

Adrian,

Stop being so technical. When we/I speak of natural keys, we are talking
about the column
that would NATURALly lend itself as the primary key.


Pretty sure this is a technical list:)


No one ever said a number is not natural, just that there is no need to
duplicate uniqueness
with a separate number.


I would agree, but I have interacted with people, especially PHBs, 
where a duplicate 'hidden' key is a life saver. See more below.




IOW: If we have an account table, then the account_id or account_no
  would be the primary key. There is no need to have a separate
serial id as the primary key.

  Likewise, if we have a car table, then registration (or
vehicle_id) is preferred.

  EG: Good
  CREATE TABLE car
  (
   registration_no varchar(30) not null,
   car_make   varchar(25) not null,
   model  varchar(15) not null,
   build_year date not null,
   owner  varchar(50),
   CONSTRAINT car_pk PRIMARY KEY (registration_no)
   );

  bad
  CREATE TABLE car
  (
   id  serial not null,
   registration_no varchar(30) not null,
   car_make   varchar(25) not null,
   model  varchar(15) not null,
   build_year date not null,
   owner  varchar(50),
   CONSTRAINT car_pk PRIMARY KEY (id)
   );

The benefit in avoiding arbitrary and simple values for the key is that
it makes
the database design much more logical.

Consider:
SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.registration_no = c.registration_no)
  WHERE registration_no = some_var;


Pretty sure parts are not unique to an exact vehicle, unless you are 
talking a totally handmade one. They are not even unique to make and 
model. As an example, I used to work on Class B Isuzu trucks. These 
models (FTR) were also built for Chevrolet as the Forward models. So 
right off the bat there were two part numbers for each part, one that 
started with 9 if you got it from Chevrolet and one with 11 from Isuzu, 
if memory serves. Then Isuzu decided to reorganize their part numbers, 
so that introduced another number, all pointing to the exact same part. 
Then there were those parts available from the parts houses (NAPA, etc.).


Then there was the greenhouse I worked for where we supplied UPC coded 
tags for our customers. In the beginning, it was simple, the item 
portion of the UPC was unique and with the company prefix served as a 
'natural' key for the tags. Then the chain stores we worked with must 
have all gone to the same seminar on how to be Walmart and decided they 
did not want unique numbers, but UPCs tied to price groups that covered 
a variety of plants. Luckily, I was too stupid to know surrogate keys 
were bad and had a sequence attached to the tag table. This then became 
the tag id and made life a lot easier during the transition. It still 
remains there, because people are people and 'natural' tends to be 
artificial and transient.




  versus:
  SELECT c.registration_no,
c.car_make,
p.part_no
FROM car c
JOIN parts p ON ( p.id = c.id)
  WHERE registration_no = some_var;

  Why join on id when registration_no is better?


On Tue, Aug 25, 2015 at 10:17 AM, Adrian Klaver
adrian.kla...@aklaver.com wrote:

On 08/24/2015 08:44 PM, Rob Sargent wrote:


On Aug 24, 2015, at 6:53 PM, Melvin Davidson
melvin6...@gmail.com wrote:

You are right, he was probably talking about FK's. I was
just so frustrated about people insisting that using ID as
the primary key in every table is a good idea,
I didn't bother to reply previously. I stand firm on my
belief that the primary key should be something meaningful
and NOT id just for the sake of having a unique numeric key.

What, pray tell, is the unique natural key of person in any
meaningfully large domain such as state? Certainly not name +
birthdate.  Current address isn’t guaranteed. Social isn’t
reliable and actually not truly unique.


To add:

1) Who determined that a number is not natural?

2) One of the older unique natural keys (genus, species) is not so
unique. I am a fisheries biologist by training and in my time the
'unique' identifier for various fishes has changed. Now that
ichthyologists have discovered DNA testing, it can be expected there
will be even more changes. This is even more apparent when you go
back in history. As an example:

https://en.wikipedia.org/wiki/Rainbow_trout

Rainbow trout

Current

Oncorhynchus mykiss

Past

Salmo mykiss Walbaum, 1792
Parasalmo mykiss (Walbaum, 1792)
Salmo purpuratus Pallas, 1814
Salmo penshinensis 

Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread Joseph Kregloh
On Tue, Aug 25, 2015 at 4:31 PM, Gavin Flower gavinflo...@archidevsys.co.nz
 wrote:

 On 26/08/15 05:54, David Kerr wrote:

 On Tue, Aug 25, 2015 at 10:16:37AM PDT, Andomar wrote:

 However, I know from experience that's not entirely true (although it's
 not always easy to measure all aspects of your I/O bandwidth).

 Am I missing something?

 Two things I can think of:

 Transaction writes are entirely sequential.  If you have disks
 assigned for just this purpose, then the heads will always be in the
 right spot, and the writes go through more quickly.

 A database server process waits until the transaction logs are
 written and then returns control to the client. The data writes can
 be done in the background while the client goes on to do other
 things.  Splitting up data and logs means that there is less chance
 the disk controller will cause data writes to interfere with log
 files.

 Kind regards,
 Andomar

 hmm, yeah those are both what I'd lump into I/O bandwidth.
 If your disk subsystem is fast enough, or you're on a RAIDed SAN
 or EBS, you'd either overcome that, or not necessarily be able to.



 Back when I actually understood the various timings of disc accessing on
 a MainFrame system, back in the 1980s (disc layout & accessing is way
 more complicated now!), I found that there was a considerable difference
 between mainly sequential & mostly random access - easily greater than a
 factor of 5 (from memory) in terms of throughput.

 Consider the time to move heads between tracks and the rotational latency
 (caused by not reading sequential blocks on the same track).  There are
 other complications, which I have glossed over!


It can go even further now with the use of SSDs. You can put the xlogs on
an SSD and the rest of the database on a mechanical drive. The same can be
said for partitions: you can place the most-accessed partition on an SSD and
the rest of the db on a mechanical drive.
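
For anyone who wants to try this, a rough sketch (mount points, tablespace
and table names below are made up, and the details vary with version and
packaging): pg_xlog usually goes on the SSD either via initdb --xlogdir on a
new cluster or by moving the directory and leaving a symlink behind while the
server is stopped, and a hot partition can be moved with a tablespace:

CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pg_tblspc';
-- the directory must already exist and be owned by the postgres OS user

ALTER TABLE measurements_2015_08 SET TABLESPACE fast_ssd;
-- this rewrites the table, so expect it to hold a lock while it runs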

-Joseph Kregloh




 Cheers,
 Gavin


 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 03:40, Melvin Davidson wrote:
[...]

IOW: If we have an account table, then the account_id or account_no
 would be the primary key. There is no need to have a separate 
serial id as the primary key.

[...]

Account numbers are externally generated, and may potentially change.  
Management might suddenly decide that they want to start using the year 
the account started as the first 4 digits, or that the branch code 
should be reflected in it, or something else.  The database should be 
protected from these arbitrary changes.  Hence the account_no is not a 
good candidate for a primary key.
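
A minimal sketch of that arrangement (table and column names are purely
illustrative): keep a surrogate key for other tables to reference, and put a
UNIQUE constraint on the externally issued account number, so it is still
enforced and searchable but a renumbering never ripples through the foreign
keys:

CREATE TABLE account (
    account_id bigserial PRIMARY KEY,  -- surrogate key, never changes
    account_no text NOT NULL UNIQUE,   -- externally issued, may be renumbered
    name       text NOT NULL
);

CREATE TABLE invoice (
    invoice_id bigserial PRIMARY KEY,
    account_id bigint NOT NULL REFERENCES account (account_id),
    amount     numeric(12,2) NOT NULL
);

-- management renumbers the accounts: one UPDATE, no foreign key churn
UPDATE account SET account_no = '2015-' || account_no;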



Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Grouping sets, cube and rollup

2015-08-25 Thread Edson Richter
Any chance of getting those amazing, wonderful features backported to 9.4?

Thanks, 

Edson 

Enviado do meu smartphone Sony Xperia™

Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Melvin Davidson
I think a lot of people here are missing the point. I was trying to give
examples of natural keys, but a lot of people are taking great delight
in pointing out exceptions to examples, rather than understanding the point.
So for the sake of argument, a natural key is something that in itself is
unique and the possibility of a duplicate does not exist.
Before ANYONE continues to insist that a serial id column is good, consider
the case where the number of tuples will exceed a bigint.
Don't say it cannot happen, because it can.
However, if you have an alphanumeric field, let's say varchar 50, and it's
guaranteed that it will never have a duplicate, then THAT is a natural
primary
key and beats the hell out of a generic id field.

Further to the point, since I started this thread, I am holding to it and
will not discuss natural primary keys any further.

Other suggestions for good PostgreSQL Developer database (not web app)
guidelines are still welcome.

On Tue, Aug 25, 2015 at 7:34 PM, Adrian Klaver adrian.kla...@aklaver.com
wrote:

 On 08/25/2015 04:23 PM, Jerry Sievers wrote:

 Adrian Klaver adrian.kla...@aklaver.com writes:

 On 08/25/2015 01:56 PM, John R Pierce wrote:

 On 8/25/2015 1:42 PM, Gavin Flower wrote:

 Account numbers are externally generated, and may potentially change.
 Management might suddenly decide that they want to start using the
 year the account started as the first 4 digits, or that the branch
 code should be reflected in it, or something else.  The database
 should be protected from these arbitrary changes.  Hence the
 account_no is not a good candidate for a primary key.



 such practices would raise total havoc on a traditional paper ledger
 accounting system as well as things like pending AR/AP where external
 companies will be referencing your account numbers.


 Agreed, but it happens. When Lowes took over a local hardware
 chain(Eagles) here in Washington state they moved very quickly on
 changing the account numbers. The company I worked for who supplied
 Eagles and then Lowes sat on a check for $22,000 that was sent to us
 in error because the account numbers got switched. We called them when
 we got the check, but it still took them six months to own up to it.


 DOH!

 Next time a screwball outfit sends you a check for $22k erroneously just
 go deposit it :-)


 Well that is what I wanted to do, but the owner overruled me :(. Something
 about Lowes having more lawyers than we did. The strange part was we called
 them and told them what had happened and supplied the relevant information
 that explained the mix-up. You would have thought us calling to return a
 check that was supposed to be to us would have raised a flag!


 --
 Adrian Klaver
 adrian.kla...@aklaver.com




 --
 Adrian Klaver
 adrian.kla...@aklaver.com



 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general




-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 11:34, Adrian Klaver wrote:
[...]


Agreed, but it happens. When Lowes took over a local hardware
chain(Eagles) here in Washington state they moved very quickly on
changing the account numbers. The company I worked for who supplied
Eagles and then Lowes sat on a check for $22,000 that was sent to us
in error because the account numbers got switched. We called them when
we got the check, but it still took them six months to own up to it.


DOH!

Next time a screwball outfit sends you a check for $22k erroneously just
go deposit it :-)


Well that is what I wanted to do, but the owner overruled me :(. Something 
about Lowes having more lawyers than we did. The strange part was we 
called them and told them what had happened and supplied the relevant 
information that explained the mix-up. You would have thought us 
calling to return a check that was supposed to be to us would have 
raised a flag!

[...]

Many years ago a department store credited our account with a refund for 
about $150, which obviously was not meant for us - we had never bought 
that item! We wrote to them pointing out the mistake.


They replied, essentially saying we were due the refund.

We sent a letter yet again, explaining the problem, and saying we were 
not entitled.


They then sent us a check for the amount, which we cashed, feeling we 
had done our best to help them and that we could not be bothered wasting 
more time trying to sort things out for them!


I think any judge would laugh them out of court!




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adrian Klaver

On 08/25/2015 05:17 PM, Melvin Davidson wrote:

I think a lot of people here are missing the point. I was trying to give
examples of natural keys, but a lot of people are taking great delight
in pointing out exceptions to examples, rather than understanding the point.
So for the sake of argument, a natural key is something that in itself
is unique and the possibility of a duplicate does not exist.
Before ANYONE continues to insist that a serial id column is good,
consider the case where the number of tuples will exceed a bigint.
Don't say it cannot happen, because it can.
However, if you have an alphanumeric field, let's say varchar 50, and
it's guaranteed that it will never have a duplicate, then THAT is a
natural primary


That is a big IF and a guarantee I would not put money on.


key and beats the hell out of a generic id field.

Further to the point, since I started this thread, I am holding to it
and will not discuss natural primary keys any further.





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adrian Klaver

On 08/25/2015 05:21 PM, Gavin Flower wrote:

On 26/08/15 11:34, Adrian Klaver wrote:
[...]


Agreed, but it happens. When Lowes took over a local hardware
chain(Eagles) here in Washington state they moved very quickly on
changing the account numbers. The company I worked for who supplied
Eagles and then Lowes sat on a check for $22,000 that was sent to us
in error because the account numbers got switched. We called them when
we got the check, but it still took them six months to own up to it.


DOH!

Next time a screwball outfit sends you a check for $22k erroneously just
go deposit it :-)


Well that is what I wanted to do, but the owner overruled me :(. Something
about Lowes having more lawyers than we did. The strange part was we
called them and told them what had happened and supplied the relevant
information that explained the mix-up. You would have thought us
calling to return a check that was supposed to be to us would have
raised a flag!

[...]

Many years ago a department store credited our account with a refund for
about $150, which obviously was not meant for us - we had never bought
that item! We wrote to them pointing out the mistake.

They replied, essentially saying we were due the refund.

We sent a letter yet again, explaining the problem, and saying we were
not entitled.

They then sent us a check for the amount, which we cashed, feeling we
had done our best to help them and that we could not be bothered wasting
more time trying to sort things out for them!


I have seen this enough to think either companies would rather take the 
loss than admit the mistake, or employees don't care because it is not 
their money.




I think any judge would laugh them out of court!






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread David G. Johnston
On Tuesday, August 25, 2015, Melvin Davidson melvin6...@gmail.com wrote:

 Before ANYONE continues to insist that a serial id column is good,
 consider the case where the number of tuples will exceed a bigint.
 Don't say it cannot happen, because it can.
 However, if you have an alphanumeric field, let's say varchar 50, and it's
 guaranteed that it will never have a duplicate, then THAT is a natural
 primary
 key and beats the hell out of a generic id field.


Except for it being fatter.  400 bits wide instead of 64.  But that, too,
is simply another consideration to evaluate.

David J.


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Neil Tiffin

 On Aug 22, 2015, at 10:15 AM, Melvin Davidson melvin6...@gmail.com wrote:
 6. Although it is legal to use the form column TYPE PRIMARY KEY, it is best
 to specify it as a CONSTRAINT; that way YOU get to choose the name, otherwise
 postgres assigns a default name which may not be to your liking.
 EG: , CONSTRAINT accounts_pk PRIMARY KEY (account_id)

 10.  Standardize Index names with the form table_name + col(s) + “idx”
 EG: For accounts table:
 accounts_name_idx
 accounts_city_state_idx
 


I really like the standardization that PostgreSQL uses in auto-generating 
default names.  The rule I use is to always use the auto-generated names unless 
the object is referenced routinely in code. In most cases developers don’t care 
about index, unique, foreign key, or primary key names (from a coding 
standpoint), so why should they be creating the names? Since the PostgreSQL 
standard uses auto-generated names with ‘_pkey’ for PRIMARY KEY, ‘_fkey’ for 
FOREIGN KEY, and ‘_key’ for UNIQUE, why not use the same rules for consistency? 
So I disagree with 6 and would extend 10 to include these other names if they 
are manually generated.

Interestingly enough, when I searched the 9.5 docs I could not find a description 
of these PostgreSQL naming conventions.  Probably because the developers 
consider it an internal detail that could change, which is fine, since the names 
usually don’t matter, until they do.

I would say use “column TYPE PRIMARY KEY”, “column TYPE UNIQUE”, and ‘column 
TYPE REFERENCES …’ every place you can and only create manual names when 
absolutely necessary.  When you do create manual names follow the standard 
PostgreSQL convention.
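
As a concrete illustration (the generated names are an internal detail, but
these are what current releases typically produce):

CREATE TABLE owners (
    owner_id serial PRIMARY KEY          -- owners_pkey
);

CREATE TABLE accounts (
    account_id bigserial PRIMARY KEY,    -- accounts_pkey
    account_no text UNIQUE,              -- accounts_account_no_key
    owner_id   integer REFERENCES owners (owner_id),
                                         -- accounts_owner_id_fkey
    city       text,
    state      text
);

CREATE INDEX ON accounts (city, state);  -- accounts_city_state_idx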

Now I have worked on mostly smaller installations so maybe someone should chime 
in if this is a bad best practice.

Neil




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Christine Desmuke

On 8/25/2015 1:38 PM, Joshua D. Drake wrote:

But a VIN is, in fact, UNIQUE, so it is useful as a PK.

JD

But a VIN is *not* guaranteed to exist, nor is it guaranteed never to 
change, and I regard those as pretty important characteristics in a PK.


VINs were not required in the U.S. until 1954, and were not in a 
standardized format until 1981; other countries have different dates. If 
you are dealing with [or ever might deal with] pre-war classics, early 
imports, kit cars, or other out-of-the-mainstream vehicles, you have to 
deal with the possibility of a vehicle that doesn't have a traditional 
VIN, certainly not one in the 'expected' 17-character format. Changing 
VINs likewise are very, very rare but not impossible (perhaps the most 
common instance would be something like an antique where they used the 
engine serial number as the VIN, only now it has had the engine replaced 
and the DMV insists it have a new number). A lot of natural PKs have 
similar oddities and corner cases that 99.99% of us will never 
encounter, but you don't want to be in the 0.01%. Artificial keys don't 
suffer these problems.
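
A sketch of what that looks like in practice (names invented for
illustration): the VIN stays a constrained, searchable attribute, but since
it may be absent or may change, the rows hang off an artificial key:

CREATE TABLE vehicle (
    vehicle_id bigserial PRIMARY KEY,  -- artificial key, stable for the row's lifetime
    vin        varchar(17) UNIQUE,     -- NULL allowed for vehicles without a usable VIN
    make       text NOT NULL,
    model      text NOT NULL
);

-- a DMV-mandated VIN change is a plain UPDATE; nothing that references
-- vehicle_id needs to be touched
UPDATE vehicle SET vin = '1FTRX17W12KA99999' WHERE vehicle_id = 42;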


--christine desmuke


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Rob Sargent

On 08/25/2015 04:27 PM, Gavin Flower wrote:

On 26/08/15 04:33, Marc Munro wrote:

On Sat, 2015-08-22 at 15:15 +, Melvin Davidson wrote:

I've been searching for a PostgreSQL Developer Best Practices with 
not

much luck,
so I've started my own. At the risk of stirring up a storm of 
controversy,

I would appreciate additional suggestions and feedback.


You might add: Create all relation names as plurals.

Or, if your site uses predominantly singular names, make that the
standard.  Consistency within the site is more important than any
dogmatic belief about whether singular or plural forms is better.  If
you don't put it in the standard, someone will eventually create tables
with names that don't gel with everything else.

__
Marc





Actually I would suggest standardising on singular names, not JUST 
because this is the standard I prefer!  :-)


But (also) because:

1. Singular words tend to be shorter

2. plurals are more ambiguous wrt spelling

3. there are other good reasons, that I've forgotten for now :-(
   (but I remember having them!!!)



4. Each tuple is an instance of entity, not entities :)

Cheers,
Gavin






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adrian Klaver

On 08/25/2015 04:23 PM, Jerry Sievers wrote:

Adrian Klaver adrian.kla...@aklaver.com writes:


On 08/25/2015 01:56 PM, John R Pierce wrote:


On 8/25/2015 1:42 PM, Gavin Flower wrote:

Account numbers are externally generated, and may potentially change.
Management might suddenly decide that they want to start using the
year the account started as the first 4 digits, or that the branch
code should be reflected in it, or something else.  The database
should be protected from these arbitrary changes.  Hence the
account_no is not a good candidate for a primary key.



such practices would raise total havoc on a traditional paper ledger
accounting system as well as things like pending AR/AP where external
companies will be referencing your account numbers.


Agreed, but it happens. When Lowes took over a local hardware
chain(Eagles) here in Washington state they moved very quickly on
changing the account numbers. The company I worked for who supplied
Eagles and then Lowes sat on a check for $22,000 that was sent to us
in error because the account numbers got switched. We called them when
we got the check, but it still took them six months to own up to it.


DOH!

Next time a screwball outfit sends you a check for $22k erroneously just
go deposit it :-)


Well that is what I wanted to do, the owner overruled me:(. Something 
about Lowes having more lawyers then we did. The strange part was we 
called them and told them what had happened and supplied the relevant 
information that explained the mix up. You would have thought us calling 
to return a check that was supposed to be to us would have raised a flag!





--
Adrian Klaver
adrian.kla...@aklaver.com





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Neil Tiffin

 On Aug 25, 2015, at 1:38 PM, Karsten Hilbert karsten.hilb...@gmx.net wrote:
 
 In most cases developers don’t care about index, unique, foreign key, or 
 primary key names (from a coding standpoint)
 
 Until the day they’d like to write a reliable database change script.

Not sure I understand.  Once the object is created the name is set, it does not 
change, so I don’t understand why it is not possible to write a reliable 
database change script.  Dump and restore maintain the name. Of course every 
project has periodic scripts that need to run, so these objects would, if they 
are dropped or manipulated in the script, have to be manually named, especially 
during development since the whole database might be dropped and recreated 
multiple times.  My original comment included that situation. My projects 
typically have many, many objects that once created are not referred to again, 
unless a DBA is doing some tuning or troubleshooting.  In that case, the DBA 
just looks up the name.

I can see if say 2 years later you want to create a development database from 
the original SQL that generated the original table definitions that could be 
problematic.  But I always have used the current definitions not the original 
and those can be exported with the current names.
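
For what it's worth, a change script can just look up or reuse the name the
server generated, since it stays fixed once the object exists (table, column
and constraint names below are only illustrative):

-- see what the unique constraint on accounts is actually called
SELECT conname
FROM pg_constraint
WHERE conrelid = 'accounts'::regclass AND contype = 'u';

-- example migration: widen the unique key to include a new column,
-- referring to the constraint by its generated name
ALTER TABLE accounts ADD COLUMN branch_code text;
ALTER TABLE accounts DROP CONSTRAINT accounts_account_no_key;
ALTER TABLE accounts ADD CONSTRAINT accounts_account_no_key
    UNIQUE (account_no, branch_code);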

It just seems like busy work to me, but I would love to be enlightened.

Neil



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] BDR: cannot remove node from group

2015-08-25 Thread Florin Andrei
Testing BDR for the first time, using the binary packages for Ubuntu 
10.04 provided at http://packages.2ndquadrant.com/bdr/apt/


Postgres 9.4.4 and BDR 0.9.2 (I think)

I'm loosely following this document:

http://bdr-project.org/docs/stable/quickstart-enabling.html

Except I've created two separate instances and I'm trying to replicate 
between instances.


I've created the bdrdemo database, and then I've created the extensions:

CREATE EXTENSION btree_gist;
CREATE EXTENSION bdr;

Then I did bdr_group_create on one node:

SELECT bdr.bdr_group_create(
local_node_name := 'pg-test1-dev-uswest2-aws',
node_external_dsn := 'port=5432 dbname=bdrdemo'
);

But then I've realized I need a host statement in node_external_dsn. So 
now I'm trying to remove this node:


SELECT bdr.bdr_part_by_node_names('{pg-test1-dev-uswest2-aws}');

But if I try to re-add it with the new parameters:

SELECT bdr.bdr_group_create(
local_node_name := 'pg-test1-dev-uswest2-aws',
node_external_dsn := 'host=pg-test1-dev-uswest2-aws port=5432 dbname=bdrdemo'
);

I get this:

ERROR:  This node is already a member of a BDR group
HINT:  Connect to the node you wish to add and run bdr_group_join from 
it instead


What do I need to do to start over? I want to delete all traces of the 
BDR configuration I've done so far.


--
Florin Andrei
http://florin.myip.org/


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread David G. Johnston
On Tue, Aug 25, 2015 at 6:27 PM, Gavin Flower gavinflo...@archidevsys.co.nz
 wrote:

 On 26/08/15 04:33, Marc Munro wrote:

 On Sat, 2015-08-22 at 15:15 +, Melvin Davidson wrote:

 I've been searching for a PostgreSQL Developer Best Practices with not
 much luck,
 so I've started my own. At the risk of stirring up a storm of
 controversy,
 I would appreciate additional suggestions and feedback.

 You might add: Create all relation names as plurals.

 Or, if your site uses predominantly singular names, make that the
 standard.  Consistency within the site is more important than any
 dogmatic belief about whether singular or plural forms is better.  If
 you don't put it in the standard, someone will eventually create tables
 with names that don't gel with everything else.

 __
 Marc





 Actually I would suggest standardising on singular names, not JUST
 because this is the standard I prefer!  :-)

 But (also) because:

 1. Singular words tend to be shorter

 2. plurals are more ambiguous wrt spelling

 3. there are other good reasons, that I've forgotten for now :-(
(but I remember having them!!!)


Because my first college IS professor taught it this way... I do like these other reasons though.

The fact that a table can have multiple rows is pretty much a given - and I
wouldn't suggest plural adherents name any singleton tables using the
singular form - so no information is lost.  Having since learned OO, the
class of something is labelled in the singular form, and in many ways a
relation definition is equivalent to a class definition.

​David J.​
​


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Jerry Sievers
Adrian Klaver adrian.kla...@aklaver.com writes:

 On 08/25/2015 01:56 PM, John R Pierce wrote:

 On 8/25/2015 1:42 PM, Gavin Flower wrote:
 Account numbers are externally generated, and may potentially change.
 Management might suddenly decide that they want to start using the
 year the account started as the first 4 digits, or that the branch
 code should be reflected in it, or something else.  The database
 should be protected from these arbitrary changes.  Hence the
 account_no is not a good candidate for a primary key.


 such practices would raise total havoc on a traditional paper ledger
 accounting system as well as things like pending AR/AP where external
 companies will be referencing your account numbers.

 Agreed, but it happens. When Lowes took over a local hardware
 chain(Eagles) here in Washington state they moved very quickly on
 changing the account numbers. The company I worked for who supplied
 Eagles and then Lowes sat on a check for $22,000 that was sent to us
 in error because the account numbers got switched. We called them when
 we got the check, but it still took them six months to own up to it.

DOH!

Next time a screwball outfit sends you a check for $22k erroneously just
go deposit it :-)

 -- 
 Adrian Klaver
 adrian.kla...@aklaver.com

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net
p: 312.241.7800


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adrian Klaver

On 08/25/2015 01:56 PM, John R Pierce wrote:

On 8/25/2015 1:42 PM, Gavin Flower wrote:

Account numbers are externally generated, and may potentially change.
Management might suddenly decide that they want to start using the
year the account started as the first 4 digits, or that the branch
code should be reflected in it, or something else.  The database
should be protected from these arbitrary changes.  Hence the
account_no is not a good candidate for a primary key.



such practices would raise total havoc on a traditional paper ledger
accounting system as well as things like pending AR/AP where external
companies will be referencing your account numbers.


Agreed, but it happens. When Lowes took over a local hardware 
chain(Eagles) here in Washington state they moved very quickly on 
changing the account numbers. The company I worked for who supplied 
Eagles and then Lowes sat on a check for $22,000 that was sent to us in 
error because the account numbers got switched. We called them when we 
got the check, but it still took them six months to own up to it.










--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 08:56, Adrian Klaver wrote:

On 08/25/2015 08:40 AM, Melvin Davidson wrote:

Adrian,

Stop being so technical. When we/I speak of natural keys, we are talking
about the column
that would NATURALly lend itself as the primary key.


Pretty sure this is a technical list:)


Don't let inconvenient facts get in the way of a good argument!  :-)

[...]
Pretty sure parts are not unique to an exact vehicle, unless you are 
talking about a totally handmade one. They are not even unique to make and 
model. As an example, I used to work on Class B Isuzu trucks. These 
models (FTR) were also built for Chevrolet as the Forward models. So 
right off the bat there were two part numbers for each part, one that 
started with 9 if you got it from Chevrolet and one with 11 from 
Isuzu, if memory serves. Then Isuzu decided to reorganize their part 
numbers, so that introduced another number, all pointing to the exact 
same part. Then there were those parts available from the parts 
houses (NAPA, etc.).


Then there was the greenhouse I worked for, where we supplied UPC-coded 
tags for our customers. In the beginning it was simple: the item 
portion of the UPC was unique and, with the company prefix, served as a 
'natural' key for the tags. Then the chain stores we worked with must 
have all gone to the same seminar on how to be Walmart and decided 
they did not want unique numbers, but UPCs tied to price groups that 
covered a variety of plants. Luckily, I was too stupid to 

Natural Stupidity??? :-)

(Sorry, couldn't resist!)

know surrogate keys were bad and had a sequence attached to the tag 
table. This then became the tag id and made life a lot easier during 
the transition. It still remains there, because people are people and 
'natural' tends to be artificial and transient.

Extremely good examples, I'll bear them in mind - makes me even more 
keen on surrogate primary keys.  I'm always very wary when people tell 
me some numbering scheme will NEVER change!!!


[...]


Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Marc Munro
On Sat, 2015-08-22 at 15:15 +, Melvin Davidson wrote:

 I've been searching for a PostgreSQL Developer Best Practices with not
 much luck,
 so I've started my own. At the risk of stirring up a storm of controversy,
 I would appreciate additional suggestions and feedback.


You might add: Create all relation names as plurals.

Or, if your site uses predominantly singular names, make that the
standard.  Consistency within the site is more important than any
dogmatic belief about whether singular or plural forms is better.  If
you don't put it in the standard, someone will eventually create tables
with names that don't gel with everything else.  

__
Marc





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Adrian Klaver

On 08/25/2015 02:23 PM, Gavin Flower wrote:

On 26/08/15 08:56, Adrian Klaver wrote:

On 08/25/2015 08:40 AM, Melvin Davidson wrote:

Adrian,

Stop being so technical. When we/I speak of natural keys, we are talking
about the column
that would NATURALly lend itself as the primary key.


Pretty sure this is a technical list:)


Don't let inconvenient facts get in the way of a good argument!  :-)




have all gone to the same seminar on how to be Walmart and decided
they did not want unique numbers, but UPCs tied to price groups that
covered a variety of plants. Luckily, I was too stupid to

Natural Stupidity??? :-)


Oh yeah and a long history too, but that needs at least a pitcher of 
beer to recount.




(Sorry, couldn't resist!)


know surrogate keys were bad and had a sequence attached to the tag
table. This then became the tag id and made life a lot easier during
the transition. It still remains there, because people are people and
'natural' tends to be artificial and transient.

Extremely good examples, I'll bear them in mind - makes me even more
keen on surrogate primary keys.  I'm always very wary when people tell
me some numbering scheme will NEVER change!!!


To add a recent one. My partner Nancy signed up for Medicare last year 
to avoid the premium penalty. This year in July she signed up for Social 
Security. Turns out, for reasons I do not understand, 
CMS (https://www.cms.gov/) changes the Medicare account number at that 
point. The reason we even cared is that the billing system thinks she 
has two accounts and is double billing. Time on the phone with someone 
at CMS was not enlightening. We were told to trust the system and that 
eventually it would work itself out. Still waiting :(




[...]


Cheers,
Gavin




--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Developer Best Practices

2015-08-25 Thread Gavin Flower

On 26/08/15 02:17, Adrian Klaver wrote:
[...]


2) One of the older unique natural keys (genus, species) is not so 
unique. I am a fisheries biologist by training and in my time the 
'unique' identifier for various fishes has changed. Now that 
ichthyologists have discovered DNA testing, it can be expected there 
will be even more changes. This is even more apparent when you go back 
in history. As an example:


https://en.wikipedia.org/wiki/Rainbow_trout

Rainbow trout

Current

Oncorhynchus mykiss

Past

Salmo mykiss Walbaum, 1792
Parasalmo mykiss (Walbaum, 1792)

[...]


Salmo gilberti Jordan, 1894
Salmo nelsoni Evermann, 1908

So you probably need a date stamp so you could record things relating to 
the correct name for a given period in a mapping table, and still relate 
to the same surrogate key for referencing other tables.
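
A rough sketch of that mapping-table idea (schema invented for illustration):
the species keeps one surrogate key that everything else references, while
the names and their validity periods live in a side table:

CREATE TABLE species (
    species_id  bigserial PRIMARY KEY,  -- stable surrogate key
    common_name text
);

CREATE TABLE species_name (
    species_id bigint NOT NULL REFERENCES species (species_id),
    name       text   NOT NULL,         -- e.g. 'Oncorhynchus mykiss'
    valid_from date   NOT NULL,
    valid_to   date,                    -- NULL = still the accepted name
    PRIMARY KEY (species_id, name, valid_from)
);

-- observations reference the surrogate key, so a renaming never touches them
CREATE TABLE observation (
    observation_id bigserial PRIMARY KEY,
    species_id     bigint NOT NULL REFERENCES species (species_id),
    observed_on    date   NOT NULL
);

-- the accepted name for a species at a given date
SELECT name
FROM species_name
WHERE species_id = 1
  AND valid_from <= DATE '2015-08-25'
  AND (valid_to IS NULL OR valid_to > DATE '2015-08-25');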


Maybe even worse, is when a species is suddenly found to be 2 or more 
distinct species!


Something similar could happen with account numbers: 2 companies with 
similar names might be assigned to the same account number, and lots of 
transactions recorded before the mistake is discovered. Obviously 
a surrogate key would not give you complete protection from a lot of 
work sorting the mess out, but it would probably help!


I read a post a year or 2 back: a guy in Europe had at least 4 
different variations on his name depending on the country he was in and 
the local language and cultural norms.


When I worked at a freezing works in the 1970's in Auckland, I heard 
that the payroll allowed for over 52 different names per employee (per 
year?), though I was never told the maximum number of name changes ever used.  
Essentially management might fire someone, but the union would complain, 
and they would be rehired under a different name - so I was told!  So 
the correct holiday pay & PAYE tax deductions would still relate to the 
same individual no matter how many name changes they had.



Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem with pl/python procedure connecting to the internet

2015-08-25 Thread Adrian Klaver

On 08/25/2015 01:30 AM, Igor Sosa Mayor wrote:

Igor Sosa Mayor joseleopoldo1...@gmail.com writes:


Igor Sosa Mayor joseleopoldo1...@gmail.com writes:


My question is therefore:
1. is there a way to permit the pl/python to connect to the internet all
the time and with a better configuration?
2. or should I forget the procedure and write a python script outside
the database?


I'm back again with the latest update. The problem seems to be that for some
reason PG does not realize that there is a network connection.

More precisely:
1. I start the computer
2. PG starts and I can use it normally, EXCEPT from the plpython
procedure which does not connect


At this point can you connect to the Postgres server on your laptop from 
another machine?



3. then I restart PG (with systemd) and it works. PG seems to see that
there is a connection.


Best guess, since this is a laptop, the network is not connected until 
NetworkManager (or something similar) is active. On my laptop that 
happens late in the startup sequence, after Postgres starts. This is 
especially true if you are connecting to a wireless AP.
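
If that is what is going on, one thing to try (just a sketch - the exact unit
name depends on the distro packaging, and network-online.target only helps if
something like NetworkManager-wait-online.service is enabled) is a systemd
drop-in that delays Postgres until the network is really up:

# /etc/systemd/system/postgresql.service.d/wait-for-network.conf
[Unit]
Wants=network-online.target
After=network-online.target

# then: systemctl daemon-reload && systemctl restart postgresql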




Maybe it has something to do with systemd? I'm trying to get all debug
information, but I can't see anything strange in the log. In order not
to flood the list with an attachment, I've posted the log here:
http://pastie.org/10373991


FYI, include the time in your log prefix; it gives some idea of how far 
apart the events are happening. Also, logging NOTICE and above is a good 
place to start; DEBUG buries you in a mass of detail which may or may not 
be relevant.
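
Something along these lines in postgresql.conf would do it (values are just a
starting point):

log_line_prefix  = '%m [%p] '   # timestamp with milliseconds, plus backend PID
log_min_messages = notice       # instead of debug; adjust as needed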




Any ideas?

Thanks in advance!






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Why this lock?

2015-08-25 Thread Johann Spies
I have a long-running query (running now for more than 6 days already;
process 17434).  It involves three tables, of which one contains XML data.

On another, unrelated table with 30718567 records, I ran a query to create
an index on a field.  This morning I cancelled this process because it did
not finish after 5 days.

I then did a vacuum analyze on that table and reran the query (process
9732) to create the index.  It soon stalled again and the following result
shows that process 17434 is blocking it:

blocked request:  locktype = virtualxid, virtualxid = 6/24891,
                  virtualtransaction = 7/27906, pid = 9732,
                  mode = ShareLock, granted = f, fastpath = f
blocking holder:  virtualtransaction = 6/24891, pid = 17434,
                  mode = ExclusiveLock, granted = t
(all other columns - database, relation, page, tuple, transactionid,
 classid, objid, objsubid - were empty)
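
For reference, output in that shape typically comes from a pg_locks self-join
along these lines (a sketch; it matches only on virtualxid, which is the lock
type involved here):

SELECT w.locktype, w.virtualxid,
       w.virtualtransaction AS waiting_vxid, w.pid AS waiting_pid,
       w.mode AS waiting_mode, w.granted, w.fastpath,
       h.virtualtransaction AS holding_vxid, h.pid AS holding_pid,
       h.mode AS holding_mode, h.granted AS holding_granted
FROM pg_locks w
JOIN pg_locks h ON h.locktype = w.locktype
               AND h.virtualxid = w.virtualxid
               AND h.pid <> w.pid
WHERE NOT w.granted;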


Now my questions:

What would cause such a lock?
What can I do to remove the lock without stopping the long-running process
which started before the query to create the index?  I suppose I just have
to wait for the first process to finish...

Regards
Johann
-- 
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)


Re: [GENERAL] BDR: cannot remove node from group

2015-08-25 Thread Craig Ringer
On 26 August 2015 at 07:19, Florin Andrei flo...@andrei.myip.org wrote:

 What do I need to do to start over? I want to delete all traces of the BDR
 configuration I've done so far.

you need to DROP the database you removed, then re-create it as a new
empty database. You cannot re-join a node that has been removed.
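
In other words, on the node you parted, something like this (a sketch only;
adjust database name, node name and DSN to your setup, and run the DROP from
a different database, e.g. psql -d postgres, with nothing else connected to
bdrdemo):

DROP DATABASE bdrdemo;
CREATE DATABASE bdrdemo;

\c bdrdemo
CREATE EXTENSION btree_gist;
CREATE EXTENSION bdr;

SELECT bdr.bdr_group_create(
local_node_name := 'pg-test1-dev-uswest2-aws',
node_external_dsn := 'host=pg-test1-dev-uswest2-aws port=5432 dbname=bdrdemo'
);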


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general