Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-18 Thread Rodrigo Sakai
  Thanks for all responses! I agree with most of you, and say that the RI is
best maintened by Database ! Performance must be improved in other ways
(indexes, hardware, etc)!


- Original Message - 
From: Jim C. Nasby [EMAIL PROTECTED]
To: Craig A. James [EMAIL PROTECTED]
Cc: PFC [EMAIL PROTECTED]; Michael Glaesemann [EMAIL PROTECTED];
Rodrigo Sakai [EMAIL PROTECTED];
pgsql-performance@postgresql.org
Sent: Wednesday, April 12, 2006 5:59 PM
Subject: Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE


 On Wed, Apr 12, 2006 at 10:36:28AM -0700, Craig A. James wrote:
  Jim C. Nasby wrote:
  1. You have only one application that modifies the data.  (Otherwise,
you
  have to duplicate the rules across many applications, leading to a
  code-maintenance nightmare).
  
  You forgot something:
  
  1a: You know that there will never, ever, ever, ever, be any other
  application that wants to talk to the database.
  
  I know tons of people that get burned because they go with something
  that's good enough for now, and then regret that decision for years
to
  come.
 
  No, I don't agree with this.  Too many people waste time designing for
  what if... scenarios that never happen.  You don't want to be dumb and
  design something that locks out a foreseeable and likely future need,
but
  referential integrity doesn't meet this criterion.  There's nothing to
keep
  you from changing from app-managed to database-managed referential
  integrity if your needs change.

 In this case your argument makes no sense, because you will spend far
 more time re-creating RI capability inside an application than if you
 just use what the database offers natively.

 It's certainly true that you don't want to over-engineer for no reason,
 but many times choices are made to save a very small amount of time or
 hassle up-front, and those choices become extremely painful later.
 -- 
 Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
 Pervasive Software  http://pervasive.comwork: 512-231-6117
 vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Craig A. James
  I think this is an old question, but I want to know if it really 
is  well worth to not create some foreign keys an deal with the 
referential  integrity at application-level?



Trust me : do it in the application and you'll enter a world of 
hurt. I'm  doing it with some mysql apps, and it's a nightmare ; doing 
cascaded  delete's by hand, etc, you always forget something, you have 
to modify a  million places in your code everytime you add a new table, 
your ORM  bloats, you get to write cleanup cron scripts which take 
forever to run,  your website crashes etc.


All good advice, but... there are no absolutes in this world.  
Application-enforced referential integrity makes sense if (and probably ONLY 
if):

1. You have only one application that modifies the data.  (Otherwise, you have 
to duplicate the rules across many applications, leading to a code-maintenance 
nightmare).

2. If your application crashes and leaves a mess, it's not a catastrophe, and 
you have a good way to clean it up.  For example, a bank shouldn't do this, but 
it might be OK for a computer-aided-design application, or the backend of a 
news web site.

3. You have application-specific knowledge about when you can skip referential 
integrity and thereby greatly improve performance.  For example, you may have 
batch operations where large numbers of rows are temporarily inconsistent.

If your application doesn't meet ALL of these criteria, you probably should use 
the database for referential integrity.

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Rodrigo Sakai
  Thanks for all help!! But my problem is with performance, I agree with all
of you, the RI must be maintained by the database, because a bunch of
reasons that everyone knows!
  But, I'm dealing with a very huge database that servers more than 200
clientes at the same time, and because of it, each manipulation (delete,
insert, update, select) on the database have a poor performance. So, if we
deal with RI in each client station, we take this work off the database!
  The application is an ERP developed with DELPHI + (postgresql or oracle or
sql server)!!

  Thanks again!!

- Original Message - 
From: Markus Schaber [EMAIL PROTECTED]
To: pgsql-performance@postgresql.org
Cc: Rodrigo Sakai [EMAIL PROTECTED]
Sent: Wednesday, April 12, 2006 10:18 AM
Subject: Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE


 Hi, Michael,
 Hi, Rodrigo,

 Michael Glaesemann wrote:

  If I had to choose between one or the other, I'd leave all  referential
  integrity in the database and deal with the errors thrown  when
  referential integrity is violated in the application. PostgreSQL  is
  designed to handle these kinds of issues. Anything you code in  your
  application is more likely to contain bugs or miss corner cases  that
  would allow referential integrity to be violated. PostgreSQL has  been
  pounded on for years by a great many users and developers,  making the
  likelihood of bugs still remaining much smaller.

 I strictly agree with Michael here.

  Of course, you can add some referential integrity checks in your
  application code, but those should be in addition to your database-
  level checks.

 Agree. It does make sense to have reference checks in the UI or
 application level for the sake of better error handling, but the
 database should be the mandatory judge.

 There's another advantage of database based checking: Should there ever
 be the need of a different application working on the same database (e.
 G. an expert level UI, or some connector that connects / synchronizes
 to another software, or a data import tool), database based constraints
 cannot be broken opposed to application based ones.

 HTH,
 Markus
 -- 
 Markus Schaber | Logical TrackingTracing International AG
 Dipl. Inf. | Software Development GIS

 Fight against software patents in EU! www.ffii.org
www.nosoftwarepatents.org



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Merlin Moncure
On 4/11/06, Rodrigo Sakai [EMAIL PROTECTED] wrote:

   Hi,

   I think this is an old question, but I want to know if it really is well
 worth to not create some foreign keys an deal with the referential integrity
 at application-level?
   Specifically, the system we are developing is a server/cliente
 architecture that the server is the database and the fat client is an
 application developed in DELPHI!!!

   Thanks in advance!!

Delphi IMO is the best RAD win32 IDE ever invented (especially when
paired the very excellent Zeos connection objects).  However, for
purposes of data management ,imperative languages, Delphi included,
simply suck.  Great form editor though.

merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread PFC


What kinds of operations are made slow by foreign key checks ? Is it :
- Simple checks on INSERT ?
- Simple checks on UPDATE ?
- Cascaded deletes ?
- Cascaded updates ?
- Locks ?
- Something else ?

	Foreign keys are to ensure that the value in a column is always part of a  
specific set (the referenced table). If this set changes very rarely, like  
the list of countries in the world, or the list of states in a country, or  
the various possible states an order can be in (received, processed,  
shipped...) then it's alright to skip the check if you're sure your  
application will insert a valid value. For some other situations, doing  
the check in the application will need some queries and could be slower  
(and certainly will be more complicated...)


	Are you sure you're not missing a few indexes, which would then force  
fkey checks to use sequential scans ?


  Thanks for all help!! But my problem is with performance, I agree with  
all

of you, the RI must be maintained by the database, because a bunch of
reasons that everyone knows!
  But, I'm dealing with a very huge database that servers more than 200
clientes at the same time, and because of it, each manipulation (delete,
insert, update, select) on the database have a poor performance. So, if  
we

deal with RI in each client station, we take this work off the database!
  The application is an ERP developed with DELPHI + (postgresql or  
oracle or

sql server)!!


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Scott Marlowe
On Wed, 2006-04-12 at 09:49, Rodrigo Sakai wrote:
   Thanks for all help!! But my problem is with performance, I agree with all
 of you, the RI must be maintained by the database, because a bunch of
 reasons that everyone knows!
   But, I'm dealing with a very huge database that servers more than 200
 clientes at the same time, and because of it, each manipulation (delete,
 insert, update, select) on the database have a poor performance. So, if we
 deal with RI in each client station, we take this work off the database!
   The application is an ERP developed with DELPHI + (postgresql or oracle or
 sql server)!!

These are separate issues.

One is performance of PostgreSQL handling FK-PK relationships.
PostgreSQL, in my experience, is quite fast at this.  However, there are
ways you can set up FK-PK relationships that are non-optimal and will
result in poor performance.  FK-PK relationships are generally fastest
when they are 1-1 and based on integer types.  If there's a type
mismatch, or you use slower types, like large text fields, or numerics,
you may have poor performance.  Give us a sample of your schema where
you're having problems, let us help you troubleshoot your performance.

High parallel load is another issue.  No matter where you put your
FK-PK relationship handling, having 200+ users connected at the same
time and manipulating your database is a heavy load.

Handling FK-PK relationships in software often is vulnerable to race
conditions.  Like so:  (T1 and T2 are different threads)

T1: select id from mastertable where id=99; -- check for row
T2: delete from mastertable where id=99; -- delete a row
T1: insert into slavetable values (); -- whoops!  No master

If we change the T1 to select for update, we now have the overhead that
most FK-PK relationships have.

What version of PostgreSQL are you running.  Older versions had much
poorer performance than newer versions when updating FK-PK
relationships.

Don't assume that application level FK-PK relationships will be faster
AND as good as the ones at database level.  It's quite possible that
they're faster for you because you're cutting corners, referentially
speaking, and your data will wind up incoherent over time.

Also, you may be dealing with a database that is IO bound, and moving
the FK checks to software is only a short stop gap, and as the machine
hits the IO performance ceiling, you'll have the same problem again,
need a bigger machine, and have incoherent data.  I.e. the same problem,
plus a few more, and have spent a lot of time spinning your wheels going
a short distance.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Jim C. Nasby
On Wed, Apr 12, 2006 at 09:22:52AM +0200, PFC wrote:
 
   I think this is an old question, but I want to know if it really is  
 well worth to not create some foreign keys an deal with the referential  
 integrity at application-level?
 
   Trust me : do it in the application and you'll enter a world of 
   hurt. I'm  doing it with some mysql apps, and it's a nightmare ; doing 
 cascaded  delete's by hand, etc, you always forget something, you have to 
 modify a  million places in your code everytime you add a new table, your 
 ORM  bloats, you get to write cleanup cron scripts which take forever to 
 run,  your website crashes etc.

Well, yeah, thats typical for MySQL sites, but what's that have to do
with RI?

Sorry, couldn't resist. :P

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Jim C. Nasby
On Wed, Apr 12, 2006 at 07:45:17AM -0700, Craig A. James wrote:
 All good advice, but... there are no absolutes in this world.  
 Application-enforced referential integrity makes sense if (and probably 
 ONLY if):
 
 1. You have only one application that modifies the data.  (Otherwise, you 
 have to duplicate the rules across many applications, leading to a 
 code-maintenance nightmare).

You forgot something:

1a: You know that there will never, ever, ever, ever, be any other
application that wants to talk to the database.

I know tons of people that get burned because they go with something
that's good enough for now, and then regret that decision for years to
come.

 2. If your application crashes and leaves a mess, it's not a catastrophe, 
 and you have a good way to clean it up.  For example, a bank shouldn't do 
 this, but it might be OK for a computer-aided-design application, or the 
 backend of a news web site.
 
 3. You have application-specific knowledge about when you can skip 
 referential integrity and thereby greatly improve performance.  For 
 example, you may have batch operations where large numbers of rows are 
 temporarily inconsistent.
 
 If your application doesn't meet ALL of these criteria, you probably should 
 use the database for referential integrity.
 
 Craig
 
 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
   http://www.postgresql.org/docs/faq
 

-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Craig A. James

Jim C. Nasby wrote:
1. You have only one application that modifies the data.  (Otherwise, you 
have to duplicate the rules across many applications, leading to a 
code-maintenance nightmare).


You forgot something:

1a: You know that there will never, ever, ever, ever, be any other
application that wants to talk to the database.

I know tons of people that get burned because they go with something
that's good enough for now, and then regret that decision for years to
come.


No, I don't agree with this.  Too many people waste time designing for what 
if... scenarios that never happen.  You don't want to be dumb and design something 
that locks out a foreseeable and likely future need, but referential integrity doesn't 
meet this criterion.  There's nothing to keep you from changing from app-managed to 
database-managed referential integrity if your needs change.

Design for your current requirements.


Let us be of good cheer, remembering that the misfortunes hardest to bear are 
those which never happen.		- James Russell Lowell (1819-1891)


Therefore do not be anxious about tomorrow, for tomorrow will be anxious for 
itself.  Let the day's own trouble be sufficient for the day.

- Matthew 6:34

Craig

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Jim C. Nasby
On Wed, Apr 12, 2006 at 10:36:28AM -0700, Craig A. James wrote:
 Jim C. Nasby wrote:
 1. You have only one application that modifies the data.  (Otherwise, you 
 have to duplicate the rules across many applications, leading to a 
 code-maintenance nightmare).
 
 You forgot something:
 
 1a: You know that there will never, ever, ever, ever, be any other
 application that wants to talk to the database.
 
 I know tons of people that get burned because they go with something
 that's good enough for now, and then regret that decision for years to
 come.
 
 No, I don't agree with this.  Too many people waste time designing for 
 what if... scenarios that never happen.  You don't want to be dumb and 
 design something that locks out a foreseeable and likely future need, but 
 referential integrity doesn't meet this criterion.  There's nothing to keep 
 you from changing from app-managed to database-managed referential 
 integrity if your needs change.

In this case your argument makes no sense, because you will spend far
more time re-creating RI capability inside an application than if you
just use what the database offers natively.

It's certainly true that you don't want to over-engineer for no reason,
but many times choices are made to save a very small amount of time or
hassle up-front, and those choices become extremely painful later.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-12 Thread Craig A. James

Jim C. Nasby wrote:
No, I don't agree with this.  Too many people waste time designing for 
what if... scenarios that never happen.  You don't want to be dumb and 
design something that locks out a foreseeable and likely future need, but 
referential integrity doesn't meet this criterion.  There's nothing to keep 
you from changing from app-managed to database-managed referential 
integrity if your needs change.


In this case your argument makes no sense, because you will spend far
more time re-creating RI capability inside an application than if you
just use what the database offers natively.


But one of the specific conditions in my original response was, You have 
application-specific knowledge about when you can skip referential integrity and thereby 
greatly improve performance.  If you can't do that, I agree with you.

Anyway, this discussion is probably going on too long, and I'm partly to blame. 
 I think we all agree that in almost all situations, using the database to do 
referential integrity is the right choice, and that you should only violate 
this rule if you have a really, really good reason, and you've thought out the 
implications carefully, and you know you may have to pay a steep price later if 
your requirements change.

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-11 Thread Rodrigo Sakai



 Hi,

 I think this is an old question, but I want 
to know if it really is well worth to not create some foreign 
keys an deal with the referential integrity at 
application-level?
 Specifically, the system we are developing 
is a server/cliente architecture that the server is the database and the fat 
client is an application developed in DELPHI!!!

 Thanks in 
advance!!


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-11 Thread Scott Marlowe
On Tue, 2006-04-11 at 14:13, Rodrigo Sakai wrote:
   Hi,
  
   I think this is an old question, but I want to know if it really is
 well worth to not create some foreign keys an deal with the
 referential integrity at application-level?
   Specifically, the system we are developing is a server/cliente
 architecture that the server is the database and the fat client is an
 application developed in DELPHI!!!
  

If ref integrity is important, you'll have to do it either in the app or
the database.

Almost always, it's faster to let the database do it, as there's less
traffic across the wire required to maintain ref integrity, plus, the
guys who wrote the database have spent years making sure race conditions
won't scram your data.

For simple, straight forward FK-PK relationships, you will likely NOT
be able to beat the database in terms of either reliability or
performance with your own code.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-11 Thread Michael Glaesemann


On Apr 12, 2006, at 4:13 , Rodrigo Sakai wrote:

  I think this is an old question, but I want to know if it really  
is well worth to not create some foreign keys an deal with the  
referential integrity at application-level?


If I had to choose between one or the other, I'd leave all  
referential integrity in the database and deal with the errors thrown  
when referential integrity is violated in the application. PostgreSQL  
is designed to handle these kinds of issues. Anything you code in  
your application is more likely to contain bugs or miss corner cases  
that would allow referential integrity to be violated. PostgreSQL has  
been pounded on for years by a great many users and developers,  
making the likelihood of bugs still remaining much smaller.


Of course, you can add some referential integrity checks in your  
application code, but those should be in addition to your database- 
level checks.


Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-11 Thread Jim C. Nasby
On Wed, Apr 12, 2006 at 08:06:17AM +0900, Michael Glaesemann wrote:
 
 On Apr 12, 2006, at 4:13 , Rodrigo Sakai wrote:
 
   I think this is an old question, but I want to know if it really  
 is well worth to not create some foreign keys an deal with the  
 referential integrity at application-level?
 
 If I had to choose between one or the other, I'd leave all  
 referential integrity in the database and deal with the errors thrown  
 when referential integrity is violated in the application. PostgreSQL  
 is designed to handle these kinds of issues. Anything you code in  
 your application is more likely to contain bugs or miss corner cases  
 that would allow referential integrity to be violated. PostgreSQL has  
 been pounded on for years by a great many users and developers,  
 making the likelihood of bugs still remaining much smaller.

It's also pretty unlikely that you can make RI in the application
perform better than in the database.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE

2006-04-11 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 On Wed, Apr 12, 2006 at 08:06:17AM +0900, Michael Glaesemann wrote:
 ... Anything you code in  
 your application is more likely to contain bugs or miss corner cases  
 that would allow referential integrity to be violated. PostgreSQL has  
 been pounded on for years by a great many users and developers,  
 making the likelihood of bugs still remaining much smaller.

 It's also pretty unlikely that you can make RI in the application
 perform better than in the database.

I think the traditional assumption among the you should do RI in the
application crowd is that the application has higher-level knowledge
that lets it understand when it can skip doing an RI check entirely.
Skipping an RI check is always faster than doing it --- so that's right,
it's faster.  As long as you don't make any mistakes.

The question you have to ask yourself is whether you are really that
smart ... not just today, but every single time.  To quote Clint
Eastwood: Do you feel lucky punk?  Well, do you?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly