Re: [HACKERS] update/insert, delete/insert efficiency WRT vacuum

2006-07-05 Thread Hannu Krosing
Ühel kenal päeval, T, 2006-07-04 kell 14:53, kirjutas Zeugswetter
Andreas DCP SD:
   Is there a difference in PostgreSQL performance between these two 
   different strategies:
   
   
   if(!exec(update foo set bar='blahblah' where name = 'xx'))
   exec(insert into foo(name, bar) values('xx','blahblah'); or
 
 In pg, this strategy is generally more efficient, since a pk failing
 insert would create
 a tx abort and a heap tuple. (so in pg, I would choose the insert first
 strategy only when 
 the insert succeeds most of the time (say  95%))
 
 Note however that the above error handling is not enough, because two
 different sessions
 can still both end up trying the insert (This is true for all db systems
 when using this strategy).

I think the recommended strategy is to first try tu UPDATE, if not found
then INSERT, if primary key violation on insert, then UPDATE


-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] update/insert,

2006-07-05 Thread Mark Woodward
 On Tue, Jul 04, 2006 at 11:59:27AM +0200, Zdenek Kotala wrote:
 Mark,
 I don't know how it will exactly works in postgres but my expectations
 are:

 Mark Woodward wrote:
 Is there a difference in PostgreSQL performance between these two
 different strategies:
 
 
 if(!exec(update foo set bar='blahblah' where name = 'xx'))
 exec(insert into foo(name, bar) values('xx','blahblah');
 or

 The update code generates new tuple in the datafile and pointer has been
 changed in the indexfile to the new version of tuple. This action does
 not generate B-Tree structure changes. If update falls than insert
 command creates new tuple in the datafile and it adds new item into
 B-Tree. It should be generate B-Tree node split.

 Actually, not true. Both versions will generate a row row and create a
 new index tuple. The only difference may be that in the update case the
 may be a ctid link from the old version to the new one, but that's
 about it...

 Which is faster will probably depends on what is more common in your DB:
 row already exists or not. If you know that 99% of the time the row
 will exist, the update will probably be faster because you'll only
 execute one query 99% of the time.

OK, but the point of the question is that constantly updating a single row
steadily degrades performance, would delete/insery also do the same?

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] update/insert,

2006-07-05 Thread Andrew Dunstan

Mark Woodward wrote:


On Tue, Jul 04, 2006 at 11:59:27AM +0200, Zdenek Kotala wrote:
   


Mark,
I don't know how it will exactly works in postgres but my expectations
are:

Mark Woodward wrote:
 


Is there a difference in PostgreSQL performance between these two
different strategies:


if(!exec(update foo set bar='blahblah' where name = 'xx'))
  exec(insert into foo(name, bar) values('xx','blahblah');
or
   


The update code generates new tuple in the datafile and pointer has been
changed in the indexfile to the new version of tuple. This action does
not generate B-Tree structure changes. If update falls than insert
command creates new tuple in the datafile and it adds new item into
B-Tree. It should be generate B-Tree node split.
 


Actually, not true. Both versions will generate a row row and create a
new index tuple. The only difference may be that in the update case the
may be a ctid link from the old version to the new one, but that's
about it...

Which is faster will probably depends on what is more common in your DB:
row already exists or not. If you know that 99% of the time the row
will exist, the update will probably be faster because you'll only
execute one query 99% of the time.
   



OK, but the point of the question is that constantly updating a single row
steadily degrades performance, would delete/insery also do the same?

 




If that was the point of the question, you should have said so.

And unless I am much mistaken the answer is of course it will.

cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] update/insert,

2006-07-05 Thread Zeugswetter Andreas DCP SD

 OK, but the point of the question is that constantly updating 
 a single row steadily degrades performance, would 
 delete/insery also do the same?

Yes, there is currently no difference (so you should do the update).
Of course performance only degrades if vaccuum is not setup correctly.

Andreas

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] update/insert,

2006-07-05 Thread mark
On Wed, Jul 05, 2006 at 04:59:52PM +0200, Zeugswetter Andreas DCP SD wrote:
  OK, but the point of the question is that constantly updating 
  a single row steadily degrades performance, would 
  delete/insery also do the same?
 Yes, there is currently no difference (so you should do the update).
 Of course performance only degrades if vaccuum is not setup correctly.

As Martijn pointed out, there are two differences. One almost
insignificant having to do with internal linkage. The other that
multiples queries are being executed. I would presume with separate
query plans, and so on, therefore you should do the update.

For the case you are talking about, the difference is:

 1) Delete which will always succeed
 2) Insert that will probably succeed

Vs:

 1) Update which if it succeeds, will stop
 2) Insert that will probably succeed

In the first case, you are always executing two queries. In the second,
you can sometimes get away with only one query.

Note what other people mentioned, though, that neither of the above is
safe against parallel transactions updating or inserting rows with the
same key.

In both cases, a 'safe' implementation should loop if 2) fails and
restart the operation.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] update/insert,

2006-07-05 Thread Joshua D. Drake

  Which is faster will probably depends on what is more common in your DB:
  row already exists or not. If you know that 99% of the time the row
  will exist, the update will probably be faster because you'll only
  execute one query 99% of the time.

 OK, but the point of the question is that constantly updating a single row
 steadily degrades performance, would delete/insery also do the same?

Yes. Delete still creates a dead row. There are programatic ways around this
but keeping a delete table that can be truncated at intervals.

Joshua D. Drake



 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] update/insert, delete/insert efficiency WRT vacuum and

2006-07-04 Thread Zdenek Kotala

Mark,
I don't know how it will exactly works in postgres but my expectations are:

Mark Woodward wrote:

Is there a difference in PostgreSQL performance between these two
different strategies:


if(!exec(update foo set bar='blahblah' where name = 'xx'))
exec(insert into foo(name, bar) values('xx','blahblah');
or


The update code generates new tuple in the datafile and pointer has been 
changed in the indexfile to the new version of tuple. This action does 
not generate B-Tree structure changes. If update falls than insert 
command creates new tuple in the datafile and it adds new item into 
B-Tree. It should be generate B-Tree node split.




exec(delete from foo where name = 'xx');
exec(insert into foo(name, bar) values('xx','blahblah');



Both commands should generate B-Tree structure modification.

I expect that first variant is better, but It should depend on many 
others things - for examples triggers, other indexes ...



REPLACE/UPSERT command solves this problem, but It is still in the TODO 
list.


Zdenek

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] update/insert, delete/insert efficiency WRT vacuum and

2006-07-04 Thread Martijn van Oosterhout
On Tue, Jul 04, 2006 at 11:59:27AM +0200, Zdenek Kotala wrote:
 Mark,
 I don't know how it will exactly works in postgres but my expectations are:
 
 Mark Woodward wrote:
 Is there a difference in PostgreSQL performance between these two
 different strategies:
 
 
 if(!exec(update foo set bar='blahblah' where name = 'xx'))
 exec(insert into foo(name, bar) values('xx','blahblah');
 or
 
 The update code generates new tuple in the datafile and pointer has been 
 changed in the indexfile to the new version of tuple. This action does 
 not generate B-Tree structure changes. If update falls than insert 
 command creates new tuple in the datafile and it adds new item into 
 B-Tree. It should be generate B-Tree node split.

Actually, not true. Both versions will generate a row row and create a
new index tuple. The only difference may be that in the update case the
may be a ctid link from the old version to the new one, but that's
about it...

Which is faster will probably depends on what is more common in your DB:
row already exists or not. If you know that 99% of the time the row
will exist, the update will probably be faster because you'll only
execute one query 99% of the time.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] update/insert, delete/insert efficiency WRT vacuum and

2006-07-04 Thread Zeugswetter Andreas DCP SD

  Is there a difference in PostgreSQL performance between these two 
  different strategies:
  
  
  if(!exec(update foo set bar='blahblah' where name = 'xx'))
  exec(insert into foo(name, bar) values('xx','blahblah'); or

In pg, this strategy is generally more efficient, since a pk failing
insert would create
a tx abort and a heap tuple. (so in pg, I would choose the insert first
strategy only when 
the insert succeeds most of the time (say  95%))

Note however that the above error handling is not enough, because two
different sessions
can still both end up trying the insert (This is true for all db systems
when using this strategy).

Andreas

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] update/insert, delete/insert efficiency WRT vacuum and MVCC

2006-07-03 Thread Mark Woodward
Is there a difference in PostgreSQL performance between these two
different strategies:


if(!exec(update foo set bar='blahblah' where name = 'xx'))
exec(insert into foo(name, bar) values('xx','blahblah');
or
exec(delete from foo where name = 'xx');
exec(insert into foo(name, bar) values('xx','blahblah');

In my session handler code I can do either, but am curious if it makes any
difference. Yes, name is unique.

---(end of broadcast)---
TIP 6: explain analyze is your friend