[GENERAL] reserving space in a rec for future update

2007-11-14 Thread Gauthier, Dave
Hi:

 

I have a situation where I will be inserting thousands of records into a
table but leaving 2 of it's columns null.  Later on, I will be updating
most of those records and putting real values in  place of those 2
nulls. As for the ones that do not get updated, I want to leave them
null.   My concern has to do with record fragmentation at the time of
update because there's no room to expand them to accept the non-null
data.  (BTW, the columns are floating point).

 

Is there a way to initially insert nulls, but reserve space for the
future update (and avoid record fragmentation)?  

Is my record fragmentation concern unfounded?

 

Thanks in Advance !

 



Re: [GENERAL] reserving space in a rec for future update

2007-11-14 Thread Scott Marlowe
On Nov 14, 2007 9:28 AM, Gauthier, Dave [EMAIL PROTECTED] wrote:

 Hi:

 I have a situation where I will be inserting thousands of records into a
 table but leaving 2 of it's columns null.  Later on, I will be updating most
 of those records and putting real values in  place of those 2 nulls. As for
 the ones that do not get updated, I want to leave them null.   My concern
 has to do with record fragmentation at the time of update because there's no
 room to expand them to accept the non-null data.  (BTW, the columns are
 floating point).

I don't think you really understand how PostgreSQL storage works.
every update is the exact same as a delete / insert in terms of
storage.  So, you're worrying about a problem that doesn't exist.
Read up

Read up on it here:  http://www.postgresql.org/docs/8.2/static/mvcc.html

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] reserving space in a rec for future update

2007-11-14 Thread Andrew Sullivan
On Wed, Nov 14, 2007 at 10:28:30AM -0500, Gauthier, Dave wrote:
 null.   My concern has to do with record fragmentation at the time of
 update because there's no room to expand them to accept the non-null
 data.  (BTW, the columns are floating point).

You have a mistaken idea about how this works.

 Is there a way to initially insert nulls, but reserve space for the
 future update (and avoid record fragmentation)?  

No.

 Is my record fragmentation concern unfounded?

Sort of.

The way this will work in Postgres is that, when you UPDATE the row, the old
row will be marked dead, and a _new_ row will be written out with the new
data.  You will need to perform VACUUM in order to keep the table from
bloating.  You'll want to read the manual carefully about this topic, in
order to keep your table from getting so bloated that your free space map
becomes useless.  One of the weakest areas for PostgreSQL is its behaviour
under this sort of most rows updated scenario, and it is wise to plan
carefully how you will accomplish these sorts of activities without causing
yourself extreme pain.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] reserving space in a rec for future update

2007-11-14 Thread Mike Charnoky
In this usage scenario, doesn't the new HOT (heap only tuples) feature
of PG8.3 help, in terms of the DB requiring less VACUUM maintenance?

I am similarly performing a huge number of inserts, followed by a huge
number of updates to fill in a few null fields.  The data is indexed by
insert time.  My problem is, selects using that index degrade over time
as updates are performed, presumably because data is no longer ordered
sequentially across pages after updates are performed.  I was hoping
that HOT would help here and am actually installing PG8.3 now in order
to perform some testing...


Mike

Andrew Sullivan wrote:
 On Wed, Nov 14, 2007 at 10:28:30AM -0500, Gauthier, Dave wrote:
 null.   My concern has to do with record fragmentation at the time of
 update because there's no room to expand them to accept the non-null
 data.  (BTW, the columns are floating point).
 
 You have a mistaken idea about how this works.
 
 Is there a way to initially insert nulls, but reserve space for the
 future update (and avoid record fragmentation)?  
 
 No.
 
 Is my record fragmentation concern unfounded?
 
 Sort of.
 
 The way this will work in Postgres is that, when you UPDATE the row, the old
 row will be marked dead, and a _new_ row will be written out with the new
 data.  You will need to perform VACUUM in order to keep the table from
 bloating.  You'll want to read the manual carefully about this topic, in
 order to keep your table from getting so bloated that your free space map
 becomes useless.  One of the weakest areas for PostgreSQL is its behaviour
 under this sort of most rows updated scenario, and it is wise to plan
 carefully how you will accomplish these sorts of activities without causing
 yourself extreme pain.
 
 A
 

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] reserving space in a rec for future update

2007-11-14 Thread Erik Jones


On Nov 14, 2007, at 10:44 AM, Mike Charnoky wrote:


In this usage scenario, doesn't the new HOT (heap only tuples) feature
of PG8.3 help, in terms of the DB requiring less VACUUM maintenance?

I am similarly performing a huge number of inserts, followed by a huge
number of updates to fill in a few null fields.  The data is  
indexed by
insert time.  My problem is, selects using that index degrade over  
time

as updates are performed, presumably because data is no longer ordered
sequentially across pages after updates are performed.  I was hoping
that HOT would help here and am actually installing PG8.3 now in order
to perform some testing...


Mike


Some, what HOT does is keeps index rows from being updated when  
updates are made to column values that aren't indexed.  The same  
insert/delete still happens in the table data.


Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate  market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


Re: [GENERAL] reserving space in a rec for future update

2007-11-14 Thread Andrew Sullivan
On Wed, Nov 14, 2007 at 11:31:11AM -0500, Gauthier, Dave wrote:
 Thanks for the advanced warning about problems with vaccuum !

Note this isn't a _problem_ with vacuum, exactly, it's just the set of
compromises that PostgreSQL has settled on.  There are other ways of
cleaning up the system (defrag, of the sort you were implying is one, pay
the cost during transaction is another one, c.).  The piper has to be
paid, and all we're doing is arguing about what currency we'll use :)

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] reserving space in a rec for future update

2007-11-14 Thread Andrew Sullivan
On Wed, Nov 14, 2007 at 11:44:55AM -0500, Mike Charnoky wrote:
 In this usage scenario, doesn't the new HOT (heap only tuples) feature
 of PG8.3 help, in terms of the DB requiring less VACUUM maintenance?

It should, yes.  We'll probably know more once 8.3 is in the field.  For new
work, though, I would certainly suggest trying this on 8.3.  I know that's
the _point_ of the feature.  But if you've already got an application you
need to field today, doing it on a beta is risky.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] reserving space in a rec for future update

2007-11-14 Thread Gauthier, Dave
OK, I didn't know Postgres did it this way. I was hoping it would retain
the old rec and update in place (if the updated values could fit).  I
guess not.

I can rewrite the DB loading algorithm to get those values in advance,
load into program memory, and reference at the time of the initial load.


Thanks for the advanced warning about problems with vaccuum !

-dave

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Andrew Sullivan
Sent: Wednesday, November 14, 2007 11:13 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] reserving space in a rec for future update

On Wed, Nov 14, 2007 at 10:28:30AM -0500, Gauthier, Dave wrote:
 null.   My concern has to do with record fragmentation at the time of
 update because there's no room to expand them to accept the non-null
 data.  (BTW, the columns are floating point).

You have a mistaken idea about how this works.

 Is there a way to initially insert nulls, but reserve space for the
 future update (and avoid record fragmentation)?  

No.

 Is my record fragmentation concern unfounded?

Sort of.

The way this will work in Postgres is that, when you UPDATE the row, the
old
row will be marked dead, and a _new_ row will be written out with the
new
data.  You will need to perform VACUUM in order to keep the table from
bloating.  You'll want to read the manual carefully about this topic, in
order to keep your table from getting so bloated that your free space
map
becomes useless.  One of the weakest areas for PostgreSQL is its
behaviour
under this sort of most rows updated scenario, and it is wise to plan
carefully how you will accomplish these sorts of activities without
causing
yourself extreme pain.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] reserving space in a rec for future update

2007-11-14 Thread Alvaro Herrera
Erik Jones wrote:

 On Nov 14, 2007, at 10:44 AM, Mike Charnoky wrote:

 In this usage scenario, doesn't the new HOT (heap only tuples) feature
 of PG8.3 help, in terms of the DB requiring less VACUUM maintenance?

 I am similarly performing a huge number of inserts, followed by a huge
 number of updates to fill in a few null fields.  The data is indexed by
 insert time.  My problem is, selects using that index degrade over time
 as updates are performed, presumably because data is no longer ordered
 sequentially across pages after updates are performed.  I was hoping
 that HOT would help here and am actually installing PG8.3 now in order
 to perform some testing...

 Some, what HOT does is keeps index rows from being updated when updates are 
 made to column values that aren't indexed.  The same insert/delete still 
 happens in the table data.

But another thing HOT does is make it possible to vacuum the *page* that
the update is taking place on.  So if there are dead tuples that nobody
needs, they can be removed and the new tuple can be placed there.

-- 
Alvaro Herrerahttp://www.advogato.org/person/alvherre
Use it up, wear it out, make it do, or do without

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings