Re: [HACKERS] mvcc DML on the same row

2010-12-16 Thread Simon Riggs
On Wed, 2010-12-15 at 20:50 +0100, matteo durighetto wrote:
 if  we continue the transaction and we do for example another update
 on this row (X) , we again redo the same operation:
 
X0  (deleted old row)
X1  (row inserted, NOW deleted) = not needed for rollback
X2  (insert new row  )

This situation has a simple user-space solution: only make one update to
a row, rather than two.

 But why we need all these versions of the same row on table, if for
 rollback we need only the original row X (X0) ?

X1 cannot be removed because X0 points to it, via its t_ctid field.

In order to remove X1 we would need to change X0 to point to X2, which
we don't do because we're not allowed to update in place. Even if we
could, I'm not sure this case is frequent enough to be worth the effort.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mvcc DML on the same row

2010-12-16 Thread Andres Freund
On Thursday 16 December 2010 15:11:01 Simon Riggs wrote:
 In order to remove X1 we would need to change X0 to point to X2, which
 we don't do because we're not allowed to update in place. Even if we
 could, I'm not sure this case is frequent enough to be worth the effort.
Especially as X3 would need to fit into X2's space for it to be beneficial...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] mvcc DML on the same row

2010-12-15 Thread matteo durighetto
Hi,
I have an idea about mvcc and different DML of the same row in the
same transaction.
Normally when a backend do an unpdate on a row ( call it X ) , we done
an insert and logical delete on this row  (0,1,2..N are the version
of the row) :

   X0  (delete old row)
   X1  (insert  new row)

if  we continue the transaction and we do for example another update
on this row (X) , we again redo the same operation:

   X0  (deleted old row)
   X1  (row inserted, NOW deleted) = not needed for rollback
   X2  (insert new row  )


But why we need all these versions of the same row on table, if for
rollback we need only the original row X (X0) ?

So I think we need it in memory, not on physical space of table (ok
there is the cache, but ..) or something similar, or this method is
for transaction with isolation level at read uncommited?

Kind Regards

Matteo Durighetto

---

desmodem...@gmail.com
m.durighe...@miriade.it

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mvcc DML on the same row

2010-12-15 Thread Robert Haas
On Wed, Dec 15, 2010 at 2:50 PM, matteo durighetto
desmodem...@gmail.com wrote:
 Hi,
    I have an idea about mvcc and different DML of the same row in the
 same transaction.
 Normally when a backend do an unpdate on a row ( call it X ) , we done
 an insert and logical delete on this row  (0,1,2..N are the version
 of the row) :

   X0  (delete old row)
   X1  (insert  new row)

 if  we continue the transaction and we do for example another update
 on this row (X) , we again redo the same operation:

   X0  (deleted old row)
   X1  (row inserted, NOW deleted) = not needed for rollback
   X2  (insert new row  )


 But why we need all these versions of the same row on table, if for
 rollback we need only the original row X (X0) ?

The fact that we can't get rid of X1 until after the transaction
commits is an implementation limitation.  But you obviously need both
X0 and X2, because the transaction might either commit or abort.

 So I think we need it in memory, not on physical space of table (ok
 there is the cache, but ..) or something similar, or this method is
 for transaction with isolation level at read uncommited?

I can't figure out exactly what this part is talking about.  It's
completely impractical to add rows to a table without writing them
into shared buffers, which means they will eventually get flushed to
disk if not vacuumed, dropped, etc. first.  We don't support read
uncommitted anyway (well, we do, but it's really still snapshot
isolation).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] mvcc DML on the same row

2010-12-15 Thread Nicolas Barbier
2010/12/15 matteo durighetto desmodem...@gmail.com:

 But why we need all these versions of the same row on table, if for
 rollback we need only the original row X (X0) ?

And the previous value of row X during the execution of a statement
(because statements don't see their own changes, think INSERT INTO a
SELECT * FROM a). And any values that we might need to ROLLBACK TO
SAVEPOINT to.

 So I think we need it in memory, not on physical space of table (ok
 there is the cache, but ..) or something similar

It must be possible to push out those changes from memory to disk
anyway, because there is no limit on how many rows a transaction can
update in PostgreSQL (vs. Oracle's snapshot too old problems). But
then, keeping the locally updated rows in some kind of special
per-transaction cache or in the global page cache isn't that
different.

Also, updating the same row many times in one transaction is probably
not regarded a very typical use case.

Note that other DBMSs may implement MVCC more along the lines you
specified; AFAIR, InnoDB uses such an approach. This may mean that
they don't need VACUUM.

I think that the consensus is that there is a trade-off between doing
VACUUM-like things synchronously, or having the possibility to do it
asynchronously at times when load is low. In PostgreSQL, the latter
was chosen.

Btw, this topic has been discussed at length in the past, please check
the archive.

 or this method is for transaction with isolation level at read
 uncommited?

PostgreSQL implements READ UNCOMMITTED as READ COMMITTED (providing a
higher level of isolation than requested is allowed by the standard),
so that is definitely not the reason.

Nicolas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers