Re: [HACKERS] CLUSTER and MVCC

2007-03-22 Thread Bruce Momjian
Bruce Momjian wrote: Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: We wouldn't clean up tuples that are visible to a transaction, but if you have one long-running transaction like pg_dump in a database with otherwise short transaction, you'll have a lot of tuples that

Re: [HACKERS] CLUSTER and MVCC

2007-03-22 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: In researching, I found we already do this by updating PGPROC.xid for every command in non-serialzable transactions: Dunno how you arrived at that conclusion, but it's quite wrong. regards, tom lane

Re: [HACKERS] CLUSTER and MVCC

2007-03-22 Thread Bruce Momjian
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: In researching, I found we already do this by updating PGPROC.xid for every command in non-serialzable transactions: Dunno how you arrived at that conclusion, but it's quite wrong. Looking in the function I now see you are right:

Re: [HACKERS] CLUSTER and MVCC

2007-03-22 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes: So, can't this be improved to allow more aggressive vacuuming? Not at that level. We do not keep track of the oldest still-used snapshot in a transaction. I'm dubious that it'd be worth the bookkeeping trouble to try --- often as not, the problem with a

Re: [HACKERS] CLUSTER and MVCC

2007-03-19 Thread Bruce Momjian
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: We wouldn't clean up tuples that are visible to a transaction, but if you have one long-running transaction like pg_dump in a database with otherwise short transaction, you'll have a lot of tuples that are not vacuumable

Re: [HACKERS] CLUSTER and MVCC

2007-03-15 Thread Heikki Linnakangas
Heikki Linnakangas wrote: One potential issue I'm seeing is that if we rely on the unbroken chain starting from OldestXmin, and that tuple isn't there because of a bug, for example, the later version of the tuple is skipped and the row is lost. After further thought, I don't feel comfortable

Re: [HACKERS] CLUSTER and MVCC

2007-03-15 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: I'm thinking of keeping an in-memory mapping of old and new tids of updated tuples while clustering, instead. That means that cluster requires a little bit of memory for each RECENTLY_DEAD updated tuple. In the worst case that means that you run

Re: [HACKERS] CLUSTER and MVCC

2007-03-15 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I'm thinking of keeping an in-memory mapping of old and new tids of updated tuples while clustering, instead. That means that cluster requires a little bit of memory for each RECENTLY_DEAD updated tuple. In the worst case that means

Re: [HACKERS] CLUSTER and MVCC

2007-03-13 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: In each chain, there must be at least one non-dead tuple with xmin Oldestxmin. Huh? Typically *all* the tuples but the last are dead, for varying values of dead. Please be more specific what you mean. I meant dead as in

Re: [HACKERS] CLUSTER and MVCC

2007-03-12 Thread Heikki Linnakangas
Heikki Linnakangas wrote: Tom Lane wrote: The reason it's not trivial is that you also have to preserve the t_ctid links of update chains. If you look into VACUUM FULL, a very large part of its complexity is that it moves update chains as a unit to make that possible. (BTW, I believe the

Re: [HACKERS] CLUSTER and MVCC

2007-03-12 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: In each chain, there must be at least one non-dead tuple with xmin Oldestxmin. Huh? Typically *all* the tuples but the last are dead, for varying values of dead. Please be more specific what you mean. regards, tom lane

[HACKERS] CLUSTER and MVCC

2007-03-09 Thread Heikki Linnakangas
Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to me that it would be trivial to fix, by using SnapshotAny instead of SnapshotNow, and not overwriting the xmin/xmax with the xid of the cluster command. I feel that I must missing something, or someone would've already fixed

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Csaba Nagy
On Fri, 2007-03-09 at 12:29, Heikki Linnakangas wrote: Csaba, you mentioned recently (http://archives.postgresql.org/pgsql-hackers/2007-03/msg00027.php) that you're actually using the MVCC-violation to clean up tables during a backup. Can you tell us a bit more about that? Would you be

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Hannu Krosing
Ühel kenal päeval, R, 2007-03-09 kell 11:29, kirjutas Heikki Linnakangas: Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to me that it would be trivial to fix, by using SnapshotAny instead of SnapshotNow, and not overwriting the xmin/xmax with the xid of the cluster

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Gregory Stark
Csaba Nagy [EMAIL PROTECTED] writes: Wouldn't be possible to do it like Simon (IIRC) suggested, and add a parameter to enable/disable the current behavior, and use the MVCC behavior as default ? Doing it in CLUSTER would be weird. However perhaps it would be useful to have some sort of

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Csaba Nagy
On Fri, 2007-03-09 at 13:42, Gregory Stark wrote: Csaba Nagy [EMAIL PROTECTED] writes: Wouldn't be possible to do it like Simon (IIRC) suggested, and add a parameter to enable/disable the current behavior, and use the MVCC behavior as default ? Doing it in CLUSTER would be weird.

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Heikki Linnakangas
Csaba Nagy wrote: On Fri, 2007-03-09 at 12:29, Heikki Linnakangas wrote: Csaba, you mentioned recently (http://archives.postgresql.org/pgsql-hackers/2007-03/msg00027.php) that you're actually using the MVCC-violation to clean up tables during a backup. Can you tell us a bit more about that?

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Alvaro Herrera
Gregory Stark wrote: Csaba Nagy [EMAIL PROTECTED] writes: Wouldn't be possible to do it like Simon (IIRC) suggested, and add a parameter to enable/disable the current behavior, and use the MVCC behavior as default ? Doing it in CLUSTER would be weird. However perhaps it would be useful

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Zeugswetter Andreas ADI SD
Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to me that it would be trivial to fix, by using SnapshotAny instead of SnapshotNow, and not overwriting the xmin/xmax with the xid of the cluster command. It's trivial to fix now in this way, but it would break HOT,

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Csaba Nagy
On Fri, 2007-03-09 at 14:00, Alvaro Herrera wrote: But I'm not really seeing the problem here. Why isn't Csaba's problem fixed by the fact that HOT reduces the number of dead tuples in the first place? If it does, then he no longer needs the CLUSTER workaround, or at least, he needs it to a

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Csaba Nagy
Hmm. You could use something along these lines instead: 0. LOCK TABLE queue_table 1. SELECT * INTO queue_table_new FROM queue_table 2. DROP TABLE queue_table 3. ALTER TABLE queue_table_new RENAME queue_table After all, it's not that you care about the clustering of the table, you just

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Heikki Linnakangas
Csaba Nagy wrote: On Fri, 2007-03-09 at 14:00, Alvaro Herrera wrote: But I'm not really seeing the problem here. Why isn't Csaba's problem fixed by the fact that HOT reduces the number of dead tuples in the first place? If it does, then he no longer needs the CLUSTER workaround, or at least,

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Florian G. Pflug
Heikki Linnakangas wrote: Csaba Nagy wrote: On Fri, 2007-03-09 at 14:00, Alvaro Herrera wrote: But I'm not really seeing the problem here. Why isn't Csaba's problem fixed by the fact that HOT reduces the number of dead tuples in the first place? If it does, then he no longer needs the

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Heikki Linnakangas
Csaba Nagy wrote: Hmm. You could use something along these lines instead: 0. LOCK TABLE queue_table 1. SELECT * INTO queue_table_new FROM queue_table 2. DROP TABLE queue_table 3. ALTER TABLE queue_table_new RENAME queue_table After all, it's not that you care about the clustering of the table,

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Alvaro Herrera
Zeugswetter Andreas ADI SD wrote: Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to me that it would be trivial to fix, by using SnapshotAny instead of SnapshotNow, and not overwriting the xmin/xmax with the xid of the cluster command. It's trivial to fix

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Heikki Linnakangas
Florian G. Pflug wrote: Couldn't HOT in principle deal with this? Let's say you have two long-running transactions, which see row versions A and D. While those transactions are running, the row is constantly updated, leading to row versions B, C (before the second long-running transaction

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Mike Rylander
On 3/9/07, Florian G. Pflug [EMAIL PROTECTED] wrote: Heikki Linnakangas wrote: Csaba Nagy wrote: On Fri, 2007-03-09 at 14:00, Alvaro Herrera wrote: But I'm not really seeing the problem here. Why isn't Csaba's problem fixed by the fact that HOT reduces the number of dead tuples in the

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Heikki Linnakangas
Mike Rylander wrote: On 3/9/07, Florian G. Pflug [EMAIL PROTECTED] wrote: Couldn't HOT in principle deal with this? Let's say you have two long-running transactions, which see row versions A and D. While those transactions are running, the row is constantly updated, leading to row versions B,

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to me that it would be trivial to fix, by using SnapshotAny instead of SnapshotNow, and not overwriting the xmin/xmax with the xid of the cluster command. The reason it's not

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Simon Riggs
On Fri, 2007-03-09 at 12:48 -0300, Alvaro Herrera wrote: Zeugswetter Andreas ADI SD wrote: Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to me that it would be trivial to fix, by using SnapshotAny instead of SnapshotNow, and not overwriting the xmin/xmax

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Is there a particular reason why CLUSTER isn't MVCC-safe? It seems to me that it would be trivial to fix, by using SnapshotAny instead of SnapshotNow, and not overwriting the xmin/xmax with the xid of the cluster command. The

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes: We wouldn't clean up tuples that are visible to a transaction, but if you have one long-running transaction like pg_dump in a database with otherwise short transaction, you'll have a lot of tuples that are not vacuumable because of the

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Heikki Linnakangas
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: We wouldn't clean up tuples that are visible to a transaction, but if you have one long-running transaction like pg_dump in a database with otherwise short transaction, you'll have a lot of tuples that are not vacuumable because of

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Florian G. Pflug
Heikki Linnakangas wrote: The implementation problem is that we don't have a global view of all snapshots in the system. If we solve that, we can be more aggressive with vacuuming in presence of long-running transactions. It's not an easy problem, we don't want to add a lot of accounting

Re: [HACKERS] CLUSTER and MVCC

2007-03-09 Thread Florian G. Pflug
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: We wouldn't clean up tuples that are visible to a transaction, but if you have one long-running transaction like pg_dump in a database with otherwise short transaction, you'll have a lot of tuples that are not vacuumable because of