subject:"Re\\\: \\\[HACKERS\\\] \\\[Bizgres\\\-general\\\] WAL bypass for INSERT, UPDATE and"

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-07 Thread Simon Riggs

On Mon, 2006-02-06 at 21:07 -0700, Rick Gigger wrote:
 I was thinking the exact same thing.  Except  the and just fsync()  
 dirty pages on commit part.  Wouldn't that actually make the  
 situation worse?  I thought the whole point of WAL was that it was  
 more efficient to fsync all of the changes in one sequential write in  
 one file rather than fsyncing all of the separate dirty pages.

This would apply to only a single relation, so would be just as
efficient a write to the database as to WAL. The proposed route is to
sync to the database, but not to WAL, thus halving the required I/O.

Yes, its designed for large data loads.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-07 Thread Christopher Kings-Lynne


This would apply to only a single relation, so would be just as
efficient a write to the database as to WAL. The proposed route is to
sync to the database, but not to WAL, thus halving the required I/O.

Yes, its designed for large data loads.



A question - would setting fsync=off while restoring a multi-gig dump 
(during an upgrade) improve performance?


Chris


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-07 Thread Jim C. Nasby

On Wed, Feb 08, 2006 at 09:44:24AM +0800, Christopher Kings-Lynne wrote:
 This would apply to only a single relation, so would be just as
 efficient a write to the database as to WAL. The proposed route is to
 sync to the database, but not to WAL, thus halving the required I/O.
 
 Yes, its designed for large data loads.
 
 
 A question - would setting fsync=off while restoring a multi-gig dump 
 (during an upgrade) improve performance?

Yes.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-06 Thread Simon Riggs

On Fri, 2006-02-03 at 22:29 -0500, Bruce Momjian wrote:

 Based on this, I think we should just implement the TRUNCATE/DROP option
 for the table, and avoid the idea of allowing non-logged operations on a
 table that has any data we want recovered after a crash.

Well the only other option is this:

Implement an UNDO at abort like we do for DDL, which would truncate the
table back down to the starting point. That can be made to work for both
cases.

In addition if the starting point was  0 then we'd need to perform a
VACUUM style operation to remove any index pointers with tids into the
to-be-truncated blocks. That would then make it work for the
with-indexes and/or with-toast cases.

If starting point == 0 we would just truncate the indexes and toast
stuff too.

Most importantly we'd need to do this at recovery time. That bit will
take a bit of work to make it happen right, but seems doable.

So we cover both cases at once, using one lot of logic. But there is a
fair amount of work there, so I'll need to consider whether its 8.2
material or not.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-06 Thread Bruce Momjian


I have split up the TODO items as:

* Allow WAL logging to be turned off for a table, but the table
  might be dropped or truncated during crash recovery [walcontrol]

  Allow tables to bypass WAL writes and just fsync() dirty pages on
  commit.  This should be implemented using ALTER TABLE, e.g. ALTER
  TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ].  Tables using
  non-default logging should not use referential integrity with
  default-logging tables.  A table without dirty buffers during a
  crash could perhaps avoid the drop/truncate.

* Allow WAL logging to be turned off for a table, but the table would
  avoid being truncated/dropped [walcontrol]

  To do this, only a single writer can modify the table, and writes
  must happen only on new pages so the new pages can be removed during
  crash recovery.  Readers can continue accessing the table.  Such
  tables probably cannot have indexes.  One complexity is the handling
  of indexes on TOAST tables.

I think the first one is possible, while the second has many
complexities that make its implementation suspect.

---

Simon Riggs wrote:
 On Fri, 2006-02-03 at 22:29 -0500, Bruce Momjian wrote:
 
  Based on this, I think we should just implement the TRUNCATE/DROP option
  for the table, and avoid the idea of allowing non-logged operations on a
  table that has any data we want recovered after a crash.
 
 Well the only other option is this:
 
 Implement an UNDO at abort like we do for DDL, which would truncate the
 table back down to the starting point. That can be made to work for both
 cases.
 
 In addition if the starting point was  0 then we'd need to perform a
 VACUUM style operation to remove any index pointers with tids into the
 to-be-truncated blocks. That would then make it work for the
 with-indexes and/or with-toast cases.
 
 If starting point == 0 we would just truncate the indexes and toast
 stuff too.
 
 Most importantly we'd need to do this at recovery time. That bit will
 take a bit of work to make it happen right, but seems doable.
 
 So we cover both cases at once, using one lot of logic. But there is a
 fair amount of work there, so I'll need to consider whether its 8.2
 material or not.
 
 Best Regards, Simon Riggs
 
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-06 Thread Christopher Kings-Lynne


* Allow WAL logging to be turned off for a table, but the table
  might be dropped or truncated during crash recovery [walcontrol]

  Allow tables to bypass WAL writes and just fsync() dirty pages on
  commit.  This should be implemented using ALTER TABLE, e.g. ALTER
  TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ].  Tables using
  non-default logging should not use referential integrity with
  default-logging tables.  A table without dirty buffers during a
  crash could perhaps avoid the drop/truncate.


This would be such a sweet feature for website session tables...

Chris


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-06 Thread Doug McNaught

Bruce Momjian pgman@candle.pha.pa.us writes:

 TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ].  Tables using
 non-default logging should not use referential integrity with
 default-logging tables.

I have to say this smells way too much like MySQL for me to feel
comfortable.  But that's just my opinion.  :)

-Doug

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-06 Thread Rick Gigger

I was thinking the exact same thing.  Except  the and just fsync()  
dirty pages on commit part.  Wouldn't that actually make the  
situation worse?  I thought the whole point of WAL was that it was  
more efficient to fsync all of the changes in one sequential write in  
one file rather than fsyncing all of the separate dirty pages.




On Feb 6, 2006, at 7:24 PM, Christopher Kings-Lynne wrote:


* Allow WAL logging to be turned off for a table, but the table
  might be dropped or truncated during crash recovery [walcontrol]

  Allow tables to bypass WAL writes and just fsync() dirty pages on
  commit.  This should be implemented using ALTER TABLE, e.g. ALTER
  TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ].  Tables using
  non-default logging should not use referential integrity with
  default-logging tables.  A table without dirty buffers during a
  crash could perhaps avoid the drop/truncate.


This would be such a sweet feature for website session tables...

Chris


---(end of  
broadcast)---

TIP 2: Don't 'kill -9' the postmaster




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-06 Thread Bruce Momjian

Rick Gigger wrote:
 I was thinking the exact same thing.  Except  the and just fsync()  
 dirty pages on commit part.  Wouldn't that actually make the  
 situation worse?  I thought the whole point of WAL was that it was  
 more efficient to fsync all of the changes in one sequential write in  
 one file rather than fsyncing all of the separate dirty pages.

Uh, supposedly the WAL traffic is not as efficient as fsyncing whole
pages if you are doing lots of full pages.

---


 
 On Feb 6, 2006, at 7:24 PM, Christopher Kings-Lynne wrote:
 
 * Allow WAL logging to be turned off for a table, but the table
   might be dropped or truncated during crash recovery [walcontrol]
 
   Allow tables to bypass WAL writes and just fsync() dirty pages on
   commit.  This should be implemented using ALTER TABLE, e.g. ALTER
   TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ].  Tables using
   non-default logging should not use referential integrity with
   default-logging tables.  A table without dirty buffers during a
   crash could perhaps avoid the drop/truncate.
 
  This would be such a sweet feature for website session tables...
 
  Chris
 
 
  ---(end of  
  broadcast)---
  TIP 2: Don't 'kill -9' the postmaster
 
 
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-06 Thread Rick Gigger


Rick Gigger wrote:

I was thinking the exact same thing.  Except  the and just fsync()
dirty pages on commit part.  Wouldn't that actually make the
situation worse?  I thought the whole point of WAL was that it was
more efficient to fsync all of the changes in one sequential write in
one file rather than fsyncing all of the separate dirty pages.


Uh, supposedly the WAL traffic is not as efficient as fsyncing whole
pages if you are doing lots of full pages.


So then you would want to use this particular feature only when  
updating/inserting large amounts of info at a time then?


For instance if you have a table with rows that are 12 bytes wide  
using this feature would be very bad because it would fsync out a  
whole bunch of extra data just to get those 12 bytes written.  But on  
a table that wrote out several k of data at a time it would help  
because it would be filling up entire pages and not doing any  
wasteful fsyncing?


I guess that probably would help session tables because it would  
avoid writing the data twice.


In the case of session tables though I for one don't care if that  
data is recoverable or not.  If my database just crashed I've  
probably got bigger problems then just dropped sessions.  Would it be  
possible to a) Not WAL log that table, b) not fsync that table at  
all.  Then if that table is in an inconsistent state just recreate  
the table?


I'm guessing that is getting a little too crazy but I have been  
thinking about setting up a separate postgres instance listening on a  
different port with fsync off as a possible way to avoid the fsync  
overhead for the sessions.



-- 
-





On Feb 6, 2006, at 7:24 PM, Christopher Kings-Lynne wrote:


* Allow WAL logging to be turned off for a table, but the table
  might be dropped or truncated during crash recovery [walcontrol]

	  Allow tables to bypass WAL writes and just fsync() dirty  
pages on
	  commit.  This should be implemented using ALTER TABLE, e.g.  
ALTER

  TABLE PERSISTENCE [ DROP | TRUNCATE | DEFAULT ].  Tables using
  non-default logging should not use referential integrity with
  default-logging tables.  A table without dirty buffers during a
  crash could perhaps avoid the drop/truncate.


This would be such a sweet feature for website session tables...

Chris


---(end of
broadcast)---
TIP 2: Don't 'kill -9' the postmaster




---(end of  
broadcast)---

TIP 4: Have you searched our list archives?

   http://archives.postgresql.org



--
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square,  
Pennsylvania 19073


---(end of  
broadcast)---

TIP 5: don't forget to increase your free space map settings




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-02-03 Thread Bruce Momjian


Based on this, I think we should just implement the TRUNCATE/DROP option
for the table, and avoid the idea of allowing non-logged operations on a
table that has any data we want recovered after a crash.

---

Simon Riggs wrote:
 On Thu, 2006-01-05 at 12:27 -0500, Bruce Momjian wrote:
 
  Seems like a nice optimization.
 
 Negative thoughts: Toast tables have a toast index on them, yes? We have
 agreed that we cannot use the optimization if we have indexes on the
 main table. It follows that we cannot use the optimization if we have
 *any* toasted data, since that would require a pointer between two
 blocks, which would not be correctly recovered following a crash. If we
 log the toast table then there could be a mismatch between heap and
 toast table; if we don't log the toast table there could be a mismatch
 between toast table and toast index.
 
 We can test to see if the toast table is empty when we do ALTER TABLE,
 but loading operations may try to create toasted data rows.
 
 Presumably that means we must either:
 i) abort a COPY if we get a toastable value
 ii) if we get a toastable value, insert the row into a new block, which
 we do logging of, then also log the toast insert and the toast index
 insert - i.e. some blocks we log, others not
 
 This is still useful for many applications, IMHO, but the list of
 restrictions seems to be growing. Worse, we wouldn't know that the toast
 tables were empty until after we did the COPY TO for a pg_dump, so we
 wouldn't be able to retrospectively add an ALTER TABLE command ahead of
 the COPY. 
 
 Thoughts? Hopefully there are some flaws in my thinking here,
 
 Best Regards, Simon Riggs
 
 
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-05 Thread Bruce Momjian

Simon Riggs wrote:
  So, we need a name for EXCLUSIVE mode that suggests how it is different
  from TRUNCATE, and in this case, the difference is that EXCLUSIVE
  preserves the previous contents of the table on recovery, while TRUNCATE
  does not.  Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
  Anyway, the keywords are easy to modify, even after the patch is
  submitted.  FYI, I usually go through keywords.c looking for a keyword
  we already use.
 
 I'm very happy for suggestions on what these new modes are called.
 
So, to summarize, I think we should add DROP/TRUNCATE, and use that by
default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
   
   Would you mind stating again what you mean, just so I can understand
   this? Your summary isn't enough.
  
  New ALTER TABLE mode, perhaps call it PERSISTENCE:
  
  ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
  ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
  
  These would drop or truncate all tables with this flag on a non-clean
  start of the postmaster, and write something in the server logs. 
  However, I don't know that we have the code in place to DROP/TRUNCATE in
  recovery mode, and it would affect all databases, so it could be quite
  complex to implement.  In this mode, no WAL logs would be written for
  table modifications, though DDL commands would have to be logged.
 
 Right now, this will be a TODO item... it looks like it will take some
 thought to implement correctly.

OK, I know my suggestions have made it more complicated.

TODO added:

* Allow control over which tables are WAL-logged

  Allow tables to bypass WAL writes and just fsync() dirty pages on
  commit.  To do this, only a single writer can modify the table, and
  writes must happen only on new pages.  Readers can continue accessing
  the table.  This would affect COPY, and perhaps INSERT/UPDATE too.
  Another option is to avoid transaction logging entirely and truncate
  or drop the table on crash recovery.  These should be implemented
  using ALTER TABLE, e.g. ALTER TABLE PERSISTENCE [ DROP | TRUNCATE |
  STABLE | DEFAULT ].  Tables using non-default logging should not use
  referential integrity with default-logging tables, and tables using
  stable logging probably can not have indexes.  [walcontrol]


  ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
  
  Table contents are preserved across recoveries, but data modifications
  can happen only one at a time.  I don't think we have a lock mode that
  does this, so I am worried a new lock mode will have to be created.  A
  simplified solution at this stage would be to take an exclusive lock on
  the table, but really we just need a single-writer table lock, which I
  don't think we have. initially this can implemented to only affect COPY
  but later can be done for other commands. 
 
 ExclusiveLock locks out everything apart from readers, no new lock mode
 AFAICS. Implementing that is little additional work for COPY.

Nice.

 Tom had a concern about setting this for I, U, D commands via the
 executor. Not sure what the details of that are, as yet.

That is much more complicated than the COPY-only idea, for sure.  I am
thinking we could add the ALTER syntax and just do COPY at this stage,
meaning that I/U/D still do full logging until we get to improving them.
The big benefit is that the user API doesn't need to change when we
improve the code.  In fact I think we could do the TRUNCATE/DROP easily
for I/U/D, but the STABLE option would require work and we don't need to
implement it in the first patch.

 We can use either of the unlogged modes for pg_dump, so I'd suggest its
 this one. Everybody happy with this being the new default in pg_dump, or
 should it be an option?
 
  ALTER TABLE tab PERSISTENCE DEFAULT
  
  This would be our current default mode, which is full concurrency and
  persistence.
 
 I'm thinking whether the ALTER TABLE statement might be better with two
 bool flags rather than a 3-state char.
 
 flag 1: ENABLE LOGGING | DISABLE LOGGING
 
 flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY
 
 Giving 3 possible sets of options:
 
 -- the default
 ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)
 
 -- EXCLUSIVE mode
 ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
 ...which would be used like this
   ALTER TABLE mytable DISABLE LOGGING;
   COPY or other bulk data manipulation SQL
   ALTER TABLE mytable ENABLE LOGGING;
 ...since FULL RECOVERY is the default.
 
 -- multiuser temp table mode
 ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
 ...which would usually be left on all the time
 
 which only uses one new keyword LOGGING and yet all the modes are fairly
 explicit as to what they do.
 
 An alternative might be the slightly more verbose:
   ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
 which would be turned off by
   ALTER TABLE

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-05 Thread Bruce Momjian

Simon Riggs wrote:
 On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
  Having COPY behave differently because it is
  in a transaction is fine as long as it is user-invisible, but once you
  require users to do that to get the speedup, it isn't user-invisible
  anymore.
 
 Since we're agreed on adding ALTER TABLE rather than COPY LOCK, we have
 our explicit mechanism for speedup.
 
 However, it costs a single line of code and very very little execution
 time to add in the optimization to COPY to make it bypass WAL when
 executed in the same transaction that created the table. Everything else
 is already there.
 
 As part of the use_wal test:
 + if (resultRelInfo-ri_NumIndices == 0  
 + !XLogArchivingActive()
  (cstate-rel-rd_createSubid != InvalidSubTransactionId ))
 + use_wal = false;
 
 the value is already retrieved from cache...
 
 Can anyone see a reason *not* to put that change in also? We just don't
 advertise it as the suggested route to gaining performance, nor would
 we rely on it for pg_dump/restore performance. 

Seems like a nice optimization.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-05 Thread Simon Riggs

On Thu, 2006-01-05 at 12:27 -0500, Bruce Momjian wrote:

 Seems like a nice optimization.

Negative thoughts: Toast tables have a toast index on them, yes? We have
agreed that we cannot use the optimization if we have indexes on the
main table. It follows that we cannot use the optimization if we have
*any* toasted data, since that would require a pointer between two
blocks, which would not be correctly recovered following a crash. If we
log the toast table then there could be a mismatch between heap and
toast table; if we don't log the toast table there could be a mismatch
between toast table and toast index.

We can test to see if the toast table is empty when we do ALTER TABLE,
but loading operations may try to create toasted data rows.

Presumably that means we must either:
i) abort a COPY if we get a toastable value
ii) if we get a toastable value, insert the row into a new block, which
we do logging of, then also log the toast insert and the toast index
insert - i.e. some blocks we log, others not

This is still useful for many applications, IMHO, but the list of
restrictions seems to be growing. Worse, we wouldn't know that the toast
tables were empty until after we did the COPY TO for a pg_dump, so we
wouldn't be able to retrospectively add an ALTER TABLE command ahead of
the COPY. 

Thoughts? Hopefully there are some flaws in my thinking here,

Best Regards, Simon Riggs




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Jim C. Nasby

On Thu, Dec 29, 2005 at 11:24:28AM -0500, Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  Andrew Dunstan wrote:
  I an horribly scared that this will be used as a performance boost for
  normal use. I would at least like to see some restrictions that make it
  harder to mis-use. Perhaps restrict to superuser?
 
  Certainly restrict to table owner.
 
 I can see the argument for superuser-only: decisions about data
 integrity tradeoffs should be reserved to the DBA, who is the one who
 will get blamed if the database loses data, no matter how stupid his
 users are.
 
 But I'm not wedded to that.  I could live with table-owner.

I dislike restricting to super-user, and to some extent even table
owner. The reason is that if you have some automated batch process, you
don't want that process running as a superuser. Also, it is often
awkward to require that the user running that batch own the table.

I'd much rather see this as a grantable permission on the table. (The
same is true with truncate, btw). This way, if a DBA knew he could trust
a specific role, he could allow for these operations on a specific
table.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Jim C. Nasby

On Fri, Dec 30, 2005 at 12:58:15PM -0500, Bruce Momjian wrote:
 Andrew Dunstan wrote:
  My concern is more about making plain that this is for special operations,
  not normal operations. Or maybe I have misunderstood the purpose.
  
  
  
  Rephrase that as full ownership rights must be obtained to load data in
  a way that requires dropping any existing indexes and locking out other
  users of the table.  I don't think the use-case for this will be very
  large for non-owners, or indeed even for owners except during initial
  table creation; and so I don't think the above argument is strong.
  
  
  Those restrictions aren't true of Bruce's proposed drop and
  delete/truncate recovery modes, are they?
 
 Only the owner could do the ALTER, for sure, but once the owner sets it,
 any user with permission to write to the table would have those
 characteristics.

Dumb question: if the ALTER is done inside a transaction, and then
reverted at the end of the transaction, does that mean that no other
transactions would have those permissions? I think the general use-case
is that you only one the session doing the ALTER to be able to use these
special modes, not anyone else who happens to be hitting the table at
that time...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Stephen Frost

* Jim C. Nasby ([EMAIL PROTECTED]) wrote:
 I dislike restricting to super-user, and to some extent even table
 owner. The reason is that if you have some automated batch process, you
 don't want that process running as a superuser. Also, it is often
 awkward to require that the user running that batch own the table.

The owner of the table could be a role which the batch runner is part of
(along with whatever other roles you wish to have 'owner'-level
permissions on the table).

 I'd much rather see this as a grantable permission on the table. (The
 same is true with truncate, btw). This way, if a DBA knew he could trust
 a specific role, he could allow for these operations on a specific
 table.

In general, I do prefer that permissions be seperably grantable.  Being
able to grant 'truncate' permissions would be really nice.  Is the only
reason such permission doesn't exist due to no one working on it, or is
there other disagreement about it?

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Tom Lane

Jim C. Nasby [EMAIL PROTECTED] writes:
 Dumb question: if the ALTER is done inside a transaction, and then
 reverted at the end of the transaction, does that mean that no other
 transactions would have those permissions? I think the general use-case
 is that you only one the session doing the ALTER to be able to use these
 special modes, not anyone else who happens to be hitting the table at
 that time...

Such an ALTER would certainly require exclusive lock on the table,
so I'm not sure that I see much use-case for doing it like that.
You'd want to do the ALTER and commit so as not to lock other people
out of the table entirely while doing the bulk data-pushing.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Tom Lane

Stephen Frost [EMAIL PROTECTED] writes:
 In general, I do prefer that permissions be seperably grantable.  Being
 able to grant 'truncate' permissions would be really nice.  Is the only
 reason such permission doesn't exist due to no one working on it, or is
 there other disagreement about it?

Lack of appetite for having forty nonstandard kinds of privilege,
I suppose ;-)

Given that we now have roles, it's fairly easy to grant table owner
to trusted people, so the use-case for special privilege types has
dropped off dramatically IMHO.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Jim C. Nasby

On Tue, Jan 03, 2006 at 11:29:02AM -0500, Tom Lane wrote:
 Stephen Frost [EMAIL PROTECTED] writes:
  In general, I do prefer that permissions be seperably grantable.  Being
  able to grant 'truncate' permissions would be really nice.  Is the only
  reason such permission doesn't exist due to no one working on it, or is
  there other disagreement about it?
 
 Lack of appetite for having forty nonstandard kinds of privilege,
 I suppose ;-)
 
 Given that we now have roles, it's fairly easy to grant table owner
 to trusted people, so the use-case for special privilege types has
 dropped off dramatically IMHO.

Yeah, I hadn't thought about that. I agree; if you trust some process
enough to have MVCC-affecting rights then you should be able to trust it
with full ownership rights.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Jim C. Nasby

On Tue, Jan 03, 2006 at 11:26:51AM -0500, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  Dumb question: if the ALTER is done inside a transaction, and then
  reverted at the end of the transaction, does that mean that no other
  transactions would have those permissions? I think the general use-case
  is that you only one the session doing the ALTER to be able to use these
  special modes, not anyone else who happens to be hitting the table at
  that time...
 
 Such an ALTER would certainly require exclusive lock on the table,
 so I'm not sure that I see much use-case for doing it like that.
 You'd want to do the ALTER and commit so as not to lock other people
 out of the table entirely while doing the bulk data-pushing.

Maybe this just isn't clear, but would EXCLUSIVE block writes from all
other sessions then? The post I replied to mentioned that the ALTER
would affect all backends is why I'm wondering...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Tom Lane

Jim C. Nasby [EMAIL PROTECTED] writes:
 On Tue, Jan 03, 2006 at 11:26:51AM -0500, Tom Lane wrote:
 Such an ALTER would certainly require exclusive lock on the table,
 so I'm not sure that I see much use-case for doing it like that.
 You'd want to do the ALTER and commit so as not to lock other people
 out of the table entirely while doing the bulk data-pushing.

 Maybe this just isn't clear, but would EXCLUSIVE block writes from all
 other sessions then?

I don't think it should (which implies that EXCLUSIVE is a bad name).
My point is that ALTER RELIABILITY would have to gain exclusive lock
for long enough to change the table's reliability marking --- you have
to synchronize such a change with other transactions' activity on the
table, and table-level locks are the only mechanism we have for that.
It's not different from a schema change such as adding a column.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Stephen Frost

* Tom Lane ([EMAIL PROTECTED]) wrote:
 Stephen Frost [EMAIL PROTECTED] writes:
  In general, I do prefer that permissions be seperably grantable.  Being
  able to grant 'truncate' permissions would be really nice.  Is the only
  reason such permission doesn't exist due to no one working on it, or is
  there other disagreement about it?
 
 Lack of appetite for having forty nonstandard kinds of privilege,
 I suppose ;-)
 
 Given that we now have roles, it's fairly easy to grant table owner
 to trusted people, so the use-case for special privilege types has
 dropped off dramatically IMHO.

The problem is that you might want to grant 'truncate' to people who
*aren't* particularly trusted.  For truncate, at least I have a 
real-world use-case for it.  I've got a number of users who are required
to fill in on the order of 300 seperate tables.  We provide a number of
different ways of doing the uploads: 
  ODBC
  phppgadmin
  Our own phppgadmin-like interface
  Web-based streaming CSV uploader

The CSV uploader is by far the most popular because it allows them to
easily reload a table from the files they have.  The uploader starts off
with a 'delete from x' right now.  I've been looking at (but don't
particularly like) writing a setuid pl/pgsql function so that I can make
the uploader be able to truncate the tables.  This would almost entirely
eliminate the need to do vacuums on the tables.  It's very rare for
there to be multiple things happening on a given database at a given
time too.

What does happen though, is that these tables are often used immediately
after they've been uploaded which means they needs to be analyzed too.
Again, something which would be very nice if the uploader could do.
Same with vacuums, in fact, if the uploader could vacuum the tables I
probably wouldn't need truncate as badly (though it'd still be nice),
since I could just delete from table; vacuum;

What these users *can't* do, by any means, is drop tables, or change the
structure or types or keys or anything else having to do with the table
definitions.

Writing setuid pl/pgsql functions for each of these is circumventing the 
ACL and permission system of the database; working *around* it instead
of *with* it, which makes me somewhat nervous and feels like a lack in
the database. :/

Thanks!

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Stephen Frost

* Jim C. Nasby ([EMAIL PROTECTED]) wrote:
 Yeah, I hadn't thought about that. I agree; if you trust some process
 enough to have MVCC-affecting rights then you should be able to trust it
 with full ownership rights.

About that, I have to disagree. :)  I've got a case where this isn't
true, see my other post please.  Giving someone truncate rights is
*very* different from allowing them to drop a table or change the
definition of the table.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Bruce Momjian

Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  On Tue, Jan 03, 2006 at 11:26:51AM -0500, Tom Lane wrote:
  Such an ALTER would certainly require exclusive lock on the table,
  so I'm not sure that I see much use-case for doing it like that.
  You'd want to do the ALTER and commit so as not to lock other people
  out of the table entirely while doing the bulk data-pushing.
 
  Maybe this just isn't clear, but would EXCLUSIVE block writes from all
  other sessions then?
 
 I don't think it should (which implies that EXCLUSIVE is a bad name).

Agreed, EXCLUSIVE was used to mean an _exclusive_ writer.  The new words
I proposed were PRESERVE or STABLE.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Tom Lane

Stephen Frost [EMAIL PROTECTED] writes:
 The problem is that you might want to grant 'truncate' to people who
 *aren't* particularly trusted.  For truncate, at least I have a
 real-world use-case for it.

I don't find this use-case particularly convincing.  If the users are
allowed to delete all data in a given table, then that table must be
dedicated to them anyway; so it's not that easy to see why you can't
risk giving them ownership rights on it.  The worst they can do is
screw up their own data, no?

In any case, I don't see what's so wrong with the model of using
SECURITY DEFINER interface functions when you want a security
restriction that's finer-grain than the system provides.  I really
*don't* want to see us trying to, say, categorize every variety of
ALTER TABLE as a separately grantable privilege.  I could live with
something like a catchall ADMIN privilege ... except it's not
clear how that would differ from ownership.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Stephen Frost

* Tom Lane ([EMAIL PROTECTED]) wrote:
 I don't find this use-case particularly convincing.  If the users are
 allowed to delete all data in a given table, then that table must be
 dedicated to them anyway; so it's not that easy to see why you can't
 risk giving them ownership rights on it.  The worst they can do is
 screw up their own data, no?

Being able to delete all data in a given table in no way implies
ownership rights.  The tables are part of a specification which the
users are being asked to respond to.  Being able to change the table
types or remove the constraints put on the tables would allow the 
users to upload garbage which would then affect downstream processing.

We can't guarentee this won't happen anyway but we try to confine the
things they can mess up to a reasonable set which we can check for (and
do, through a rather involved error checking system).  There are *alot*
of things built on top of the table structures and having them change
would basically break the whole system (without the appropriate changes
being made to the other parts of the system).

 In any case, I don't see what's so wrong with the model of using
 SECURITY DEFINER interface functions when you want a security
 restriction that's finer-grain than the system provides.  I really
 *don't* want to see us trying to, say, categorize every variety of
 ALTER TABLE as a separately grantable privilege.  I could live with
 something like a catchall ADMIN privilege ... except it's not
 clear how that would differ from ownership.

I don't think anyone's asked for 'ALTER TABLE' privileges to be
seperately grantable.  It seems to me that the privileges which *need*
to be grantable are ones associated with DML statements.  I would 
classify TRUNCATE, VACUUM and ANALYZE as DML statements (along with 
select, insert, update, and delete).  They're PostgreSQL-specific DML 
statements but they still fall into that category.  I don't think 
it's a coincidence that the SQL-defined DML statements are all, 
individually, grantable.

That doesn't mean I think we should get rid of RULE, REFERENCES or
TRIGGER, though honestly I've very rarely needed to grant any of them 
(I don't think I've ever granted RULE or TRIGGER...).  References is
DDL-oriented, but for *other* tables; RULE and TRIGGER are DDL and I
can't really justify why someone other than the owner would need them
but I'm guessing someone's using them.  I don't think their existance
should imply that if we ever change the grants again we have to include
all types of 'ALTER TABLE', etc, though.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Jim C. Nasby

On Tue, Jan 03, 2006 at 12:08:05PM -0500, Bruce Momjian wrote:
 Tom Lane wrote:
  Jim C. Nasby [EMAIL PROTECTED] writes:
   On Tue, Jan 03, 2006 at 11:26:51AM -0500, Tom Lane wrote:
   Such an ALTER would certainly require exclusive lock on the table,
   so I'm not sure that I see much use-case for doing it like that.
   You'd want to do the ALTER and commit so as not to lock other people
   out of the table entirely while doing the bulk data-pushing.
  
   Maybe this just isn't clear, but would EXCLUSIVE block writes from all
   other sessions then?
  
  I don't think it should (which implies that EXCLUSIVE is a bad name).
 
 Agreed, EXCLUSIVE was used to mean an _exclusive_ writer.  The new words
 I proposed were PRESERVE or STABLE.

This seems to seriously limit the usefulness, though. You'll only want
to use EXCLUSIVE/PRESERVE/STABLE when you've got a specific set of DML
to do, that you know you can recover from. But if at the same time some
other part of the system could be doing what it thinks will be ACID DML
to that same table, you're now in trouble.

At a minimum that would need to be clearly spelled out in the docs. I
think it also makes a very strong use-case for exposing table-level
shared locks as well, since that would at least allow other backends to
continue reading from the table.

Idealistically, if EXCLUSIVE/PRESERVE/STABLE does it's thing by only
appending new pages, it would be nice if other backends could continue
performing updates at the same time, assuming there's free space
available elsewhere within the table (and that you'd be able to recover
those logged changes regardless of the non-logged operations). But
that's a pretty lofty goal...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Bruce Momjian

Jim C. Nasby wrote:
   I don't think it should (which implies that EXCLUSIVE is a bad name).
  
  Agreed, EXCLUSIVE was used to mean an _exclusive_ writer.  The new words
  I proposed were PRESERVE or STABLE.
 
 This seems to seriously limit the usefulness, though. You'll only want
 to use EXCLUSIVE/PRESERVE/STABLE when you've got a specific set of DML
 to do, that you know you can recover from. But if at the same time some
 other part of the system could be doing what it thinks will be ACID DML
 to that same table, you're now in trouble.
 
 At a minimum that would need to be clearly spelled out in the docs. I
 think it also makes a very strong use-case for exposing table-level
 shared locks as well, since that would at least allow other backends to
 continue reading from the table.

We would be creating a new lock type for this.

 Idealistically, if EXCLUSIVE/PRESERVE/STABLE does it's thing by only
 appending new pages, it would be nice if other backends could continue
 performing updates at the same time, assuming there's free space
 available elsewhere within the table (and that you'd be able to recover
 those logged changes regardless of the non-logged operations). But
 that's a pretty lofty goal...

Idealistically, yep.  It would be great if we could put a helmet on
and the computer would read your mind.  :-)

Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
happening is never going to be implemented because it is just too hard
to do, and too prone to error.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Jim C. Nasby

On Tue, Jan 03, 2006 at 04:20:47PM -0500, Bruce Momjian wrote:
 Jim C. Nasby wrote:
I don't think it should (which implies that EXCLUSIVE is a bad name).
   
   Agreed, EXCLUSIVE was used to mean an _exclusive_ writer.  The new words
   I proposed were PRESERVE or STABLE.
  
  This seems to seriously limit the usefulness, though. You'll only want
  to use EXCLUSIVE/PRESERVE/STABLE when you've got a specific set of DML
  to do, that you know you can recover from. But if at the same time some
  other part of the system could be doing what it thinks will be ACID DML
  to that same table, you're now in trouble.
  
  At a minimum that would need to be clearly spelled out in the docs. I
  think it also makes a very strong use-case for exposing table-level
  shared locks as well, since that would at least allow other backends to
  continue reading from the table.
 
 We would be creating a new lock type for this.

Sorry if I've just missed this in the thread, but what would  the new
lock type do? My impression is that as it stands you can either do:

BEGIN;
ALTER TABLE EXCLUSIVE;
...
ALTER TABLE SHARE; --fsync
COMMIT;

Which would block all other access to the table as soon as the first
ALTER TABLE happens. Or you can:

ALTER TABLE EXCLUSIVE;
...
ALTER TABLE SHARE;

Which means that between the two ALTER TABLES every backend that does
DML on that table will not have that DML logged, but because there's no
exclusive lock that DML would be allowed to occur.

BTW, there might be some usecase for the second scenario, in which case
it would probably be better to tell the user to aquire a table-lock on
their own rather than do it automatically as part of the update...

 Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
 happening is never going to be implemented because it is just too hard
 to do, and too prone to error.

What I figured. Never hurts to ask though. :)
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Bruce Momjian

Jim C. Nasby wrote:
  We would be creating a new lock type for this.
 
 Sorry if I've just missed this in the thread, but what would  the new
 lock type do? My impression is that as it stands you can either do:
 
 BEGIN;
 ALTER TABLE EXCLUSIVE;
 ...
 ALTER TABLE SHARE; --fsync
 COMMIT;
 
 Which would block all other access to the table as soon as the first
 ALTER TABLE happens. Or you can:
 
 ALTER TABLE EXCLUSIVE;
 ...
 ALTER TABLE SHARE;
 
 Which means that between the two ALTER TABLES every backend that does
 DML on that table will not have that DML logged, but because there's no
 exclusive lock that DML would be allowed to occur.

Right, the DML will be single-threaded and fsync of all dirty pages will
happen before commit of each transaction.

 BTW, there might be some usecase for the second scenario, in which case
 it would probably be better to tell the user to aquire a table-lock on
 their own rather than do it automatically as part of the update...

  Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
  happening is never going to be implemented because it is just too hard
  to do, and too prone to error.
 
 What I figured. Never hurts to ask though. :)

Actually, it does hurt because it generates discussion volume for no
purpose.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Simon Riggs

On Fri, 2005-12-30 at 16:14 -0500, Bruce Momjian wrote:
 Simon Riggs wrote:
  The implications of EXCLUSIVE are:
  - there will be a check on each and every I, U, D to check the state of
  the relation
  - *every* operation that attempts a write lock will attempt to acquire
  an EXCLUSIVE full table lock instead
  - following successful completion of *each* DML statement, the relation
  will be heap_sync'd involving a full scan of the buffer cache
 
 Yes, I think that is it.  What we can do is implement EXCLUSIVE to
 affect only COPY at this point, and document that, and later add other
 commands.
 
  Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
  How about FASTLOAD or BULKLOAD? Those words seem less likely to be
  misused in the future - i.e. we are invoking a special mode, rather than
  invoking a special go faster option.
 
 The problem with the FASTLOAD/BULKLOAD words is that EXCLUSIVE mode is
 probably not the best for loading.  I would think TRUNCATE would be a
 better option.
 
 In fact, in loading a table, I think both EXCLUSIVE and TRUNCATE would be
 the same, mostly.  You would create the table, set its RELIABILITY to
 TRUNCATE, COPY into the table, then set the RELIABILITY to SHARE or
 DEFAULT.  The second ALTER has to sync all the dirty data blocks, which
 the same thing EXCLUSIVE does at the conclusion of COPY.
 
 So, we need a name for EXCLUSIVE mode that suggests how it is different
 from TRUNCATE, and in this case, the difference is that EXCLUSIVE
 preserves the previous contents of the table on recovery, while TRUNCATE
 does not.  Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
 Anyway, the keywords are easy to modify, even after the patch is
 submitted.  FYI, I usually go through keywords.c looking for a keyword
 we already use.

I'm very happy for suggestions on what these new modes are called.

   So, to summarize, I think we should add DROP/TRUNCATE, and use that by
   default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
   for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
  
  Would you mind stating again what you mean, just so I can understand
  this? Your summary isn't enough.
 
 New ALTER TABLE mode, perhaps call it PERSISTENCE:
 
   ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
   ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
 
 These would drop or truncate all tables with this flag on a non-clean
 start of the postmaster, and write something in the server logs. 
 However, I don't know that we have the code in place to DROP/TRUNCATE in
 recovery mode, and it would affect all databases, so it could be quite
 complex to implement.  In this mode, no WAL logs would be written for
 table modifications, though DDL commands would have to be logged.

Right now, this will be a TODO item... it looks like it will take some
thought to implement correctly.

   ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
 
 Table contents are preserved across recoveries, but data modifications
 can happen only one at a time.  I don't think we have a lock mode that
 does this, so I am worried a new lock mode will have to be created.  A
 simplified solution at this stage would be to take an exclusive lock on
 the table, but really we just need a single-writer table lock, which I
 don't think we have. initially this can implemented to only affect COPY
 but later can be done for other commands. 

ExclusiveLock locks out everything apart from readers, no new lock mode
AFAICS. Implementing that is little additional work for COPY.

Tom had a concern about setting this for I, U, D commands via the
executor. Not sure what the details of that are, as yet.

We can use either of the unlogged modes for pg_dump, so I'd suggest its
this one. Everybody happy with this being the new default in pg_dump, or
should it be an option?

   ALTER TABLE tab PERSISTENCE DEFAULT
 
 This would be our current default mode, which is full concurrency and
 persistence.

I'm thinking whether the ALTER TABLE statement might be better with two
bool flags rather than a 3-state char.

flag 1: ENABLE LOGGING | DISABLE LOGGING

flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY

Giving 3 possible sets of options:

-- the default
ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)

-- EXCLUSIVE mode
ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
...which would be used like this
ALTER TABLE mytable DISABLE LOGGING;
COPY or other bulk data manipulation SQL
ALTER TABLE mytable ENABLE LOGGING;
...since FULL RECOVERY is the default.

-- multiuser temp table mode
ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
...which would usually be left on all the time

which only uses one new keyword LOGGING and yet all the modes are fairly
explicit as to what they do.

An alternative might be the slightly more verbose:
ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
which would be turned off by
ALTER

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Simon Riggs

On Tue, 2006-01-03 at 16:20 -0500, Bruce Momjian wrote:
 Jim C. Nasby wrote:

  Idealistically, if EXCLUSIVE/PRESERVE/STABLE does it's thing by only
  appending new pages, it would be nice if other backends could continue
  performing updates at the same time, assuming there's free space
  available elsewhere within the table (and that you'd be able to recover
  those logged changes regardless of the non-logged operations). But
  that's a pretty lofty goal...
 
 Idealistically, yep.  It would be great if we could put a helmet on
 and the computer would read your mind.  :-)
 
 Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
 happening is never going to be implemented because it is just too hard
 to do, and too prone to error.

The reason for locking the whole table was to ensure that we do not have
a mixture of logged and non-logged writers writing to the same data
blocks, since that could damage blocks unrecoverably in the event of a
crash. (Though perhaps only if full_block_writes is on)

The ALTER TABLE .. EXCLUSIVE/(insert name) mode would mean that *any*
backend who took a write lock on the table, would lock out the whole
table. So this new mode is not restricted to the job/user who ran the
ALTER TABLE command. (I would note that that is how Oracle and Teradata
do this for pre-load utility table locking, but why should we follow
them on that?)

Currently, when we add a new row when the FSM is empty, we check the
last block of the table. That would cause multiple writers to access the
same blocks and so we would be in danger. The only way to avoid that
would be for logged writers (who would use the FSM if it were not empty)
to notify back to the FSM that they have just added a block - and remove
the behaviour to look for the last block.

Anyway, one step at a time. *Maybe* we can do that in the future, but
right now I'd like to add the basic fast write/load functionality.

Also, I think I will do the docs first this time, just so everyone can
read what we're getting ahead of time, to ensure we all agree.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Simon Riggs

On Tue, 2006-01-03 at 17:38 -0500, Bruce Momjian wrote:

 Right, the DML will be single-threaded and fsync of all dirty pages will
 happen before commit of each transaction.

heap_sync() would occur at end of statement, as it does with CTAS. We
could delay until EOT but I'm not sure I see why; in most cases they'd
be the same point anyway.

I'd been toying with the idea of making the freshly added blocks live
only in temp_buffers to avoid the shared_buffers overhead, but that was
starting to sounds too wierd for my liking.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Simon Riggs

On Sat, 2005-12-31 at 12:59 +0100, Michael Paesold wrote:
 Bruce Momjian wrote:
 
   The --single-transaction mode would apply even if the dump was created
   using an earlier version of pg_dump. pg_dump has *not* been altered at
   all. (And I would again add that the idea was not my own)
  
  I assume you mean this:
  
  http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
  
  I guess with the ALTER commands I don't see much value in the
  --single-transaction flag.  I am sure others suggested it, but would
  they suggest it now given our current direction.
 
 I just want to add that --single-transaction has a value of it's own. There
 were times when I wanted to restore parts of a dump all-or-nothing. 
 
 This is possible with PostgreSQL, unlike many other DBM systems, because
 people like Tom Lane have invested in ensuring that all DDL is working
 without implicitly committing an enclosing transaction.
 
 Using pg_restore directly into a database, it is not possible to get a
 single transaction right now. One has to restore to a file and manually
 added BEGIN/COMMIT. Just for that I think --single-transaction is a great
 addition and a missing feature.
 
 I think more people have a use-case for that.

I did originally separate the --single-transaction patch for this
reason. I think its a valid patch on its own and its wrapped and ready
to go, with some deletions from the doc patch.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Simon Riggs

On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
 Having COPY behave differently because it is
 in a transaction is fine as long as it is user-invisible, but once you
 require users to do that to get the speedup, it isn't user-invisible
 anymore.

Since we're agreed on adding ALTER TABLE rather than COPY LOCK, we have
our explicit mechanism for speedup.

However, it costs a single line of code and very very little execution
time to add in the optimization to COPY to make it bypass WAL when
executed in the same transaction that created the table. Everything else
is already there.

As part of the use_wal test:
+   if (resultRelInfo-ri_NumIndices == 0  
+ !XLogArchivingActive()
 (cstate-rel-rd_createSubid != InvalidSubTransactionId ))
+ use_wal = false;

the value is already retrieved from cache...

Can anyone see a reason *not* to put that change in also? We just don't
advertise it as the suggested route to gaining performance, nor would
we rely on it for pg_dump/restore performance. 

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Josh Berkus

Bruce,

   Basically meaning your idea of update while
   EXCLUSIVE/PRESERVE/STABLE is happening is never going to be
   implemented because it is just too hard to do, and too prone to
   error.
 
  What I figured. Never hurts to ask though. :):)

 Actually, it does hurt because it generates discussion volume for no
 purpose.

Zowie!!

Surely you didn't mean that the way it sounded?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2006-01-03 Thread Bruce Momjian

Josh Berkus wrote:
 Bruce,
 
Basically meaning your idea of update while
EXCLUSIVE/PRESERVE/STABLE is happening is never going to be
implemented because it is just too hard to do, and too prone to
error.
  
   What I figured. Never hurts to ask though. :):)
 
  Actually, it does hurt because it generates discussion volume for no
  purpose.
 
 Zowie!!
 
 Surely you didn't mean that the way it sounded?

Actually, I did.  Throwing out random ideas that have little useful
purpose and just confuse the discussion is not helpful.

Wouldn't it be nice if PostgreSQL allowed commands to be typed
backwards so people could program by looking at the screen through a
mirror is not helpful.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-31 Thread Michael Paesold

Bruce Momjian wrote:

  The --single-transaction mode would apply even if the dump was created
  using an earlier version of pg_dump. pg_dump has *not* been altered at
  all. (And I would again add that the idea was not my own)
 
 I assume you mean this:
 
   http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
 
 I guess with the ALTER commands I don't see much value in the
 --single-transaction flag.  I am sure others suggested it, but would
 they suggest it now given our current direction.

I just want to add that --single-transaction has a value of it's own. There
were times when I wanted to restore parts of a dump all-or-nothing. 

This is possible with PostgreSQL, unlike many other DBM systems, because
people like Tom Lane have invested in ensuring that all DDL is working
without implicitly committing an enclosing transaction.

Using pg_restore directly into a database, it is not possible to get a
single transaction right now. One has to restore to a file and manually
added BEGIN/COMMIT. Just for that I think --single-transaction is a great
addition and a missing feature.

I think more people have a use-case for that.

Best Regards,
Michael Paesold

-- 
Telefonieren Sie schon oder sparen Sie noch?
NEU: GMX Phone_Flat http://www.gmx.net/de/go/telefonie

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-31 Thread August Zajonc

As a user and a list lurker I very much like Bruce's proposed ALTER 
TABLE syntax. COPY LOCK (and the variants I can imagine being required 
for all the other types of cases) don't seem as appealing.


And ALTER TABLE seems to make it clear it is an object level change, 
feels like it fits the internal model of the change better.


As a user a quick note that I've really found the documentation strong, 
which makes a big difference. Probably a low glory thing but much 
appreciated.


Big props everyone I see posting for keeping an eye (and code) on the 
various optimizations, a lot of the use cases are ones I can relate too 
(beyond even the ETL ones). I'm always impressed at the range of areas 
folks are looking at improving, and hope to see it encouraged, even with 
something as useless as thanks :)


To a good new year... and a great past one.

- August

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Simon Riggs

On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
 Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   My view would be that this thread has been complex because everybody has
   expressed a somewhat different requirement, which could be broken down
   as:
   1. The need for a multi-user-accessible yet temporary table
   2. Loading data into a table immediately after it is created (i.e. in
   same transaction), including but not limited to a reload from pg_dump
   3. How to load data quickly into an existing table (COPY)
   4. How to add/modify data quickly in an existing table (INSERT SELECT,
   UPDATE)

  However, you then seem to be arguing for still using the COPY LOCK
  syntax, which I think Bruce intended would go away in favor of using
  these ALTER commands.  Certainly that's what I'd prefer --- COPY has
  got too darn many options already.

COPY LOCK was Tom's suggestion at the end of a long discussion thread on
this precise issue. Nobody objected to it at that point; I implemented
it *exactly* that way because I wanted to very visibly follow the
consensus of the community, after informed debate.
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00068.php

Please re-read the links to previous discussions.
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
There are points there, not made by me, that still apply and need to be
considered here, yet have not been.

Just to restate my current thinking:
- agree we should have ALTER TABLE ... RELIABILITY DELETE ROWS
- we should have COPY LOCK rather than 
ALTER TABLE  RELIABILITY EXCLUSIVE
(Though I welcome better wording and syntax in either case; it is the
behaviour only that I discuss).

It seems now that we have agreed approaches for (1), (2) and (4). Please
note that I have listened to the needs of others with regard to
requirement (1), as espoused by earlier by Hannu and again now by
Martijn. Some of the points about requirement (3) I made in my previous
post have not yet been addressed, IMHO.

My mind is not fixed. AFAICS there are valid points remaining on both
sides of the discussion about loading data quickly into an existing
table.

 I do think it is valid concern about someone use the table between the
 CREATE and the ALTER TABLE RELIABILITY.  One solution would be to allow
 the RELIABILITY as part of the CREATE TABLE, another is to tell users to
 create the table inside a transaction.

Neither solution works for this use case:

  3. How to load data quickly into an existing table (COPY)

This is the only use case for which ALTER TABLE ... EXCLUSIVE makes
sense. That option means that any write lock held upon the table would
be an EXCLUSIVE table lock, so would never be a performance gain with
single row INSERT, UPDATE or DELETEs. 

Following Andrew's concerns, I'd also note that ALTER TABLE requires a
much higher level of privilege to operate than does COPY. That sounds
like it will make things more secure, but all it does is open up the
administrative rights, since full ownership rights must be obtained
merely to load data. 

 Having COPY behave differently because it is
 in a transaction is fine as long as it is user-invisible

Good

 I think there is great utility in giving users one API, namely
 RELIABILITY (or some other keyword), and telling them that is where they
 control logging.  I realize adding one keyword, LOCK, to an existing
 command isn't a big deal, but once you decentralize your API enough
 times, you end up with a terribly complex database system.  It is this
 design rigidity that helps make PostgreSQL so much easier to use than
 other database systems.

I do see the appeal of your suggestion...

TRUNCATE is a special command to delete quickly. There is no requirement
to do an ALTER TABLE statement before that command executes.

Balance would suggest that a special command to load data quickly would
be reasonably accepted by users.




Minor points below:

   In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
   pg_dump *will* work with any other version of PostgreSQL, which *would
   not* be the case if we added ALTER TABLE ... RELIABILITY statements into
   it.
  
  Wrong --- the good thing about ALTER TABLE is that an old version of
  Postgres would simply reject it and keep going.  Therefore we could get
  the speedup in dumps without losing compatibility, which is not true
  of COPY LOCK.

That was pointing out one of Bruce's objections was not relevant because
it assumed COPY LOCK was required to make pg_restore go faster; that was
not the case - so there is no valid objection either way now.

  BTW, this is a perfect example of the use-case for not abandoning a
  dump-file load simply because one command fails.  (We have relied on
  this sort of reasoning many times before, too, for example by using
  SET default_with_oids in preference to CREATE TABLE WITH/WITHOUT OIDS.)
  I don't think that wrap the whole load into begin/end is really a very
  workable

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Andrew Dunstan

Simon Riggs said:


 Following Andrew's concerns, I'd also note that ALTER TABLE requires a
 much higher level of privilege to operate than does COPY. That sounds
 like it will make things more secure, but all it does is open up the
 administrative rights, since full ownership rights must be obtained
 merely to load data.


Yeah. And since a role can own a table you could have a role and add lots of
users to it 

My concern is more about making plain that this is for special operations,
not normal operations. Or maybe I have misunderstood the purpose.

cheers

andrew





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Tom Lane

Andrew Dunstan [EMAIL PROTECTED] writes:
 Simon Riggs said:
 Following Andrew's concerns, I'd also note that ALTER TABLE requires a
 much higher level of privilege to operate than does COPY. That sounds
 like it will make things more secure, but all it does is open up the
 administrative rights, since full ownership rights must be obtained
 merely to load data.

 My concern is more about making plain that this is for special operations,
 not normal operations. Or maybe I have misunderstood the purpose.

Rephrase that as full ownership rights must be obtained to load data in
a way that requires dropping any existing indexes and locking out other
users of the table.  I don't think the use-case for this will be very
large for non-owners, or indeed even for owners except during initial
table creation; and so I don't think the above argument is strong.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Bruce Momjian

Simon Riggs wrote:
 On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
  Tom Lane wrote:
   Simon Riggs [EMAIL PROTECTED] writes:
My view would be that this thread has been complex because everybody has
expressed a somewhat different requirement, which could be broken down
as:
1. The need for a multi-user-accessible yet temporary table
2. Loading data into a table immediately after it is created (i.e. in
same transaction), including but not limited to a reload from pg_dump
3. How to load data quickly into an existing table (COPY)
4. How to add/modify data quickly in an existing table (INSERT SELECT,
UPDATE)
 
   However, you then seem to be arguing for still using the COPY LOCK
   syntax, which I think Bruce intended would go away in favor of using
   these ALTER commands.  Certainly that's what I'd prefer --- COPY has
   got too darn many options already.
 
 COPY LOCK was Tom's suggestion at the end of a long discussion thread on
 this precise issue. Nobody objected to it at that point; I implemented
 it *exactly* that way because I wanted to very visibly follow the
 consensus of the community, after informed debate.
 http://archives.postgresql.org/pgsql-hackers/2005-06/msg00068.php
 
 Please re-read the links to previous discussions.
 http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
 There are points there, not made by me, that still apply and need to be
 considered here, yet have not been.

Yes, I know we agreed to the COPY LOCK, but new features now being
requested, so we have to re-evaluate where we are going with COPY LOCK
to get a more consistent solution.

 Just to restate my current thinking:
 - agree we should have ALTER TABLE ... RELIABILITY DELETE ROWS
 - we should have COPY LOCK rather than 
 ALTER TABLE  RELIABILITY EXCLUSIVE
 (Though I welcome better wording and syntax in either case; it is the
 behaviour only that I discuss).
 
 It seems now that we have agreed approaches for (1), (2) and (4). Please
 note that I have listened to the needs of others with regard to
 requirement (1), as espoused by earlier by Hannu and again now by
 Martijn. Some of the points about requirement (3) I made in my previous
 post have not yet been addressed, IMHO.
 
 My mind is not fixed. AFAICS there are valid points remaining on both
 sides of the discussion about loading data quickly into an existing
 table.
 
  I do think it is valid concern about someone use the table between the
  CREATE and the ALTER TABLE RELIABILITY.  One solution would be to allow
  the RELIABILITY as part of the CREATE TABLE, another is to tell users to
  create the table inside a transaction.
 
 Neither solution works for this use case:
 
   3. How to load data quickly into an existing table (COPY)
 
 This is the only use case for which ALTER TABLE ... EXCLUSIVE makes
 sense. That option means that any write lock held upon the table would
 be an EXCLUSIVE table lock, so would never be a performance gain with
 single row INSERT, UPDATE or DELETEs. 

Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
EXCLUSIVE too.  What about a massive UPDATE?  Perhaps that could use
EXCLUSIVE?  We don't want to add LOCK to every command that might use
EXCLUSIVE.  ALTER is much better for this.

I agree if we thought EXCLUSIVE would only be used for COPY, we could
use LOCK, but I am thinking it will be used for other commands as well.

 Following Andrew's concerns, I'd also note that ALTER TABLE requires a
 much higher level of privilege to operate than does COPY. That sounds
 like it will make things more secure, but all it does is open up the
 administrative rights, since full ownership rights must be obtained
 merely to load data. 

True, but as pointed out by others, I don't see that happening too
often.

  Having COPY behave differently because it is
  in a transaction is fine as long as it is user-invisible
 
 Good
 
  I think there is great utility in giving users one API, namely
  RELIABILITY (or some other keyword), and telling them that is where they
  control logging.  I realize adding one keyword, LOCK, to an existing
  command isn't a big deal, but once you decentralize your API enough
  times, you end up with a terribly complex database system.  It is this
  design rigidity that helps make PostgreSQL so much easier to use than
  other database systems.
 
 I do see the appeal of your suggestion...
 
 TRUNCATE is a special command to delete quickly. There is no requirement
 to do an ALTER TABLE statement before that command executes.

The TRUNCATE happens during recovery.  There is no user interaction.  It
happens because we can't restore the contents of the table in a
consistent state because no logging was used.  Basically, a table marked
RELIABILITY TRUNCATE would be truncated on a recovery start of the
postmaster.

 Balance would suggest that a special command to load data quickly would
 be reasonably accepted by users.
 
 
 
 
 Minor points below:

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Andrew Dunstan




Tom Lane wrote:


Andrew Dunstan [EMAIL PROTECTED] writes:
 


Simon Riggs said:
   


Following Andrew's concerns, I'd also note that ALTER TABLE requires a
much higher level of privilege to operate than does COPY. That sounds
like it will make things more secure, but all it does is open up the
administrative rights, since full ownership rights must be obtained
merely to load data.
 



 


My concern is more about making plain that this is for special operations,
not normal operations. Or maybe I have misunderstood the purpose.
   



Rephrase that as full ownership rights must be obtained to load data in
a way that requires dropping any existing indexes and locking out other
users of the table.  I don't think the use-case for this will be very
large for non-owners, or indeed even for owners except during initial
table creation; and so I don't think the above argument is strong.


 



Those restrictions aren't true of Bruce's proposed drop and
delete/truncate recovery modes, are they?

People do crazy things in pursuit of performance. Illustration: a few
months ago I was instrumenting an app (based on MySQL/ISAM) and I
noticed that under load it simply didn't update the inventory properly -
of 1000 orders placed within a few seconds it might reduce inventory by
3 or 4. I reported this and they shrugged their shoulders and said
well, we'd have to lock the table and that would slow everything down


I just want to be sure we aren't providing a footgun. Oh, just set
recovery mode to delete. It won't make any difference unless you crash
and you'll run faster.

cheers

andrew




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Bruce Momjian

Andrew Dunstan wrote:
 My concern is more about making plain that this is for special operations,
 not normal operations. Or maybe I have misunderstood the purpose.
 
 
 
 Rephrase that as full ownership rights must be obtained to load data in
 a way that requires dropping any existing indexes and locking out other
 users of the table.  I don't think the use-case for this will be very
 large for non-owners, or indeed even for owners except during initial
 table creation; and so I don't think the above argument is strong.
 
  
   
 
 
 Those restrictions aren't true of Bruce's proposed drop and
 delete/truncate recovery modes, are they?

Only the owner could do the ALTER, for sure, but once the owner sets it,
any user with permission to write to the table would have those
characteristics.

 People do crazy things in pursuit of performance. Illustration: a few
 months ago I was instrumenting an app (based on MySQL/ISAM) and I
 noticed that under load it simply didn't update the inventory properly -
 of 1000 orders placed within a few seconds it might reduce inventory by
 3 or 4. I reported this and they shrugged their shoulders and said
 well, we'd have to lock the table and that would slow everything down
 
 
 I just want to be sure we aren't providing a footgun. Oh, just set
 recovery mode to delete. It won't make any difference unless you crash
 and you'll run faster.

I think we have to trust the object owner in this case.  I don't know of
any super-user-only ALTER commands, but I suppose we could set it up
that way if we wanted.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Simon Riggs

On Fri, 2005-12-30 at 11:49 -0500, Bruce Momjian wrote:

 Yes, I know we agreed to the COPY LOCK, but new features now being
 requested, so we have to re-evaluate where we are going with COPY LOCK
 to get a more consistent solution.

Thank you. 

 Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
 EXCLUSIVE too.  What about a massive UPDATE?  Perhaps that could use
 EXCLUSIVE?  We don't want to add LOCK to every command that might use
 EXCLUSIVE.  ALTER is much better for this.

 I agree if we thought EXCLUSIVE would only be used for COPY, we could
 use LOCK, but I am thinking it will be used for other commands as well.

Agreed, I will look to implement this.

Could the internals of my recent patch be reviewed? Changing the user
interface is less of a problem than changing the internals, which is
where the hard work takes place. I do not want to extend this work
further only to have that part rejected later. 

The implications of EXCLUSIVE are:
- there will be a check on each and every I, U, D to check the state of
the relation
- *every* operation that attempts a write lock will attempt to acquire
an EXCLUSIVE full table lock instead
- following successful completion of *each* DML statement, the relation
will be heap_sync'd involving a full scan of the buffer cache

Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
How about FASTLOAD or BULKLOAD? Those words seem less likely to be
misused in the future - i.e. we are invoking a special mode, rather than
invoking a special go faster option.

 I don't consider the single-transaction to be a no-cost solution.  You
 are adding flags to commands, and you are using a dump layout for
 performance where the purpose for the layout is not clear.  The ALTER is
 clear to the user, and it allows nologging operations to happen after
 the table is created.
 
 In fact, for use in pg_dump, I think DROP is the proper operation for
 loading, not your transaction wrapping solution.  We already agree we
 need DROP (or TRUNCATE), so why not use that rather than the transaction
 wrap idea?

This was discussed on-list by 2 core team members, a committer and
myself, but I see no requirements change here. You even accepted the
invisible COPY optimization in your last post - why unpick that now?
Please forgive my tone, but I am lost for reasonable yet expressive
words. 

The --single-transaction mode would apply even if the dump was created
using an earlier version of pg_dump. pg_dump has *not* been altered at
all. (And I would again add that the idea was not my own)

 So, to summarize, I think we should add DROP/TRUNCATE, and use that by
 default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
 for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.

Would you mind stating again what you mean, just so I can understand
this? Your summary isn't enough.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Bruce Momjian

Simon Riggs wrote:
 On Fri, 2005-12-30 at 11:49 -0500, Bruce Momjian wrote:
 
  Yes, I know we agreed to the COPY LOCK, but new features now being
  requested, so we have to re-evaluate where we are going with COPY LOCK
  to get a more consistent solution.
 
 Thank you. 

Good.  I think we can be happy that COPY LOCK didn't get into a release,
so we don't have to support it forever.  When we are adding features, we
have to consider not only the current release, but future releases and
what people will ask for in the future so the syntax can be expanded
without breaking previous usage.

  Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
  EXCLUSIVE too.  What about a massive UPDATE?  Perhaps that could use
  EXCLUSIVE?  We don't want to add LOCK to every command that might use
  EXCLUSIVE.  ALTER is much better for this.
 
  I agree if we thought EXCLUSIVE would only be used for COPY, we could
  use LOCK, but I am thinking it will be used for other commands as well.
 
 Agreed, I will look to implement this.
 
 Could the internals of my recent patch be reviewed? Changing the user
 interface is less of a problem than changing the internals, which is
 where the hard work takes place. I do not want to extend this work
 further only to have that part rejected later. 

OK, I will look it over this week or next.

 The implications of EXCLUSIVE are:
 - there will be a check on each and every I, U, D to check the state of
 the relation
 - *every* operation that attempts a write lock will attempt to acquire
 an EXCLUSIVE full table lock instead
 - following successful completion of *each* DML statement, the relation
 will be heap_sync'd involving a full scan of the buffer cache

Yes, I think that is it.  What we can do is implement EXCLUSIVE to
affect only COPY at this point, and document that, and later add other
commands.

 Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
 How about FASTLOAD or BULKLOAD? Those words seem less likely to be
 misused in the future - i.e. we are invoking a special mode, rather than
 invoking a special go faster option.

The problem with the FASTLOAD/BULKLOAD words is that EXCLUSIVE mode is
probably not the best for loading.  I would think TRUNCATE would be a
better option.

In fact, in loading a table, I think both EXCLUSIVE and TRUNCATE would be
the same, mostly.  You would create the table, set its RELIABILITY to
TRUNCATE, COPY into the table, then set the RELIABILITY to SHARE or
DEFAULT.  The second ALTER has to sync all the dirty data blocks, which
the same thing EXCLUSIVE does at the conclusion of COPY.

So, we need a name for EXCLUSIVE mode that suggests how it is different
from TRUNCATE, and in this case, the difference is that EXCLUSIVE
preserves the previous contents of the table on recovery, while TRUNCATE
does not.  Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
Anyway, the keywords are easy to modify, even after the patch is
submitted.  FYI, I usually go through keywords.c looking for a keyword
we already use.

  I don't consider the single-transaction to be a no-cost solution.  You
  are adding flags to commands, and you are using a dump layout for
  performance where the purpose for the layout is not clear.  The ALTER is
  clear to the user, and it allows nologging operations to happen after
  the table is created.
  
  In fact, for use in pg_dump, I think DROP is the proper operation for
  loading, not your transaction wrapping solution.  We already agree we
  need DROP (or TRUNCATE), so why not use that rather than the transaction
  wrap idea?
 
 This was discussed on-list by 2 core team members, a committer and
 myself, but I see no requirements change here. You even accepted the
 invisible COPY optimization in your last post - why unpick that now?
 Please forgive my tone, but I am lost for reasonable yet expressive
 words. 

Do you think you are the only one who has rewritten a patch multiple
times?  We all have.  The goal is to get the functionality into the
system in the most seamless way possible.  Considering the number of
people who use PostgreSQL, if it takes use 10 tries, it is worth it
considering the thousands of people who will use it.   Would you have us
include a sub-optimal patch and have thousands of people adjust to its
non-optimal functionality?  I am sure you would not.  Perhaps a company
would say, Oh, just ship it, but we don't.

 The --single-transaction mode would apply even if the dump was created
 using an earlier version of pg_dump. pg_dump has *not* been altered at
 all. (And I would again add that the idea was not my own)

I assume you mean this:

http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php

I guess with the ALTER commands I don't see much value in the
--single-transaction flag.  I am sure others suggested it, but would
they suggest it now given our current direction.  The fact that the
patch was submitted does not give it any more weight --- the question is

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Greg Stark


As far as EXCLUSIVE or COPY LOCK goes, I think this would be useful
functionality but perhaps there doesn't have to be any proprietary user
interface to it at all. Why not just check if the conditions are already
present to allow the optimization and if so go ahead.

That is, if the current transaction already has an exclusive lock on the table
and there are no indexes (and PITR isn't active) then Postgres could go ahead
and use the same WAL skipping logic as the other operations that already so
so. This would work for inserts whether coming from COPY or plain SQL INSERTs.

The nice thing about this is that the user's SQL wouldn't need any proprietary
extensions at all. Just tell people to do

BEGIN;
LOCK TABLE foo;
COPY foo from ...
COMMIT;

There could be a COPY LOCK option to obtain a lock, but it would be purely for
user convenience so they don't have to bother with BEGIN and COMMIt.

The only downside is a check to see if an exclusive table lock is present on
every copy and insert. That might be significant but perhaps there are ways to
finess that. If not perhaps only doing it on COPY would be a good compromise.

-- 
greg


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Bruce Momjian

Greg Stark wrote:
 
 As far as EXCLUSIVE or COPY LOCK goes, I think this would be useful
 functionality but perhaps there doesn't have to be any proprietary user
 interface to it at all. Why not just check if the conditions are already
 present to allow the optimization and if so go ahead.
 
 That is, if the current transaction already has an exclusive lock on the table
 and there are no indexes (and PITR isn't active) then Postgres could go ahead
 and use the same WAL skipping logic as the other operations that already so
 so. This would work for inserts whether coming from COPY or plain SQL INSERTs.
 
 The nice thing about this is that the user's SQL wouldn't need any proprietary
 extensions at all. Just tell people to do
 
 BEGIN;
 LOCK TABLE foo;
 COPY foo from ...
 COMMIT;
 
 There could be a COPY LOCK option to obtain a lock, but it would be purely for
 user convenience so they don't have to bother with BEGIN and COMMIt.
 
 The only downside is a check to see if an exclusive table lock is present on
 every copy and insert. That might be significant but perhaps there are ways to
 finess that. If not perhaps only doing it on COPY would be a good compromise.

Well, again, if we wanted to use EXCLUSIVE only for COPY, this might
make sense.  However, also consider that the idea for EXCLUSIVE was that
users could continue read-only queries on the table while it is being
loaded (like COPY allows now), and that in EXCLUSIVE mode, we are only
going to write into new pages.  

If someone has an exclusive lock on the table and does a COPY or SELECT
INTO do we want to assume we are only going to write into new pages, and
do we want to force an exclusive lock rather than a single-writer lock? 
I don't think so.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Simon Riggs

On Fri, 2005-12-30 at 16:14 -0500, Bruce Momjian wrote:

  This was discussed on-list by 2 core team members, a committer and
  myself, but I see no requirements change here. You even accepted the
  invisible COPY optimization in your last post - why unpick that now?
  Please forgive my tone, but I am lost for reasonable yet expressive
  words. 
 
 Do you think you are the only one who has rewritten a patch multiple
 times?  We all have.  The goal is to get the functionality into the
 system in the most seamless way possible.  Considering the number of
 people who use PostgreSQL, if it takes use 10 tries, it is worth it
 considering the thousands of people who will use it.   Would you have us
 include a sub-optimal patch and have thousands of people adjust to its
 non-optimal functionality?  I am sure you would not.  Perhaps a company
 would say, Oh, just ship it, but we don't.

You're right. 

Not like we've not been here before, eh?

[I'll look at the tech another day]

Best Regards, Simon Riggs






---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Bruce Momjian

Simon Riggs wrote:
 On Fri, 2005-12-30 at 16:14 -0500, Bruce Momjian wrote:
 
   This was discussed on-list by 2 core team members, a committer and
   myself, but I see no requirements change here. You even accepted the
   invisible COPY optimization in your last post - why unpick that now?
   Please forgive my tone, but I am lost for reasonable yet expressive
   words. 
  
  Do you think you are the only one who has rewritten a patch multiple
  times?  We all have.  The goal is to get the functionality into the
  system in the most seamless way possible.  Considering the number of
  people who use PostgreSQL, if it takes use 10 tries, it is worth it
  considering the thousands of people who will use it.   Would you have us
  include a sub-optimal patch and have thousands of people adjust to its
  non-optimal functionality?  I am sure you would not.  Perhaps a company
  would say, Oh, just ship it, but we don't.
 
 You're right. 
 
 Not like we've not been here before, eh?
 
 [I'll look at the tech another day]

I know it is discouraging. I have felt it many times myself.  However, I
have to keep my eye on the greater good that we are doing as a project,
and that my frustration is a small price to pay for the greater
usability we will give to our users.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Greg Stark

Bruce Momjian pgman@candle.pha.pa.us writes:

  BEGIN;
  LOCK TABLE foo;
  COPY foo from ...
  COMMIT;
  
  There could be a COPY LOCK option to obtain a lock, but it would be purely 
  for
  user convenience so they don't have to bother with BEGIN and COMMIt.
  
  The only downside is a check to see if an exclusive table lock is present on
  every copy and insert. That might be significant but perhaps there are ways 
  to
  finess that. If not perhaps only doing it on COPY would be a good 
  compromise.
 
 Well, again, if we wanted to use EXCLUSIVE only for COPY, this might
 make sense.  However, also consider that the idea for EXCLUSIVE was that
 users could continue read-only queries on the table while it is being
 loaded (like COPY allows now), and that in EXCLUSIVE mode, we are only
 going to write into new pages.  

Well I pictured the above kicking in for any insert. You can't do it on
deletes and updates anyways since torn pages could cause the table to become
corrupt.

We could add a LOCK TABLE SHARED feature to allow the appropriate type of lock
to be acquired.

But now that I think further on this that doesn't really make this free.
fsyncing a table isn't free. If some other transaction has come and done some
massive updates on the table then I come along and do a single quick insert I
don't necessarily want to fsync all those pending writes, it's cheaper to
fsync the WAL log.

 If someone has an exclusive lock on the table and does a COPY or SELECT
 INTO do we want to assume we are only going to write into new pages, and
 do we want to force an exclusive lock rather than a single-writer lock? 
 I don't think so.

And only using new pages is itself a cost as well. Though I think the fact
that it would tend to mean a lot less seeking and more sequential i/o would
tend to make it a worth the extra garbage in the table.

It might be useful having some kind of LOCK TABLE SHARED anyways. It seems
silly to have the functionality in the database and not expose it for users.

-- 
greg


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-30 Thread Bruce Momjian

Greg Stark wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
 
   BEGIN;
   LOCK TABLE foo;
   COPY foo from ...
   COMMIT;
   
   There could be a COPY LOCK option to obtain a lock, but it would be 
   purely for
   user convenience so they don't have to bother with BEGIN and COMMIt.
   
   The only downside is a check to see if an exclusive table lock is present 
   on
   every copy and insert. That might be significant but perhaps there are 
   ways to
   finess that. If not perhaps only doing it on COPY would be a good 
   compromise.
  
  Well, again, if we wanted to use EXCLUSIVE only for COPY, this might
  make sense.  However, also consider that the idea for EXCLUSIVE was that
  users could continue read-only queries on the table while it is being
  loaded (like COPY allows now), and that in EXCLUSIVE mode, we are only
  going to write into new pages.  
 
 Well I pictured the above kicking in for any insert. You can't do it on
 deletes and updates anyways since torn pages could cause the table to become
 corrupt.

True, but UPDATE could save 1/2 the log I/O because the new rows would
not have to be logged.

 We could add a LOCK TABLE SHARED feature to allow the appropriate type of lock
 to be acquired.

Do you really want this behavior to be controlled by the locking mode? 
That seem strange to me, as well as confusing, especially if you already
have some other kind of lock on the table.

 But now that I think further on this that doesn't really make this free.
 fsyncing a table isn't free. If some other transaction has come and done some
 massive updates on the table then I come along and do a single quick insert I
 don't necessarily want to fsync all those pending writes, it's cheaper to
 fsync the WAL log.

That is true.

  If someone has an exclusive lock on the table and does a COPY or SELECT
  INTO do we want to assume we are only going to write into new pages, and
  do we want to force an exclusive lock rather than a single-writer lock? 
  I don't think so.
 
 And only using new pages is itself a cost as well. Though I think the fact
 that it would tend to mean a lot less seeking and more sequential i/o would
 tend to make it a worth the extra garbage in the table.
 
 It might be useful having some kind of LOCK TABLE SHARED anyways. It seems
 silly to have the functionality in the database and not expose it for users.

We could if there is a use-case for it.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Simon Riggs

On Wed, 2005-12-28 at 20:58 -0500, Bruce Momjian wrote:
 Having read through this thread, I would like to propose a
 syntax/behavior.
 
 I think we all now agree that the logging is more part of the table than
 the command itself.  Right now we have a COPY LOCK patch, but people are
 going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
 and all sorts of other things, so I think we are best adding an ALTER
 TABLE capability.  I am thinking of this syntax:
 
   ALTER TABLE name RELIABILITY option
 
 where option is:
 
   DROP [ TABLE ON CRASH ]
   DELETE [ ROWS ON CRASH ]
   EXCLUSIVE
   SHARE
 
 Let me explain each option.  DROP would drop the table on a restart
 after a non-clean shutdown.  It would do _no_ logging on the table and
 allow concurrent access, plus index access.  DELETE is the same as DROP,
 but it just truncates the table (perhaps TRUNCATE is a better word).
 
 EXCLUSIVE would allow only a single session to modify the table, and
 would do all changes by appending to the table, similar to COPY LOCK. 
 EXCLUSIVE would also not allow indexes because those can not be isolated
 like appending to the heap.  EXCLUSIVE would write all dirty shared
 buffers for the table and fsync them before committing.  SHARE is the
 functionality we have now, with full logging.
 
 Does this get us any closer to a TODO item?  It isn't great, but I think
 it is pretty clear, and I assume pg_dump would use ALTER to load each
 table.  The advanage is that the COPY statements themselves are
 unchanged so they would work in loading into older versions of
 PostgreSQL.

First off, thanks for summarising a complex thread.

My view would be that this thread has been complex because everybody has
expressed a somewhat different requirement, which could be broken down
as:
1. The need for a multi-user-accessible yet temporary table
2. Loading data into a table immediately after it is created (i.e. in
same transaction), including but not limited to a reload from pg_dump
3. How to load data quickly into an existing table (COPY)
4. How to add/modify data quickly in an existing table (INSERT SELECT,
UPDATE)

I can see the need for all of those individually; my existing patch
submission covers (2) and (3) only. I very much like your thought to
coalesce these various requirements into a single coherent model.

For requirement (1), table level options make sense. We would:
- CREATE TABLE ALLTHINGS
- ALTER TABLE ALLTHINGS RELIABILITY DELETE ROWS ON RECOVERY
- lots of SQL, all fast because not logged

(2) is catered for adequately by the existing COPY patch i.e. it will
detect whether a table has just been created and then avoid writing WAL.
In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
pg_dump *will* work with any other version of PostgreSQL, which *would
not* be the case if we added ALTER TABLE ... RELIABILITY statements into
it. Also, a pg_dump created at an earlier version could also be loaded
faster using the patch. The only requirement is to issue all SQL as part
of the same transaction - which is catered for by the
--single-transaction option on pg_restore and psql. So (2) is catered
for fully without the need for an ALTER TABLE ... RELIABILITY statement
or COPY LOCK.

For requirement (3), I would use table level options like this:
(the table already exists and is reasonably big; we should not assume
that everybody can and does use partitioning)
- ALTER TABLE RELIABILITY ALLTHINGS2 EXCLUSIVE
- COPY
- ALTER TABLE RELIABILITY ALLTHINGS2 SHARE

For a load into an existing table I would always do all three actions
together. COPY LOCK does exactly that *and* does it atomically. 

The two ways of doing (3) have a few pros/cons either way:
Pro for ALTER TABLE:
- same syntax as req (1)
- doesn't need the keyword LOCK 
- allows INSERT SELECT, UPDATE operations also (req 4)
Cons:
- existing programs have to add additional statements to take advantage
of this; with COPY LOCK we would add just a single keyword
- operation is not atomic, which might lead to some operations waiting
for a lock to operate as unlogged, since they would execute before the
second ALTER TABLE gets there
- operation will be understood by some, but not others. They will forget
to switch the RELIABILITY back on and then lose their whole table when
the database crashes. (watch...)

...but would it be a problem to have both?


So, my thinking would be to separate things into two:
a) Add a TODO item shared temp tables that caters for (1) and (4)

ALTER TABLE name RELIABILITY 
{DELETE ROWS AT RECOVERY | FULL RECOVERY}
(syntax TBD)

which would 
- truncate all rows and remove all index entries during recovery
- use shared_buffers, not temp_buffers
- never write xlog records, even when in PITR mode
- would avoid writing WAL for both heap *and* index tuples

b) Leave the COPY patch as is, since it caters for reqs (2) and (3) as
*separate* optimizations (but using a common infrastructure in

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Rod Taylor


 So, my thinking would be to separate things into two:
 a) Add a TODO item shared temp tables that caters for (1) and (4)
 
   ALTER TABLE name RELIABILITY 
   {DELETE ROWS AT RECOVERY | FULL RECOVERY}
 (syntax TBD)

DELETE ROWS AT RECOVERY would need to be careful or disallowed when
referenced via a foreign key to ensure the database is not restored in
an inconsistent state.

-- 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Simon Riggs

On Thu, 2005-12-29 at 09:35 -0500, Rod Taylor wrote:
  So, my thinking would be to separate things into two:
  a) Add a TODO item shared temp tables that caters for (1) and (4)
  
  ALTER TABLE name RELIABILITY 
  {DELETE ROWS AT RECOVERY | FULL RECOVERY}
  (syntax TBD)
 
 DELETE ROWS AT RECOVERY would need to be careful or disallowed when
 referenced via a foreign key to ensure the database is not restored in
 an inconsistent state.

I think we'd need to apply the same rule as we do for temp tables: they
cannot be referenced by a permanent table.

There are possibly some other restrictions also. Anyone?

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Andrew Dunstan

Bruce Momjian said:
 DROP would drop the table on a restart
 after a non-clean shutdown.  It would do _no_ logging on the table and
 allow concurrent access, plus index access.  DELETE is the same as
 DROP, but it just truncates the table (perhaps TRUNCATE is a better
 word).

 EXCLUSIVE would allow only a single session to modify the table, and
 would do all changes by appending to the table, similar to COPY LOCK.
 EXCLUSIVE would also not allow indexes because those can not be
 isolated like appending to the heap.  EXCLUSIVE would write all dirty
 shared buffers for the table and fsync them before committing.  SHARE
 is the functionality we have now, with full logging.


I an horribly scared that this will be used as a performance boost for
normal use. I would at least like to see some restrictions that make it
harder to mis-use. Perhaps restrict to superuser?

cheers

andrew





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Bruce Momjian

Andrew Dunstan wrote:
 Bruce Momjian said:
  DROP would drop the table on a restart
  after a non-clean shutdown.  It would do _no_ logging on the table and
  allow concurrent access, plus index access.  DELETE is the same as
  DROP, but it just truncates the table (perhaps TRUNCATE is a better
  word).
 
  EXCLUSIVE would allow only a single session to modify the table, and
  would do all changes by appending to the table, similar to COPY LOCK.
  EXCLUSIVE would also not allow indexes because those can not be
  isolated like appending to the heap.  EXCLUSIVE would write all dirty
  shared buffers for the table and fsync them before committing.  SHARE
  is the functionality we have now, with full logging.
 
 
 I an horribly scared that this will be used as a performance boost for
 normal use. I would at least like to see some restrictions that make it
 harder to mis-use. Perhaps restrict to superuser?

Certainly restrict to table owner.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Tom Lane

Simon Riggs [EMAIL PROTECTED] writes:
 My view would be that this thread has been complex because everybody has
 expressed a somewhat different requirement, which could be broken down
 as:
 1. The need for a multi-user-accessible yet temporary table
 2. Loading data into a table immediately after it is created (i.e. in
 same transaction), including but not limited to a reload from pg_dump
 3. How to load data quickly into an existing table (COPY)
 4. How to add/modify data quickly in an existing table (INSERT SELECT,
 UPDATE)

 I can see the need for all of those individually; my existing patch
 submission covers (2) and (3) only. I very much like your thought to
 coalesce these various requirements into a single coherent model.

However, you then seem to be arguing for still using the COPY LOCK
syntax, which I think Bruce intended would go away in favor of using
these ALTER commands.  Certainly that's what I'd prefer --- COPY has
got too darn many options already.

 In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
 pg_dump *will* work with any other version of PostgreSQL, which *would
 not* be the case if we added ALTER TABLE ... RELIABILITY statements into
 it.

Wrong --- the good thing about ALTER TABLE is that an old version of
Postgres would simply reject it and keep going.  Therefore we could get
the speedup in dumps without losing compatibility, which is not true
of COPY LOCK.

BTW, this is a perfect example of the use-case for not abandoning a
dump-file load simply because one command fails.  (We have relied on
this sort of reasoning many times before, too, for example by using
SET default_with_oids in preference to CREATE TABLE WITH/WITHOUT OIDS.)
I don't think that wrap the whole load into begin/end is really a very
workable answer, because there are far too many scenarios where you
can't do that.  Another one where it doesn't help is a data-only dump.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Tom Lane

Bruce Momjian pgman@candle.pha.pa.us writes:
 Andrew Dunstan wrote:
 I an horribly scared that this will be used as a performance boost for
 normal use. I would at least like to see some restrictions that make it
 harder to mis-use. Perhaps restrict to superuser?

 Certainly restrict to table owner.

I can see the argument for superuser-only: decisions about data
integrity tradeoffs should be reserved to the DBA, who is the one who
will get blamed if the database loses data, no matter how stupid his
users are.

But I'm not wedded to that.  I could live with table-owner.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Bruce Momjian

Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  My view would be that this thread has been complex because everybody has
  expressed a somewhat different requirement, which could be broken down
  as:
  1. The need for a multi-user-accessible yet temporary table
  2. Loading data into a table immediately after it is created (i.e. in
  same transaction), including but not limited to a reload from pg_dump
  3. How to load data quickly into an existing table (COPY)
  4. How to add/modify data quickly in an existing table (INSERT SELECT,
  UPDATE)
 
  I can see the need for all of those individually; my existing patch
  submission covers (2) and (3) only. I very much like your thought to
  coalesce these various requirements into a single coherent model.
 
 However, you then seem to be arguing for still using the COPY LOCK
 syntax, which I think Bruce intended would go away in favor of using
 these ALTER commands.  Certainly that's what I'd prefer --- COPY has
 got too darn many options already.
 
  In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
  pg_dump *will* work with any other version of PostgreSQL, which *would
  not* be the case if we added ALTER TABLE ... RELIABILITY statements into
  it.
 
 Wrong --- the good thing about ALTER TABLE is that an old version of
 Postgres would simply reject it and keep going.  Therefore we could get
 the speedup in dumps without losing compatibility, which is not true
 of COPY LOCK.
 
 BTW, this is a perfect example of the use-case for not abandoning a
 dump-file load simply because one command fails.  (We have relied on
 this sort of reasoning many times before, too, for example by using
 SET default_with_oids in preference to CREATE TABLE WITH/WITHOUT OIDS.)
 I don't think that wrap the whole load into begin/end is really a very
 workable answer, because there are far too many scenarios where you
 can't do that.  Another one where it doesn't help is a data-only dump.

Yep, Tom is echoing my reaction.  There is a temptation to add things up
onto existing commands, e.g. LOCK, and while it works, it makes for some
very complex user API's.  Having COPY behave differently because it is
in a transaction is fine as long as it is user-invisible, but once you
require users to do that to get the speedup, it isn't user-invisible
anymore.

(I can see it now, Why is pg_dump putting things in transactions?,
Because it prevents it from being logged.  Oh, should I be doing that
in my code?  Perhaps, if you want ...  You can see where that
discussion is going.  Having them see ATER TABLE ... RELIBILITY
TRUNCATE is very clear, and very clear on how it can be used in user
code.)

I think there is great utility in giving users one API, namely
RELIABILITY (or some other keyword), and telling them that is where they
control logging.  I realize adding one keyword, LOCK, to an existing
command isn't a big deal, but once you decentralize your API enough
times, you end up with a terribly complex database system.  It is this
design rigidity that helps make PostgreSQL so much easier to use than
other database systems.

I do think it is valid concern about someone use the table between the
CREATE and the ALTER TABLE RELIABILITY.  One solution would be to allow
the RELIABILITY as part of the CREATE TABLE, another is to tell users to
create the table inside a transaction.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Greg Stark

Andrew Dunstan [EMAIL PROTECTED] writes:

 Bruce Momjian said:
  DROP would drop the table on a restart
  after a non-clean shutdown.  It would do _no_ logging on the table and
  allow concurrent access, plus index access.  DELETE is the same as
  DROP, but it just truncates the table (perhaps TRUNCATE is a better
  word).
 
  EXCLUSIVE would allow only a single session to modify the table, and
  would do all changes by appending to the table, similar to COPY LOCK.
  EXCLUSIVE would also not allow indexes because those can not be
  isolated like appending to the heap.  EXCLUSIVE would write all dirty
  shared buffers for the table and fsync them before committing.  SHARE
  is the functionality we have now, with full logging.
 
 I an horribly scared that this will be used as a performance boost for
 normal use. I would at least like to see some restrictions that make it
 harder to mis-use. Perhaps restrict to superuser?

Well that's its whole purpose. At least you can hardly argue that you didn't
realize the consequences of DELETE ROWS ON RECOVERY... :)

Some thoughts:

a) I'm not sure I understand the purpose of EXCLUSIVE. When would I ever want to
   use it instead of DELETE ROWS?

b) It seems like the other feature people were talking about of not logging
   for a table created within the same transaction should be handled by
   having this flag implicitly set for any such newly created table.
   Ie, the test for whether to log would look like:

   if (!table-logged  table-xid != myxid) ...

c) Every option in ALTER TABLE should be in CREATE TABLE as well.

d) Yes as someone else mentioned, this should only be allowable on a table
   with no foreign keys referencing it. 

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-29 Thread Bruce Momjian

Greg Stark wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
 
  Bruce Momjian said:
   DROP would drop the table on a restart
   after a non-clean shutdown.  It would do _no_ logging on the table and
   allow concurrent access, plus index access.  DELETE is the same as
   DROP, but it just truncates the table (perhaps TRUNCATE is a better
   word).
  
   EXCLUSIVE would allow only a single session to modify the table, and
   would do all changes by appending to the table, similar to COPY LOCK.
   EXCLUSIVE would also not allow indexes because those can not be
   isolated like appending to the heap.  EXCLUSIVE would write all dirty
   shared buffers for the table and fsync them before committing.  SHARE
   is the functionality we have now, with full logging.
  
  I an horribly scared that this will be used as a performance boost for
  normal use. I would at least like to see some restrictions that make it
  harder to mis-use. Perhaps restrict to superuser?
 
 Well that's its whole purpose. At least you can hardly argue that you didn't
 realize the consequences of DELETE ROWS ON RECOVERY... :)

True.  I think we are worried about non-owners using it, but the owner
had to grant permissions for others to modify it, so we might be OK.

 Some thoughts:
 
 a) I'm not sure I understand the purpose of EXCLUSIVE. When would I ever want 
 to
use it instead of DELETE ROWS?

Good question.  The use case is doing COPY into a table that already had
data.  EXCLUSIVE allows additions to the table but preserves the
existing data on a crash.

 b) It seems like the other feature people were talking about of not logging
for a table created within the same transaction should be handled by
having this flag implicitly set for any such newly created table.
Ie, the test for whether to log would look like:
 
if (!table-logged  table-xid != myxid) ...

Yes, the question is whether we want to limit users to having this
optimization _only_ when they have created the table in the same
transaction, and the short answer is we don't.

 c) Every option in ALTER TABLE should be in CREATE TABLE as well.

I looked into that and see that things like:

ALTER [ COLUMN ] column SET STATISTICS integer
ALTER [ COLUMN ] column SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }

are not supported by CREATE TABLE, and probably shouldn't be because the
value can be changed after the table is created.  I think the only
things we usually support in CREATE TABLE are those that cannot be
altered.

 d) Yes as someone else mentioned, this should only be allowable on a table
with no foreign keys referencing it. 

Right, and EXCLUSIVE can not have an index either.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-28 Thread Martijn van Oosterhout

On Wed, Dec 28, 2005 at 12:47:31AM +0200, Hannu Krosing wrote:
  I've thought of one other possibility, which is kind of at the extreme
  end of system implementation. Given the suggestion about not losing a
  whole table on unclean shutdown, how about using a single table, split.

snip

 How would it work for indexes ?

Hmm, it wouldn't, so scrap that.

If any anyone ever gets indexes that work across multiple tables of an
inheritence hierarchy, you could probably make it work for this.
Otherwise... OTOH, triggers will trip you up also.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpIkidj3ZXQh.pgp
Description: PGP signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-28 Thread Bruce Momjian


Having read through this thread, I would like to propose a
syntax/behavior.

I think we all now agree that the logging is more part of the table than
the command itself.  Right now we have a COPY LOCK patch, but people are
going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
and all sorts of other things, so I think we are best adding an ALTER
TABLE capability.  I am thinking of this syntax:

ALTER TABLE name RELIABILITY option

where option is:

DROP [ TABLE ON CRASH ]
DELETE [ ROWS ON CRASH ]
EXCLUSIVE
SHARE

Let me explain each option.  DROP would drop the table on a restart
after a non-clean shutdown.  It would do _no_ logging on the table and
allow concurrent access, plus index access.  DELETE is the same as DROP,
but it just truncates the table (perhaps TRUNCATE is a better word).

EXCLUSIVE would allow only a single session to modify the table, and
would do all changes by appending to the table, similar to COPY LOCK. 
EXCLUSIVE would also not allow indexes because those can not be isolated
like appending to the heap.  EXCLUSIVE would write all dirty shared
buffers for the table and fsync them before committing.  SHARE is the
functionality we have now, with full logging.

Does this get us any closer to a TODO item?  It isn't great, but I think
it is pretty clear, and I assume pg_dump would use ALTER to load each
table.  The advanage is that the COPY statements themselves are
unchanged so they would work in loading into older versions of
PostgreSQL.

---

Martijn van Oosterhout wrote:
-- Start of PGP signed section.
 On Mon, Dec 26, 2005 at 12:03:27PM +, Simon Riggs wrote:
  I would not be against such a table-level switch, but the exact
  behaviour would need to be specified more closely before this became a
  TODO item, IMHO.
 
 Well, I think at a per table level is the only sensible level. If a
 table isn't logged, neither are the indexes. After an unclean shutdown
 the data could be anywhere between OK and rubbish, with no way of
 finding out which way.
 
  If someone has a 100 GB table, they would not appreciate the table being
  truncated if a transaction to load 1 GB of data aborts, forcing recovery
  of the 100 GB table.
 
 Ah, but wouldn't such a large table be partitioned in such a way that
 you could have the most recent partition having the loaded data.
 Personally, I think these shared temp tables have more applications
 than meet the eye. I've had systems with cache tables which could be
 wiped on boot. Though I think my preference would be to TRUNCATE rather
 than DROP on unclean shutdown.
 
 Have a nice day,
 -- 
 Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
  Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
  tool for doing 5% of the work and then sitting around waiting for someone
  else to do the other 95% so you can sue them.
-- End of PGP section, PGP failed!

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-28 Thread Joshua D. Drake


 now agree that the logging is more part of the table than

the command itself.  Right now we have a COPY LOCK patch, but people are
going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
and all sorts of other things, so I think we are best adding an ALTER
TABLE capability.  I am thinking of this syntax:

ALTER TABLE name RELIABILITY option

where option is:

DROP [ TABLE ON CRASH ]
DELETE [ ROWS ON CRASH ]
EXCLUSIVE
SHARE


I would say ON FAILURE (Crash just seems way to scary :))

Joshua D. Drake


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-28 Thread Bruce Momjian

Joshua D. Drake wrote:
   now agree that the logging is more part of the table than
  the command itself.  Right now we have a COPY LOCK patch, but people are
  going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
  and all sorts of other things, so I think we are best adding an ALTER
  TABLE capability.  I am thinking of this syntax:
  
  ALTER TABLE name RELIABILITY option
  
  where option is:
  
  DROP [ TABLE ON CRASH ]
  DELETE [ ROWS ON CRASH ]
  EXCLUSIVE
  SHARE
 
 I would say ON FAILURE (Crash just seems way to scary :))

Agreed, maybe ON RECOVERY.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-27 Thread Jim C. Nasby

On Sat, Dec 24, 2005 at 02:01:17AM -0500, Greg Stark wrote:
 But I like the other user's suggestion too. If there's a standards blessed
 feature to have temporary tables that are visible in other sessions then
 perhaps that's what we've arrived at from another direction.

Having a temporary table that is visible to all sessions would have a
lot of use besides just ETL (the E stands for Extract, btw) operations.
One example is storing session data for a webapp; most people would
happily trade off losing that data on a database restart for the
increased performance of not logging.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-27 Thread Martijn van Oosterhout

On Mon, Dec 26, 2005 at 11:00:51AM -0500, Bruce Momjian wrote:
 
 I think this brings up an interesting distinction, that having the NO
 LOGGING switch per command doesn't make sense because it modifying the
 table.  It has to be a per-object switch, or something that operates
 only on empty tables.  This is the exact same distinction we talked
 about for NO LOGGING COPY.

I've thought of one other possibility, which is kind of at the extreme
end of system implementation. Given the suggestion about not losing a
whole table on unclean shutdown, how about using a single table, split.

1. When setting no logging flag, take exclusive lock and record
filesize. This size is X.
2. From now on any data before X is read-only. So no updates. Any new
data needs to be allocated at end, so no FSM either.
3. Any data added after X is not logged to xlog.
4. On unclean shutdown, truncate table to length X.
5. When logging is reenabled, set X back to infinity.

The shared temp tables discussed earlier and basically be above with
X=0. The process of untempifying a table is setting X to infinity. Temp
tables always have X=0.

This allows people to do their ELT, because they can update rows
written after X but without the logging. If you have PITR you simply
never allow changing X of any table.

But this is pretty way out though IMO.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpa6gOlWG0sr.pgp
Description: PGP signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-27 Thread Hannu Krosing

Ühel kenal päeval, T, 2005-12-27 kell 19:20, kirjutas Martijn van
Oosterhout:
 On Mon, Dec 26, 2005 at 11:00:51AM -0500, Bruce Momjian wrote:
  
  I think this brings up an interesting distinction, that having the NO
  LOGGING switch per command doesn't make sense because it modifying the
  table.  It has to be a per-object switch, or something that operates
  only on empty tables.  This is the exact same distinction we talked
  about for NO LOGGING COPY.
 
 I've thought of one other possibility, which is kind of at the extreme
 end of system implementation. Given the suggestion about not losing a
 whole table on unclean shutdown, how about using a single table, split.
 
 1. When setting no logging flag, take exclusive lock and record
 filesize. This size is X.
 2. From now on any data before X is read-only. So no updates. Any new
 data needs to be allocated at end, so no FSM either.
 3. Any data added after X is not logged to xlog.
 4. On unclean shutdown, truncate table to length X.
 5. When logging is reenabled, set X back to infinity.

How would it work for indexes ?


Hannu


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-27 Thread Hannu Krosing

Ühel kenal päeval, R, 2005-12-23 kell 23:06, kirjutas Tom Lane:
 Greg Stark [EMAIL PROTECTED] writes:
  It seems to me the only rational way to approach this is to have a per-table
  flag that sets that table to be non-logged. Essentially changing a table's
  behaviour to that of a temporary table except that other transactions can 
  see
  it.
 
 But what's the point?  Nowhere in your scenario is there a reason why
 we need to have multiple sessions working on the data being loaded.
 So a temp table solves the problem perfectly.  (Temp tables do span
 transactions, just not sessions.)

Human operator looking at the date before loading it into final tables
seems to be exactly that usecase. After some script loads the data in
temp tables, you may want an operator to look at the data using some
visual app or some other script. It would be quite tricky to make them
use the same session, especially if different steps use different
technologies (shell script, psql, jdbc, python, ...)

 I've got a fundamental philosophical objection to proposals in this
 line, which I suppose I'd better state for the record.  I don't like
 inventing nonstandard SQL extensions or peculiar semantics just to gain
 performance. 

Out TEMP tables *already* are nonstandard. IIRC the standard temp tables
the word temp applies only for data, not the schema.

 It imposes extra complexity on users that they could do
 without; the first time someone loses critical data because he didn't
 fully understand the tradeoffs involved, he'll have a justifiable gripe.
 I also don't like playing Oracle's or MySQL's game by inventing
 proprietary syntax.  We claim to believe in standards compliance, so we
 should have a pretty high bar for inventing nonstandard syntax.  When
 the proposed change has a narrow use-case and is only going to yield
 marginal improvements even in that case, I think we should just say no.
 
 Bottom line: if we can't figure out how to do it transparently, I think
 we haven't thought hard enough.

I guess we can never make the database so smart that it can quess users
specific needs of durability on per-table basis. What we can do is
making different durability choices more obvious at syntax level.

I have been long musing about having multiple wal files/filesets,
perhaps per-tablespace, perhaps just specified on per-table basis.
This would give both possibility for added performance by using
more/different storages and also a possibility to select different
classes of durability.

making one of WAL files (strategies) be /dev/null would almost get us
non-logged writes, except for a little overhead in write() calls.
fsync() on /dev/null should be instantaneous .

--
Hannu




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-27 Thread Tom Lane

Hannu Krosing [EMAIL PROTECTED] writes:
 making one of WAL files (strategies) be /dev/null would almost get us
 non-logged writes, except for a little overhead in write() calls.
 fsync() on /dev/null should be instantaneous .

No, you really do want to push it up to a higher level than that.
The CPU and contention costs of generating a WAL record are nontrivial,
and doing so only to throw it away is pretty silly.  What's more,
pointing WAL at /dev/null would disable logging for the entire database
cluster, not just selected tables which is what people seem to be
asking for in this thread.  (No, I don't wish to deal with multiple
WAL output streams...)

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-26 Thread Simon Riggs

On Sat, 2005-12-24 at 10:32 -0500, Tom Lane wrote:

 An ALTER TABLE SET LOGGED/UNLOGGED switch might have some merit, but
 please don't muddy the waters by confusing this with temp-table
 status.

I would not be against such a table-level switch, but the exact
behaviour would need to be specified more closely before this became a
TODO item, IMHO.

If someone has a 100 GB table, they would not appreciate the table being
truncated if a transaction to load 1 GB of data aborts, forcing recovery
of the 100 GB table.

Best Regards, Simon Riggs




---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-26 Thread Martijn van Oosterhout

On Mon, Dec 26, 2005 at 12:03:27PM +, Simon Riggs wrote:
 I would not be against such a table-level switch, but the exact
 behaviour would need to be specified more closely before this became a
 TODO item, IMHO.

Well, I think at a per table level is the only sensible level. If a
table isn't logged, neither are the indexes. After an unclean shutdown
the data could be anywhere between OK and rubbish, with no way of
finding out which way.

 If someone has a 100 GB table, they would not appreciate the table being
 truncated if a transaction to load 1 GB of data aborts, forcing recovery
 of the 100 GB table.

Ah, but wouldn't such a large table be partitioned in such a way that
you could have the most recent partition having the loaded data.
Personally, I think these shared temp tables have more applications
than meet the eye. I've had systems with cache tables which could be
wiped on boot. Though I think my preference would be to TRUNCATE rather
than DROP on unclean shutdown.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgptUR80IVCyx.pgp
Description: PGP signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-26 Thread Bruce Momjian


I think this brings up an interesting distinction, that having the NO
LOGGING switch per command doesn't make sense because it modifying the
table.  It has to be a per-object switch, or something that operates
only on empty tables.  This is the exact same distinction we talked
about for NO LOGGING COPY.

---

Martijn van Oosterhout wrote:
-- Start of PGP signed section.
 On Mon, Dec 26, 2005 at 12:03:27PM +, Simon Riggs wrote:
  I would not be against such a table-level switch, but the exact
  behaviour would need to be specified more closely before this became a
  TODO item, IMHO.
 
 Well, I think at a per table level is the only sensible level. If a
 table isn't logged, neither are the indexes. After an unclean shutdown
 the data could be anywhere between OK and rubbish, with no way of
 finding out which way.
 
  If someone has a 100 GB table, they would not appreciate the table being
  truncated if a transaction to load 1 GB of data aborts, forcing recovery
  of the 100 GB table.
 
 Ah, but wouldn't such a large table be partitioned in such a way that
 you could have the most recent partition having the loaded data.
 Personally, I think these shared temp tables have more applications
 than meet the eye. I've had systems with cache tables which could be
 wiped on boot. Though I think my preference would be to TRUNCATE rather
 than DROP on unclean shutdown.
 
 Have a nice day,
 -- 
 Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
  Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
  tool for doing 5% of the work and then sitting around waiting for someone
  else to do the other 95% so you can sue them.
-- End of PGP section, PGP failed!

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-24 Thread Martijn van Oosterhout

On Fri, Dec 23, 2005 at 11:19:01PM -0500, Rod Taylor wrote:
  In many cases you could use temporary tables, but sometimes you might want
  multiple processes or multiple transactions to be able to see the data.
 
 Could always implement GLOBAL TEMP tables that have the ability to use
 these kinds of shortcuts.

Indeed, that's the thought I get while reading this thread.

TEMP tables are not xlogged

For global temp tables I'd suggest creating a schema pg_globaltemp or
some such where:

1. No Xlog, like TEMP tables
2. Use shared buffer cache like normal tables.
3. On startup after unclean shutdown, delete anything in that schema

And (AFAIK) we get extra SQL complience to boot. If you really want an
extension, how about one to un-TEMP-ify a temp table.

That would allow the following:

CREATE GLOBAL TEMP TABLE blah INHERITS myhugetable;

-- Load data, munge, etc... no xlog traffic
-- If system crashes here, nothing to recover

ALTER TABLE blah ALTER SCHEMA TO public;  -- untempify

-- If PITR, write Xlog data now
-- If no PITR, just move file and update catalog

How does this not do what people are asking for?
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpOhtfDdTXRH.pgp
Description: PGP signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-24 Thread Simon Riggs

On Fri, 2005-12-23 at 22:41 -0500, Greg Stark wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 
  Qingqing Zhou [EMAIL PROTECTED] writes:
   I revised the idea with MINIMAL XLOG (instead of WITHOUT XLOG) like the
   below. I think in this way, we can always gaurantee its correctness and
   can always improve it.
  
  I think the entire idea is a waste of time anyway.  If we have the COPY
  case covered then we have done the only case that has any practical use
  AFAICS --- you don't use UPDATE or DELETE for bulk loads, and you don't
  use retail INSERTs either if you are striving for speed.
 
 Well the industry standard approach for loading data is called ETL. I
 forget what the E stands for, but the idea is to first load the data into a
 table exactly as it appears in the input data.
 
 Then the T stands for transform. This could include just normalizing the
 input data into the correct format but it could also include looking up
 foreign key values for input strings, checking for duplicate records, etc.
 This would be a series of UPDATEs and DELETEs.
 
 I guess the L stands for load, where you insert it into the real tables.

Just for completeness: the whole data load area is typically known as
ETL now, but there are two main approaches: ETL and ELT

ETL stands for 
Extract - take the data from the source system
Transform - the process of mapping from source to target data model
Load - load the transformed data into the target system

In this mode, the only SQL used is COPY, during the L phase.

ETL is the industry standard as set by vendors who sell tools that
work that way: Informatica, Ascential etc. Another, equally viable route
is similar to your description, but is known as ELT, which is supported
by vendors such as Oracle (proprietary-only) and Sunopsis (open):

Extract - take the data from the source system
Load - load the data into the DW staging area in same form as source
Tranform - transform the data using SQL commands

In the latter ELT approach, you use COPY during the L phase, but then
you definitely do use INSERT SELECT and sometimes UPDATE commands,
rarely DELETEs during the T phase.

The ELT approach is the main basis for a *large* number of very large
Oracle, Teradata and DB2 data warehouses that have custom-built load
procedures. Check out this link for a high level explanation,
http://www.sunopsis.com/corporate/us/products/sunopsis/snps_dc.htm
but noting that Sunopsis did not invent this approach, or the name ELT.

 Up until the point where you load it into the real tables you don't really
 care about the transactional integrity of the data. If it's lost you can
 always just restart the process.

 In many cases you could use temporary tables, but sometimes you might want
 multiple processes or multiple transactions to be able to see the data.
 Consider for example a loading process that includes a human operator
 reviewing the data before the data is approved to be loaded into the final
 tables.

Agreed

 But I don't see turning on and off the WAL on a per-transaction basis to be
 useful. Every transaction in the system is affected by the WAL status of every
 other transaction working with the same tables. It doesn't serve any purpose
 to have one transaction bypassing the WAL while everyone else does WAL logging
 for the same table; they're all going to lose if the system crashes.

Yes, I really don't like that way.

 It seems to me the only rational way to approach this is to have a per-table
 flag that sets that table to be non-logged. Essentially changing a table's
 behaviour to that of a temporary table except that other transactions can see
 it. If the system crashes the table is truncated on system restore.

Often, you need to speed up the load into your largest tables. The data
you are loading often comes from a file that you can backed-up, so if
the database crashes during the load, you can always restart the load.
But that doesn't mean you want to lose the data that is already there if
you crash.

That's exactly what COPY LOCK has been designed to do. It is analagous
to Oracle's sql*loader direct path nologging mode.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-24 Thread Greg Stark


Martijn van Oosterhout kleptog@svana.org writes:

 CREATE GLOBAL TEMP TABLE blah INHERITS myhugetable;

I don't think you can have your temp table inherit from a real table. That
would make your real table have temp table semantics. Ie, records in it will
spontaneously disappear on reboot.

But you could use LIKE to define the table which I think does what you want
here.

Partitioned table support would eventually include the ability to take the
table (once you've done the untempify thing below) and move it into the
partitioned table as a partition.

 -- Load data, munge, etc... no xlog traffic
 -- If system crashes here, nothing to recover
 
 ALTER TABLE blah ALTER SCHEMA TO public;  -- untempify

Personally I don't like the idea of tying temporary table status with the
schema. Is this in the standard? It seems to me that you should be able to
create temporary tables in whatever schema you want. People sometimes use
schemas to separate different modules of their application. It's strange to
force them to share a single namespace for temporary tables.

I would think untempify should be a separate special command, not some hidden
feature in ALTER SCHEMA. It makes it less surprising when someone reads the
code. It also makes it more likely someone who needs the functionality will
find it.

ALTER TABLE blah SET NOT TEMPORARY

This also raises the question of whether it should be possible to do:

ALTER TABLE blah SET TEMPORARY

and when it should be possible. At a first cut, any table that isn't involved
in any foreign key relationships and isn't part of any inherited table
structure should be eligible. 

Come to think of it my database would benefit from this feature. I have cache
tables I rebuild nightly. They have to be visible from every session but I
don't care about losing them on a crash since they can always be rebuilt. As
it stands they inflate the daily backups, and if I were using PITR the nightly
build would flood the PITR logs with tons of useless data. Making them global
temporary tables would let me reduce the backup size, reduce PITR volume, and
speed up the nightly cache build in one stroke.

-- 
greg


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-24 Thread Tom Lane

Greg Stark [EMAIL PROTECTED] writes:
 Martijn van Oosterhout kleptog@svana.org writes:
 CREATE GLOBAL TEMP TABLE blah INHERITS myhugetable;

 I don't think you can have your temp table inherit from a real table.

Bear in mind also that this notion of a GLOBAL TEMP table has less than
nothing to do with what the standard says a GLOBAL TEMP table is.  You
can't argue for the idea by claiming it increases our standards
compliance, because it actually is moving even further away.  The spec
says that a temp table is a table that exists but starts empty in every
session, ie, data inserted by one session is not visible to any other
session.  The GLOBAL/LOCAL choice does not affect this, it actually just
governs visibility of the table within different modules (a concept we
don't have at all).

An ALTER TABLE SET LOGGED/UNLOGGED switch might have some merit, but
please don't muddy the waters by confusing this with temp-table status.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Simon Riggs

On Thu, 2005-12-22 at 17:36 -0500, Stephen Frost wrote:
 * Simon Riggs ([EMAIL PROTECTED]) wrote:
  On Thu, 2005-12-22 at 21:18 +0100, Martijn van Oosterhout wrote:
   Considering WAL bypass is code for breaks PITR
  
  No it isn't. All of the WAL bypass logic does *not* operate when PITR is
  active. The WAL bypass logic is aimed at Data Warehouses, which
  typically never operate in PITR mode for performance reasons, however
  the choice is yours.

OK, thanks for saying all of that; you probably speak for many in
raising these concerns. I'll answer each bit as we come to it. Suffice
to say, your concerns are good and so are the answers:

 Eh?  PITR mode is bad for performance?  Maybe I missed something but I
 wouldn't have thought PITR would degrade regular performance all that
 badly.  

PITR mode is *not* bad for performance. On a very heavily loaded
write-intensive test system, the general PITR overhead on regular
performance was around 1% - so almost negligible.

We have been discussing a number of optimizations to specific commands
that would allow them to avoid writing WAL and thus speed up their
performance. If archive_command is set then WAL will always be written;
if it is not set then these commands will (or could) go faster:

- CREATE TABLE AS SELECT (in 8.1)
- COPY LOCK (patch submitted)
- COPY in same transaction as CREATE TABLE (patch submitted)
- INSERT SELECT in same transaction as CREATE TABLE (this discussion)

(There are a number of other conditions also, such as there must be no
indexes on a table. All of which now documented with the patch)

 So long as it doesn't take 15 minutes or some such to move the
 WAL to somewhere else (and I'm not sure that'd even slow things down..).
 For a Data Warehouse, have you got a better way of doing backups such
 that you don't lose at minimum most of a day's work?  

Yes. Don't just use the backup facilities on their own. Think about how
the architecture of your systems will work and see if there is a better
way when you look at very large systems.

 I'm not exactly a
 big fan do doing a pg_dump every night either given that the database is
 360GB.  Much nicer to take a weekly dump of the database and then do
 PITR for a week or two before taking another dump of the db.

e.g. Keep your reference data (low volume) in an Operational Data Store
(ODS) database, protected by archiving. Keep your main fact data (high
volume) in the Data Warehouse, but save the data in slices as you load
it, so that a recovery is simply a reload of the database: no PITR or
pg_dump required, so high performance data transformation and load work
is possible. This is a commonly used architectural design pattern.

 I like the idea of making COPY go faster, but please don't break my
 backup system while you're at it.  

On a personal note, I would only add that I spent a long time working on
PITR and I would never design anything that would intentionally break it
(nor would patches be accepted that did that). That probably gives me
the confidence to approach designs that might look like I'm doing that,
but without actually straying over the edge.

 I'm honestly kind of nervous about
 what you mean by checking it PITR is active- how is that done, exactly?
 Check if you have a script set to rotate the logs elsewhere?  Or is it
 checking if you're in the taking-a-full-database-backup stage?  Or what?

Internally, we use XLogArchivingActive(). Externally this will be set
when the admin sets archive_command to a particular value.

My original preference was for a parameter called archive_mode= ON | OFF
which would allow us to more easily discuss this, but this does not
currently exist.

 What's the performance decrease when using PITR, and what's it from?  Is
 it just that COPY isn't as fast?  Honestly, I could live with COPY being
 not as fast as it could be if my backups work. :)

These commands will not be optimized for speed when archive_command is set:
- CREATE TABLE AS SELECT (in 8.1)
- COPY LOCK (patch submitted)

 Sorry for sounding concerned but, well, backups are very important and
 so is performance and I'm afraid either I've not read all the
 documentation about the issues being discussed here or there isn't
 enough out there to make sense of it all yet. :)

If you choose PITR, then you are safe. If you do not, the crash recovery
of the database is not endangered by these optimizations.

Hope that covers all of your concerns?

I'm just writing a course that explains many of these techniques,
available in the New Year.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Qingqing Zhou

Torn pages (partial page write) are still a problem.

I revised the idea with MINIMAL XLOG (instead of WITHOUT XLOG) like the
below. I think in this way, we can always gaurantee its correctness and
can always improve it.

To Use It
--
A BEGIN TRANSACTION MINIMAL XLOG/END block is a speicial BEGIN/END
transaction block. It tries to avoid unnessary xlogs but still perserves
transaction semantics. It is good for the situation that the user wants to
do a big data load. It is issued like this:

1. BEGIN TRANSACTION MINIMAL XLOG
2. ... /* statements */
3. END;

From user's view, it is almost the same as ordinary transaction: if
everything run smoothly from step 1 to 3, the transaction will be made
durable. If any step failed (including ABORT, transaction error, system
crash), it looks like nothing happened. To make life easier, no
subtransactions is allowed.

To Implement It

At step 1, we will disallow some operations, including vacuum, PITR.
At step 2, only minimal xlog entries are logged. If anything inside
failed, handle it like ordinary transaction.
At step 3, we issue a checkpoint, then mark the transaction commited. If
step 8 itself failed, handle it like ordinary transaction.

The correctness is easy: if we let minimal xlog equal to all xlog,
then it is exactly the same as an ordinary transaction plus a checkpoint
inside the transaction block.

Based on the above proof, we can have the following implementation steps:
1. We first make the framework without revising any XLogInsert() - thus
the implementation is correct;
2. Examine each XLogInsert() and differenciate the content under MINIAML
XLOG is set or not.

The importance of the above steps is that it implies that there is no need
to completely pick up what are the MINIAL XLOG content are, we can do them
gradually in a non-invasive way.

Minimal Xlog
-
The xlog of failed transaction is not totally useless since later
transaction may reply on something it creates - for example, a new page
and its links of a btree. We have to pick up these xlogs.

RM_HEAP_ID: The problem of heap is torn page prevention. We currently copy
the whole page into xlog if it is the first time touched after a
checkpoint. So we can always have this copy to replace the data file page
which might be torn written. I didn't come up with any good method to
handle it so far, so we keep this. (We can possibly avoid copy a P_NEW
page, that's another story though). So what we can avoid xlog at least
include the insert/update/delete happened on a page that's no need to be
copied, which will give us a 50% xlog volumn/contention reduction I think.

RM_BTREE_ID/RM_HASH_ID/RM_GIST_ID: For index, things get more complex. We
need the xlogs to maintain the structure of the btree index, like the
pointers, high key etc, but the content is not necessarily needed. Need
more research here.

RM_XLOG_ID/RM_XACT_ID/RM_SMGR_ID/RM_CLOG_ID/RM_DBASE_ID/RM_TBLSPC_ID/RM_MULTIXACT_ID/RM_SEQ_ID:
It is hard to avoid much here, but they are not the important volume
contribution of xlogs.

Regards,
Qingqing

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Tom Lane

Qingqing Zhou [EMAIL PROTECTED] writes:
 I revised the idea with MINIMAL XLOG (instead of WITHOUT XLOG) like the
 below. I think in this way, we can always gaurantee its correctness and
 can always improve it.

I think the entire idea is a waste of time anyway.  If we have the COPY
case covered then we have done the only case that has any practical use
AFAICS --- you don't use UPDATE or DELETE for bulk loads, and you don't
use retail INSERTs either if you are striving for speed.

I don't want to see us mucking with the WAL logic without a *whole* lot
better rationale than has been presented so far.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Qingqing Zhou



On Fri, 23 Dec 2005, Tom Lane wrote:
 Qingqing Zhou [EMAIL PROTECTED] writes:
  I revised the idea with MINIMAL XLOG (instead of WITHOUT XLOG) like the
  below. I think in this way, we can always gaurantee its correctness and
  can always improve it.

 I think the entire idea is a waste of time anyway.  If we have the COPY
 case covered then we have done the only case that has any practical use
 AFAICS --- you don't use UPDATE or DELETE for bulk loads, and you don't
 use retail INSERTs either if you are striving for speed.


There are several posts on the list asking about NOLOGGING option
presented in Oracle. User may need it to do bulk updates against the
database. I don't think we plan to support it since it does not gaurantee
transaction semantics. But MINIMAL XLOG is something that we are afford to
do and not invasive change needed AFAICS.

Regards,
Qingqing

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Greg Stark


Tom Lane [EMAIL PROTECTED] writes:

 Qingqing Zhou [EMAIL PROTECTED] writes:
  I revised the idea with MINIMAL XLOG (instead of WITHOUT XLOG) like the
  below. I think in this way, we can always gaurantee its correctness and
  can always improve it.
 
 I think the entire idea is a waste of time anyway.  If we have the COPY
 case covered then we have done the only case that has any practical use
 AFAICS --- you don't use UPDATE or DELETE for bulk loads, and you don't
 use retail INSERTs either if you are striving for speed.

Well the industry standard approach for loading data is called ETL. I
forget what the E stands for, but the idea is to first load the data into a
table exactly as it appears in the input data.

Then the T stands for transform. This could include just normalizing the
input data into the correct format but it could also include looking up
foreign key values for input strings, checking for duplicate records, etc.
This would be a series of UPDATEs and DELETEs.

I guess the L stands for load, where you insert it into the real tables.

Up until the point where you load it into the real tables you don't really
care about the transactional integrity of the data. If it's lost you can
always just restart the process.

In many cases you could use temporary tables, but sometimes you might want
multiple processes or multiple transactions to be able to see the data.
Consider for example a loading process that includes a human operator
reviewing the data before the data is approved to be loaded into the final
tables.

But I don't see turning on and off the WAL on a per-transaction basis to be
useful. Every transaction in the system is affected by the WAL status of every
other transaction working with the same tables. It doesn't serve any purpose
to have one transaction bypassing the WAL while everyone else does WAL logging
for the same table; they're all going to lose if the system crashes.

It seems to me the only rational way to approach this is to have a per-table
flag that sets that table to be non-logged. Essentially changing a table's
behaviour to that of a temporary table except that other transactions can see
it. If the system crashes the table is truncated on system restore.

The only problem I have with this is that it smells too much like MySQL MyISAM
tables...

-- 
greg


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Qingqing Zhou


Greg Stark [EMAIL PROTECTED] wrote

 But I don't see turning on and off the WAL on a per-transaction basis to 
 be
 useful. Every transaction in the system is affected by the WAL status of 
 every
 other transaction working with the same tables. It doesn't serve any 
 purpose
 to have one transaction bypassing the WAL while everyone else does WAL 
 logging
 for the same table; they're all going to lose if the system crashes.

Sure, so a minimal amount xlog is required. And to make finished transaction 
durable, issue a checkpoint.

 It seems to me the only rational way to approach this is to have a 
 per-table
 flag that sets that table to be non-logged. Essentially changing a table's
 behaviour to that of a temporary table except that other transactions can 
 see
 it. If the system crashes the table is truncated on system restore.

 The only problem I have with this is that it smells too much like MySQL 
 MyISAM
 tables...

Table are related, so table A references table B. So set a per-table flag is 
hard to use or doesn't work.

Regards,
Qingqing



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Tom Lane

Greg Stark [EMAIL PROTECTED] writes:
 It seems to me the only rational way to approach this is to have a per-table
 flag that sets that table to be non-logged. Essentially changing a table's
 behaviour to that of a temporary table except that other transactions can see
 it.

But what's the point?  Nowhere in your scenario is there a reason why
we need to have multiple sessions working on the data being loaded.
So a temp table solves the problem perfectly.  (Temp tables do span
transactions, just not sessions.)

I've got a fundamental philosophical objection to proposals in this
line, which I suppose I'd better state for the record.  I don't like
inventing nonstandard SQL extensions or peculiar semantics just to gain
performance.  It imposes extra complexity on users that they could do
without; the first time someone loses critical data because he didn't
fully understand the tradeoffs involved, he'll have a justifiable gripe.
I also don't like playing Oracle's or MySQL's game by inventing
proprietary syntax.  We claim to believe in standards compliance, so we
should have a pretty high bar for inventing nonstandard syntax.  When
the proposed change has a narrow use-case and is only going to yield
marginal improvements even in that case, I think we should just say no.

Bottom line: if we can't figure out how to do it transparently, I think
we haven't thought hard enough.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Rod Taylor

 In many cases you could use temporary tables, but sometimes you might want
 multiple processes or multiple transactions to be able to see the data.

Could always implement GLOBAL TEMP tables that have the ability to use
these kinds of shortcuts.
-- 


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-23 Thread Greg Stark


Tom Lane [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
  It seems to me the only rational way to approach this is to have a per-table
  flag that sets that table to be non-logged. Essentially changing a table's
  behaviour to that of a temporary table except that other transactions can 
  see
  it.
 
 But what's the point?  Nowhere in your scenario is there a reason why
 we need to have multiple sessions working on the data being loaded.
 So a temp table solves the problem perfectly.  (Temp tables do span
 transactions, just not sessions.)

Well I gave one example of a use case where you want to span sessions. To be
more explicit, if you have some batch job loading data that you then want to
have a human review via some other path such as a web page before continuing
with the process.

There's nothing saying all the data is coming from the same source at all. You
could have some of the data being loaded by a psql COPY command and then other
data being loaded by a perl script. Being tied to having the two executed from
the same session is an arbitrary limit on the architecture.

 I've got a fundamental philosophical objection to proposals in this
 line, which I suppose I'd better state for the record.  I don't like
 inventing nonstandard SQL extensions or peculiar semantics just to gain
 performance.  It imposes extra complexity on users that they could do
 without; the first time someone loses critical data because he didn't
 fully understand the tradeoffs involved, he'll have a justifiable gripe.
 I also don't like playing Oracle's or MySQL's game by inventing
 proprietary syntax.  We claim to believe in standards compliance, so we
 should have a pretty high bar for inventing nonstandard syntax.  When
 the proposed change has a narrow use-case and is only going to yield
 marginal improvements even in that case, I think we should just say no.
 
 Bottom line: if we can't figure out how to do it transparently, I think
 we haven't thought hard enough.

I don't think 'alter table set logged off' complicates users' lives too much.

But I like the other user's suggestion too. If there's a standards blessed
feature to have temporary tables that are visible in other sessions then
perhaps that's what we've arrived at from another direction.

This means the final load into production data would still have to be logged,
so this wouldn't get everything the original posters were looking for. But
getting around that does mean risking someone's production data at some point.

-- 
greg


---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Jim C. Nasby

On Thu, Dec 22, 2005 at 02:31:33PM +, Simon Riggs wrote:
 Having just optimized COPY to avoid writing WAL during the transaction
 in which a table was first created, it seems worth considering whether
 this should occur for INSERT, UPDATE and DELETE also.
 
 It is fairly common to do data transformation using INSERT SELECTs and
 UPDATEs. This is usually done with temporary tables however. (DELETE
 would most efficiently be handled as an additional NOT clause on the
 insert, so it is uncommonly used in this circumstance.)
 
 However, CREATE TABLE AS SELECT (CTAS) does not allow inheritance, so a
 new permanent partition has to be created using CREATE TABLE, followed
 by an INSERT SELECT or COPY.
 
 Is that sufficient reason to optimise INSERT SELECT and UPDATE also? Or
 should I not bother? Or should I try to teach CTAS to use inheritance
 (which sounds harder and has a few gotchas).
 
 Currently, CTAS optimization requires a heap_sync during ExecEndPlan. It
 would be easy enough to extend this so that it also works for INSERT,
 UPDATE and DELETE.

Well, both UPDATE and DELETE seem like pretty odd use cases to me;
typically I'd do any needed data manipulation during the INSERT SELECT.
But it doesn't make sense to me to set this up for INSERT and ignore
UPDATE and DELETE.

I do think this needs to be something that is made either completely
transparent or must be specifically enabled. As described, I believe
this would break PITR, so users should have to specifically request that
behavior (and they should probably get a WARNING message, too).

Though, what's interesting is that theoretically it should be possible
to do this and still protect PITR, by logging the statements (but not
the actual data) to WAL. This isn't very practical with WAL (though it
might be worth looking at storing a compressed version of what's being
fed in to COPY), but in this case if we end up in a recovery situation
the data that the insert is pulling from should exist in the database in
the same state, so it should be possible to re-create the table. There's
still an issue of how to handle the pages from the new table that will
end up in WAL on subsiquent transactions, since presumably they might be
identical, but someone might be able to come up with a clever solution
for that. In the meantime, breaking WAL recovery needs to be something
that users must specifically request, via something like UPDATE NOWAL.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Jim C. Nasby

BTW, this should also probably be moved over to -hackers...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Martijn van Oosterhout

On Thu, Dec 22, 2005 at 12:37:51PM -0600, Jim C. Nasby wrote:
 I do think this needs to be something that is made either completely
 transparent or must be specifically enabled. As described, I believe
 this would break PITR, so users should have to specifically request that
 behavior (and they should probably get a WARNING message, too).

This reminds me of a friend who used MSSQL that had replication going
that broke every time you did a certain statement. It may have been
SELECT INTO [1]. His main problem was that the replication would
stop working silently. We need to be waving red flags if we broke
someone's backup procedure.

Considering WAL bypass is code for breaks PITR, I think we really
need to make sure people realise that running such a command breaks
their backups/replication/whatever people are doing. 

[1] 
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/trblsql/tr_reslsyserr_1_1r94.asp

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp19qVFCa9Gc.pgp
Description: PGP signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Simon Riggs

On Thu, 2005-12-22 at 21:18 +0100, Martijn van Oosterhout wrote:
 Considering WAL bypass is code for breaks PITR

No it isn't. All of the WAL bypass logic does *not* operate when PITR is
active. The WAL bypass logic is aimed at Data Warehouses, which
typically never operate in PITR mode for performance reasons, however
the choice is yours.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Stephen Frost

* Simon Riggs ([EMAIL PROTECTED]) wrote:
 On Thu, 2005-12-22 at 21:18 +0100, Martijn van Oosterhout wrote:
  Considering WAL bypass is code for breaks PITR
 
 No it isn't. All of the WAL bypass logic does *not* operate when PITR is
 active. The WAL bypass logic is aimed at Data Warehouses, which
 typically never operate in PITR mode for performance reasons, however
 the choice is yours.

Eh?  PITR mode is bad for performance?  Maybe I missed something but I
wouldn't have thought PITR would degrade regular performance all that
badly.  So long as it doesn't take 15 minutes or some such to move the
WAL to somewhere else (and I'm not sure that'd even slow things down..).
For a Data Warehouse, have you got a better way of doing backups such
that you don't lose at minimum most of a day's work?  I'm not exactly a
big fan do doing a pg_dump every night either given that the database is
360GB.  Much nicer to take a weekly dump of the database and then do
PITR for a week or two before taking another dump of the db.

I like the idea of making COPY go faster, but please don't break my
backup system while you're at it.  I'm honestly kind of nervous about
what you mean by checking it PITR is active- how is that done, exactly?
Check if you have a script set to rotate the logs elsewhere?  Or is it
checking if you're in the taking-a-full-database-backup stage?  Or what?
What's the performance decrease when using PITR, and what's it from?  Is
it just that COPY isn't as fast?  Honestly, I could live with COPY being
not as fast as it could be if my backups work. :)

Sorry for sounding concerned but, well, backups are very important and
so is performance and I'm afraid either I've not read all the
documentation about the issues being discussed here or there isn't
enough out there to make sense of it all yet. :)

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Qingqing Zhou


Simon Riggs [EMAIL PROTECTED] wrote

 No it isn't. All of the WAL bypass logic does *not* operate when PITR is
 active. The WAL bypass logic is aimed at Data Warehouses, which
 typically never operate in PITR mode for performance reasons, however
 the choice is yours.


To make things, is it possible to add a GUC to let user disable *all* the 
xlogs?

Regards,
Qingqing 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Qingqing Zhou


Qingqing Zhou [EMAIL PROTECTED] wrote


 To make things, is it possible to add a GUC to let user disable *all* the 
 xlogs?


It may work like this:

BEGIN TRANSACTION WITHOUT XLOG;/* forbidden vacuum, PITR etc */
BEGIN
.../* no xlog during this peroid */
END;   /* don't mark this transaction committed */
BEGIN
...
END;
END TRANSACTION DO COMMIT;/* at this time issue checkpiont  mark all 
transactions committed */

So during this peroid, if any transaction failed, the only consequence is 
add invisible garbage data. When everything is going well, then END 
TRANSACTION DO COMMIT will mark these transaction permanate. Also, seems 
there is no problem even with XLOG_NO_TRAN updates.

Regards,
Qingqing



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Tom Lane

Qingqing Zhou [EMAIL PROTECTED] writes:
 BEGIN TRANSACTION WITHOUT XLOG;/* forbidden vacuum, PITR etc */

 So during this peroid, if any transaction failed, the only consequence is 
 add invisible garbage data.

No, the likely consequence is irretrievable corruption of any table or
index page touched by the transaction.  You're going to have a very hard
time selling this as a good idea.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Qingqing Zhou


Tom Lane [EMAIL PROTECTED] wrote
 Qingqing Zhou [EMAIL PROTECTED] writes:
 BEGIN TRANSACTION WITHOUT XLOG;/* forbidden vacuum, PITR etc */

 So during this peroid, if any transaction failed, the only consequence is
 add invisible garbage data.

 No, the likely consequence is irretrievable corruption of any table or
 index page touched by the transaction.


I guess I know (at least part) of what you mean. This is because we rely on 
replay all the xlog no matter it belongs to a committed transaction or not. 
Why? Because a failed transaction is not totally useless since later 
transaction may reply on some physical thing it creates - for example, a new 
page and its links of a btree. So for heap, there is(95% sure) no such 
problem. Our heap redo algorithm can automatically add empty pages to a 
heap. For index, there are problems, but I suspect they are solvable by not 
bypassing these records ... if this is not totally-nonstarter, I will 
investigate details of how to do it.

Regards,
Qingqing 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Tom Lane

Qingqing Zhou [EMAIL PROTECTED] writes:
 I guess I know (at least part) of what you mean. This is because we rely on 
 replay all the xlog no matter it belongs to a committed transaction or not. 
 Why? Because a failed transaction is not totally useless since later 
 transaction may reply on some physical thing it creates - for example, a new 
 page and its links of a btree. So for heap, there is(95% sure) no such 
 problem.

Torn pages (partial page write) are still a problem.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

99 matches

Mail list logo