Re: [HACKERS] BufferAccessStrategy for bulk insert
Robert Haas [EMAIL PROTECTED] writes:
> OK, here's an updated version...

Applied with some small stylistic revisions.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BufferAccessStrategy for bulk insert
OK, here's an updated version...

1. Use IncrBufferRefCount() so that we can do unconditional ReleaseBuffers
elsewhere.  I'm not sure this is really any simpler, and although
IncrBufferRefCount() is pretty cheap, it's certainly not as cheap as a NULL
pointer test.

2. Consolidate a bunch of logic into a new function RelationReadBuffer.
This simplifies the logic in RelationGetBufferForTuple() considerably.

3. Make RelationGetBufferForTuple ignore relation->rd_targblock in favor of
bistate->last_pin whenever possible.  Changing this to also not bother
setting relation->rd_targblock didn't seem worthwhile, so I didn't.

...Robert

On Tue, Nov 4, 2008 at 4:18 PM, Tom Lane [EMAIL PROTECTED] wrote:
> Robert Haas [EMAIL PROTECTED] writes:
>>> 2. The logic changes in RelationGetBufferForTuple seem bizarre and
>>> overcomplicated.  ISTM that the buffer saved by the bistate ought to be
>>> about equivalent to relation->rd_targblock, ie, it's your first trial
>>> location and also a place to save the located buffer on the way out.
>>> I'd suggest tossing that part of the patch and starting over.
>> Hmm, would that be safe in the presence of concurrent or recursive bulk
>> inserts into the same relation?
>
> As safe as it is now --- you're relying on the bistate to carry the
> query-local state.  Probably the best design is to just ignore
> rd_targblock when a bistate is provided, and use the bistate's buffer
> instead.
			regards, tom lane

Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.268
diff -c -r1.268 heapam.c
*** src/backend/access/heap/heapam.c	31 Oct 2008 19:40:26 -0000	1.268
--- src/backend/access/heap/heapam.c	6 Nov 2008 03:25:19 -0000
***************
*** 1798,1803 ****
--- 1798,1827 ----
  	}
  }

+ /*
+  * GetBulkInsertState - set up for a bulk insert
+  */
+ BulkInsertState
+ GetBulkInsertState(void)
+ {
+ 	BulkInsertState bistate;
+
+ 	bistate = palloc(sizeof(struct BulkInsertStateData));
+ 	bistate->strategy = GetAccessStrategy(BAS_BULKWRITE);
+ 	bistate->last_pin = InvalidBuffer;
+ 	return bistate;
+ }
+
+ /*
+  * FreeBulkInsertState - clean up after finishing a bulk insert
+  */
+ void
+ FreeBulkInsertState(BulkInsertState bistate)
+ {
+ 	if (bistate->last_pin != InvalidBuffer)
+ 		ReleaseBuffer(bistate->last_pin);
+ 	FreeAccessStrategy(bistate->strategy);
+ }

 /*
  * heap_insert	- insert tuple into a heap
***************
*** 1805,1821 ****
   * The new tuple is stamped with current transaction ID and the specified
   * command ID.
   *
!  * If use_wal is false, the new tuple is not logged in WAL, even for a
!  * non-temp relation.  Safe usage of this behavior requires that we arrange
!  * that all new tuples go into new pages not containing any tuples from other
!  * transactions, and that the relation gets fsync'd before commit.
   * (See also heap_sync() comments)
   *
!  * use_fsm is passed directly to RelationGetBufferForTuple, which see for
!  * more info.
   *
!  * Note that use_wal and use_fsm will be applied when inserting into the
!  * heap's TOAST table, too, if the tuple requires any out-of-line data.
   *
   * The return value is the OID assigned to the tuple (either here or by the
   * caller), or InvalidOid if no OID.  The header fields of *tup are updated
--- 1829,1846 ----
   * The new tuple is stamped with current transaction ID and the specified
   * command ID.
   *
!  * If the HEAP_INSERT_SKIP_WAL option is supplied, the new tuple is not logged
!  * in WAL, even for a non-temp relation.  Safe usage of this behavior requires
!  * that we arrange that all new tuples go into new pages not containing any
!  * tuples from other transactions, and that the relation gets fsync'd before
!  * commit.
   * (See also heap_sync() comments)
   *
!  * The HEAP_INSERT_SKIP_FSM option is passed directly to
!  * RelationGetBufferForTuple, which see for more info.
   *
!  * Note that options will be applied when inserting into the heap's TOAST
!  * table, too, if the tuple requires any out-of-line data.
   *
   * The return value is the OID assigned to the tuple (either here or by the
   * caller), or InvalidOid if no OID.  The header fields of *tup are updated
***************
*** 1825,1831 ****
   */
  Oid
  heap_insert(Relation relation, HeapTuple tup, CommandId cid,
! 			bool use_wal, bool use_fsm)
  {
  	TransactionId xid = GetCurrentTransactionId();
  	HeapTuple	heaptup;
--- 1850,1856 ----
   */
  Oid
  heap_insert(Relation relation, HeapTuple tup, CommandId cid,
! 			int options, BulkInsertState bistate)
  {
  	TransactionId xid = GetCurrentTransactionId();
  	HeapTuple	heaptup;
***************
*** 1877,1890 ****
  		heaptup = tup;
  	}
  	else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
! 		heaptup = toast_insert_or_update(relation, tup, NULL,
! 										 use_wal, use_fsm);
  	else
  		heaptup = tup;

  	/* Find buffer
Re: [HACKERS] BufferAccessStrategy for bulk insert
Robert Haas [EMAIL PROTECTED] writes:
> Patch resnapped to HEAD, with straightforward adjustments to compensate
> for Heikki's changes to the ReadBuffer interface.  See attached.

I looked this over a bit.  A couple of suggestions:

1. You could probably simplify life a bit by treating the BulkInsertState
as having an *extra* pin on the buffer, ie, do IncrBufferRefCount when
saving a buffer reference in BulkInsertState and ReleaseBuffer when
removing one.  Changing a buffer's local pin count from 1 to 2 or back
again is quite cheap, so you wouldn't need to special-case things to avoid
the existing pin and release operations.  For instance this diff hunk goes
away:

***************
*** 1963,1969 ****
  	END_CRIT_SECTION();

! 	UnlockReleaseBuffer(buffer);

  	/*
  	 * If tuple is cachable, mark it for invalidation from the caches in case
--- 1987,1996 ----
  	END_CRIT_SECTION();

! 	/* Release the lock, but keep the buffer pinned if doing bulk insert. */
! 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
! 	if (!bistate)
! 		ReleaseBuffer(buffer);

  	/*
  	 * If tuple is cachable, mark it for invalidation from the caches in case

2. The logic changes in RelationGetBufferForTuple seem bizarre and
overcomplicated.  ISTM that the buffer saved by the bistate ought to be
about equivalent to relation->rd_targblock, ie, it's your first trial
location and also a place to save the located buffer on the way out.
I'd suggest tossing that part of the patch and starting over.

			regards, tom lane
Re: [HACKERS] BufferAccessStrategy for bulk insert
> 2. The logic changes in RelationGetBufferForTuple seem bizarre and
> overcomplicated.  ISTM that the buffer saved by the bistate ought to be
> about equivalent to relation->rd_targblock, ie, it's your first trial
> location and also a place to save the located buffer on the way out.
> I'd suggest tossing that part of the patch and starting over.

Hmm, would that be safe in the presence of concurrent or recursive bulk
inserts into the same relation?

...Robert
Re: [HACKERS] BufferAccessStrategy for bulk insert
Robert Haas [EMAIL PROTECTED] writes:
>> 2. The logic changes in RelationGetBufferForTuple seem bizarre and
>> overcomplicated.  ISTM that the buffer saved by the bistate ought to be
>> about equivalent to relation->rd_targblock, ie, it's your first trial
>> location and also a place to save the located buffer on the way out.
>> I'd suggest tossing that part of the patch and starting over.
> Hmm, would that be safe in the presence of concurrent or recursive bulk
> inserts into the same relation?

As safe as it is now --- you're relying on the bistate to carry the
query-local state.  Probably the best design is to just ignore
rd_targblock when a bistate is provided, and use the bistate's buffer
instead.

			regards, tom lane
Re: [HACKERS] BufferAccessStrategy for bulk insert
> Can you test whether using the buffer access strategy is a win or a
> loss?  Most of that gain is probably coming from the reduction in
> pinning.

Patch resnapped to HEAD, with straightforward adjustments to compensate
for Heikki's changes to the ReadBuffer interface.  See attached.

New testing results, now with and without BAS:

--TRUNK--
Time: 17945.523 ms
Time: 18682.172 ms
Time: 17047.841 ms
Time: 16344.442 ms
Time: 18727.417 ms

--PATCHED--
Time: 13323.772 ms
Time: 13869.724 ms
Time: 14043.666 ms
Time: 13934.132 ms
Time: 13193.702 ms

--PATCHED with BAS disabled--
Time: 14460.432 ms
Time: 14745.206 ms
Time: 14345.973 ms
Time: 14601.448 ms
Time: 16535.167 ms

I'm not sure why the BAS seemed to be slowing things down before.  Maybe
it's different if we're copying into a pre-existing table, so that WAL is
enabled?  Or it could have just been a fluke - the numbers were close.
I'll try to run some additional tests if time permits.

...Robert

Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.268
diff -c -r1.268 heapam.c
*** src/backend/access/heap/heapam.c	31 Oct 2008 19:40:26 -0000	1.268
--- src/backend/access/heap/heapam.c	1 Nov 2008 17:17:16 -0000
***************
*** 1798,1803 ****
--- 1798,1827 ----
  	}
  }

+ /*
+  * GetBulkInsertState - set up for a bulk insert
+  */
+ BulkInsertState
+ GetBulkInsertState(void)
+ {
+ 	BulkInsertState bistate;
+
+ 	bistate = palloc(sizeof(struct BulkInsertStateData));
+ 	bistate->strategy = GetAccessStrategy(BAS_BULKWRITE);
+ 	bistate->last_pin = InvalidBuffer;
+ 	return bistate;
+ }
+
+ /*
+  * FreeBulkInsertState - clean up after finishing a bulk insert
+  */
+ void
+ FreeBulkInsertState(BulkInsertState bistate)
+ {
+ 	if (bistate->last_pin != InvalidBuffer)
+ 		ReleaseBuffer(bistate->last_pin);
+ 	FreeAccessStrategy(bistate->strategy);
+ }

 /*
  * heap_insert	- insert tuple into a heap
***************
*** 1805,1821 ****
   * The new tuple is stamped with current transaction ID and the specified
   * command ID.
   *
!  * If use_wal is false, the new tuple is not logged in WAL, even for a
!  * non-temp relation.  Safe usage of this behavior requires that we arrange
!  * that all new tuples go into new pages not containing any tuples from other
!  * transactions, and that the relation gets fsync'd before commit.
   * (See also heap_sync() comments)
   *
!  * use_fsm is passed directly to RelationGetBufferForTuple, which see for
!  * more info.
   *
!  * Note that use_wal and use_fsm will be applied when inserting into the
!  * heap's TOAST table, too, if the tuple requires any out-of-line data.
   *
   * The return value is the OID assigned to the tuple (either here or by the
   * caller), or InvalidOid if no OID.  The header fields of *tup are updated
--- 1829,1846 ----
   * The new tuple is stamped with current transaction ID and the specified
   * command ID.
   *
!  * If the HEAP_INSERT_SKIP_WAL option is supplied, the new tuple is not logged
!  * in WAL, even for a non-temp relation.  Safe usage of this behavior requires
!  * that we arrange that all new tuples go into new pages not containing any
!  * tuples from other transactions, and that the relation gets fsync'd before
!  * commit.
   * (See also heap_sync() comments)
   *
!  * The HEAP_INSERT_SKIP_FSM option is passed directly to
!  * RelationGetBufferForTuple, which see for more info.
   *
!  * Note that options will be applied when inserting into the heap's TOAST
!  * table, too, if the tuple requires any out-of-line data.
   *
   * The return value is the OID assigned to the tuple (either here or by the
   * caller), or InvalidOid if no OID.  The header fields of *tup are updated
***************
*** 1825,1831 ****
   */
  Oid
  heap_insert(Relation relation, HeapTuple tup, CommandId cid,
! 			bool use_wal, bool use_fsm)
  {
  	TransactionId xid = GetCurrentTransactionId();
  	HeapTuple	heaptup;
--- 1850,1856 ----
   */
  Oid
  heap_insert(Relation relation, HeapTuple tup, CommandId cid,
! 			int options, BulkInsertState bistate)
  {
  	TransactionId xid = GetCurrentTransactionId();
  	HeapTuple	heaptup;
***************
*** 1877,1890 ****
  		heaptup = tup;
  	}
  	else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
! 		heaptup = toast_insert_or_update(relation, tup, NULL,
! 										 use_wal, use_fsm);
  	else
  		heaptup = tup;

  	/* Find buffer to insert this tuple into */
  	buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
! 									   InvalidBuffer, use_fsm);

  	/* NO EREPORT(ERROR) from here till changes are logged */
  	START_CRIT_SECTION();
--- 1902,1914 ----
  		heaptup = tup;
  	}
  	else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
! 		heaptup = toast_insert_or_update(relation, tup, NULL, options);
  	else
  		heaptup = tup;

  	/* Find buffer to insert this tuple into */
Re: [HACKERS] BufferAccessStrategy for bulk insert
On Sat, 2008-11-01 at 13:23 -0400, Robert Haas wrote:
>> Can you test whether using the buffer access strategy is a win or a
>> loss?  Most of that gain is probably coming from the reduction in
>> pinning.
> --PATCHED--
> Time: 13869.724 ms (median)
> --PATCHED with BAS disabled--
> Time: 14460.432 ms (median with outlier removed)

That seems a conclusive argument in favour: a small additional performance
gain, plus generally beneficial behaviour for concurrent loads.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] BufferAccessStrategy for bulk insert
On Thu, 2008-10-30 at 23:05 -0400, Robert Haas wrote:
>> Whatever timings you have are worth publishing.
> Here are the timings for copying the first ten million integers into a
> one-column table created in the same transaction, with and without the
> patch.  As you can see, now that I've corrected my previous error of not
> putting CREATE TABLE and COPY in the same transaction, the savings are
> quite substantial, about 15%.

Nice!  I had faith. ;-)

Can you test whether using the buffer access strategy is a win or a loss?
Most of that gain is probably coming from the reduction in pinning.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] BufferAccessStrategy for bulk insert
On Thu, 2008-10-30 at 22:46 -0400, Robert Haas wrote:
>> You should try profiling the patch.  You can count the invocations of
>> the buffer access routines to check it's all working in the right
>> ratios.
> *goes and learns how to profile PostgreSQL*
> OK, that was a good suggestion.  It looks like part of my problem here
> is that I didn't put the CREATE TABLE and the COPY into the same
> transaction.  As a result, a lot of time was spent on XLogInsert.
> Modified the test case, new profiling results attached.

The CPU time in XLogInsert can be confusing.  The WAL writes can make COPY
I/O bound, and so any savings on CPU may have been masked in the earlier
tests.

The patched profile shows we can still save a further 20% by writing data
block-at-a-time.  That's more complex because we'd need to buffer the index
inserts also, or it would optimise only for the no-index (initial load)
case.  So I think this is definitely enough for this release.

Using the buffer access strategy is going to be a big win for people
running large data loads in production, and it will also help people
running parallel load tasks (e.g. Dimitri's pg_loader).  That effect is
more subtle and harder to measure, but it's an important consideration.

Thanks very much for finishing the patch in time for commitfest.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] BufferAccessStrategy for bulk insert
> You should try profiling the patch.  You can count the invocations of
> the buffer access routines to check it's all working in the right
> ratios.

*goes and learns how to profile PostgreSQL*

OK, that was a good suggestion.  It looks like part of my problem here is
that I didn't put the CREATE TABLE and the COPY into the same transaction.
As a result, a lot of time was spent on XLogInsert.  Modified the test
case, new profiling results attached.

...Robert

Attachment: gmon.trunk.txt.gz (GNU Zip compressed data)
Attachment: gmon.patched.txt.gz (GNU Zip compressed data)
Re: [HACKERS] BufferAccessStrategy for bulk insert
> Whatever timings you have are worth publishing.

Here are the timings for copying the first ten million integers into a
one-column table created in the same transaction, with and without the
patch.  As you can see, now that I've corrected my previous error of not
putting CREATE TABLE and COPY in the same transaction, the savings are
quite substantial, about 15%.  Nice!

Trunk:
Time: 18931.516 ms
Time: 18251.732 ms
Time: 17284.274 ms
Time: 15900.131 ms
Time: 16439.617 ms

Patch:
Time: 14852.123 ms
Time: 15673.759 ms
Time: 15776.450 ms
Time: 14160.266 ms
Time: 13374.243 ms

...Robert
Re: [HACKERS] BufferAccessStrategy for bulk insert
On Tue, 2008-10-28 at 23:45 -0400, Robert Haas wrote:
> One concern that I have about this approach is that the situation in
> which people are probably most concerned about COPY performance is
> restoring a dump.  In that case, the COPY will be the only thing running,
> and using a BufferAccessStrategy is an anti-optimization.  I don't think
> it's a very big effect (any testing anyone can do on real hardware rather
> than what I have would be appreciated) but I'm sort of unsold on
> optimizing for what I believe to be the less-common use case.  If the
> consensus is to reverse course on this point I'm happy to rip those
> changes back out and resubmit; they are a relatively small proportion of
> the patch.

Having COPY use a BAS is mainly to ensure it doesn't swamp the cache,
which is a gain in itself.  If you say it's a loss you should publish
timings to support that.  Using a BAS for VACUUM was a performance gain,
not a loss.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] BufferAccessStrategy for bulk insert
> If you say it's a loss you should publish timings to support that.
> Using a BAS for VACUUM was a performance gain, not a loss.

Well, I can dig up and publish the timings from my laptop, but I'm not
sure where that will get us.  Trust me, the numbers were higher with BAS,
otherwise I wouldn't be worrying about this.  But I pretty much doubt
anyone cares how my laptop runs PostgreSQL anyway, which is why I think
someone should test this on good hardware and see what happens there.  The
only change I made to disable the BAS was a one-line change in
GetBulkInsertState to replace BAS_BULKWRITE with BAS_NORMAL, so it should
be easy for someone to try it both ways.

Not at any point in the development of this patch was I able to match the
15-17% copy speedup, 20% CTAS speedup that you cited with your original
email.  I did get speedups, but they were considerably smaller.  So either
my testing methodology is different, or my hardware is different, or there
is something wrong with my patch.  I don't think we're going to find out
which it is until someone other than me looks at this.

In any event, VACUUM is a read-write workload, and specifically, it tends
to write pages that have been written by other writers, and are therefore
potentially already in shared buffers.  COPY and CTAS are basically
write-only workloads, though with COPY on an existing table the FSM might
guide you to free space on a page already in shared buffers, or you might
find an index page you need there.  Still, if you are doing a large bulk
data load, those effects are probably pretty small.  So, the profile is
somewhat different.

I'm not really trying to argue that the BAS is a bad idea, but it is
certainly true that I do not have the data to prove that it is a good
idea.

...Robert
Re: [HACKERS] BufferAccessStrategy for bulk insert
On Wed, 2008-10-29 at 21:58 -0400, Robert Haas wrote:
>> If you say it's a loss you should publish timings to support that.
>> Using a BAS for VACUUM was a performance gain, not a loss.
> Well, I can dig up and publish the timings from my laptop, but I'm not
> sure where that will get us.  Trust me, the numbers were higher with BAS,
> otherwise I wouldn't be worrying about this.  But I pretty much doubt
> anyone cares how my laptop runs PostgreSQL anyway, which is why I think
> someone should test this on good hardware and see what happens there.
> The only change I made to disable the BAS was a one-line change in
> GetBulkInsertState to replace BAS_BULKWRITE with BAS_NORMAL, so it should
> be easy for someone to try it both ways.
> Not at any point in the development of this patch was I able to match the
> 15-17% copy speedup, 20% CTAS speedup that you cited with your original
> email.  I did get speedups, but they were considerably smaller.  So
> either my testing methodology is different, or my hardware is different,
> or there is something wrong with my patch.  I don't think we're going to
> find out which it is until someone other than me looks at this.
> In any event, VACUUM is a read-write workload, and specifically, it tends
> to write pages that have been written by other writers, and are therefore
> potentially already in shared buffers.  COPY and CTAS are basically
> write-only workloads, though with COPY on an existing table the FSM might
> guide you to free space on a page already in shared buffers, or you might
> find an index page you need there.  Still, if you are doing a large bulk
> data load, those effects are probably pretty small.  So, the profile is
> somewhat different.
> I'm not really trying to argue that the BAS is a bad idea, but it is
> certainly true that I do not have the data to prove that it is a good
> idea.

You should try profiling the patch.  You can count the invocations of the
buffer access routines to check it's all working in the right ratios.

Whatever timings you have are worth publishing.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [HACKERS] BufferAccessStrategy for bulk insert
And here's the patch, which based on comments thus far does the following:

- Replaces the use_wal, use_fsm arguments in various places with a single
  options argument.
- Creates a BAS_BULKWRITE buffer access strategy.
- Creates a BulkInsertState object so that COPY and CTAS can use
  BAS_BULKWRITE and also keep the most recent page pinned.

Note that the original purpose of this exercise was to implement the
optimization that COPY and CTAS would keep the most recent page pinned to
avoid repeated pin/unpin cycles.  This change shows a small but measurable
performance improvement on short rows.  The remaining items were added
based on reviewer comments.

One concern that I have about this approach is that the situation in which
people are probably most concerned about COPY performance is restoring a
dump.  In that case, the COPY will be the only thing running, and using a
BufferAccessStrategy is an anti-optimization.  I don't think it's a very
big effect (any testing anyone can do on real hardware rather than what I
have would be appreciated) but I'm sort of unsold on optimizing for what I
believe to be the less-common use case.  If the consensus is to reverse
course on this point I'm happy to rip those changes back out and resubmit;
they are a relatively small proportion of the patch.

...Robert

On Sun, Oct 26, 2008 at 8:37 PM, Robert Haas [EMAIL PROTECTED] wrote:
>> Seems sane to me.  I don't see the point of the HEAP_INSERT_BULK flag
>> bit --- providing or not providing bistate would cover that, and if you
>> have a bit as well then you have to define what the inconsistent
>> combinations mean.  I concur with making all-zeroes be the typical state
>> of the flag bits, too.
> Thanks for the design review.  I had thought to make the inconsistent
> combinations fail an assertion, but I'm just as happy to leave it out
> altogether.
>> FWIW, we generally declare bitmask flag variables as int, unless
>> there's some really good reason to do otherwise.
> OK, thanks for the tip.
...Robert

Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.266
diff -c -r1.266 heapam.c
*** src/backend/access/heap/heapam.c	27 Oct 2008 21:50:12 -0000	1.266
--- src/backend/access/heap/heapam.c	29 Oct 2008 03:25:41 -0000
***************
*** 1798,1803 ****
--- 1798,1827 ----
  	}
  }

+ /*
+  * GetBulkInsertState - set up for a bulk insert
+  */
+ BulkInsertState
+ GetBulkInsertState(void)
+ {
+ 	BulkInsertState bistate;
+
+ 	bistate = palloc(sizeof(struct BulkInsertStateData));
+ 	bistate->strategy = GetAccessStrategy(BAS_BULKWRITE);
+ 	bistate->last_pin = InvalidBuffer;
+ 	return bistate;
+ }
+
+ /*
+  * FreeBulkInsertState - clean up after finishing a bulk insert
+  */
+ void
+ FreeBulkInsertState(BulkInsertState bistate)
+ {
+ 	if (bistate->last_pin != InvalidBuffer)
+ 		ReleaseBuffer(bistate->last_pin);
+ 	FreeAccessStrategy(bistate->strategy);
+ }

 /*
  * heap_insert	- insert tuple into a heap
***************
*** 1805,1821 ****
   * The new tuple is stamped with current transaction ID and the specified
   * command ID.
   *
!  * If use_wal is false, the new tuple is not logged in WAL, even for a
!  * non-temp relation.  Safe usage of this behavior requires that we arrange
!  * that all new tuples go into new pages not containing any tuples from other
!  * transactions, and that the relation gets fsync'd before commit.
   * (See also heap_sync() comments)
   *
!  * use_fsm is passed directly to RelationGetBufferForTuple, which see for
!  * more info.
   *
!  * Note that use_wal and use_fsm will be applied when inserting into the
!  * heap's TOAST table, too, if the tuple requires any out-of-line data.
   *
   * The return value is the OID assigned to the tuple (either here or by the
   * caller), or InvalidOid if no OID.  The header fields of *tup are updated
--- 1829,1846 ----
   * The new tuple is stamped with current transaction ID and the specified
   * command ID.
   *
!  * If the HEAP_INSERT_SKIP_WAL option is supplied, the new tuple is not logged
!  * in WAL, even for a non-temp relation.  Safe usage of this behavior requires
!  * that we arrange that all new tuples go into new pages not containing any
!  * tuples from other transactions, and that the relation gets fsync'd before
!  * commit.
   * (See also heap_sync() comments)
   *
!  * The HEAP_INSERT_SKIP_FSM option is passed directly to
!  * RelationGetBufferForTuple, which see for more info.
   *
!  * Note that options will be applied when inserting into the heap's TOAST
!  * table, too, if the tuple requires any out-of-line data.
   *
   * The return value is the OID assigned to the tuple (either here or by the
   * caller), or InvalidOid if no OID.  The header fields of *tup are updated
***************
*** 1825,1831 ****
   */
  Oid
  heap_insert(Relation relation, HeapTuple tup, CommandId cid,
! 			bool use_wal, bool
Re: [HACKERS] BufferAccessStrategy for bulk insert
Robert Haas [EMAIL PROTECTED] writes:
> I am kind of inclined to define flags like this:
> #define HEAP_INSERT_SKIP_WAL 0x0001
> #define HEAP_INSERT_SKIP_FSM 0x0002
> #define HEAP_INSERT_BULK 0x0004 /* do we even need this one? */
> And then:
> Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid,
>                 unsigned options, BulkInsertState *bistate);
> BulkInsertState *GetBulkInsertState(void);
> void FreeBulkInsertState(BulkInsertState *);

Seems sane to me.  I don't see the point of the HEAP_INSERT_BULK flag bit
--- providing or not providing bistate would cover that, and if you have a
bit as well then you have to define what the inconsistent combinations
mean.  I concur with making all-zeroes be the typical state of the flag
bits, too.

FWIW, we generally declare bitmask flag variables as int, unless there's
some really good reason to do otherwise.

			regards, tom lane
Re: [HACKERS] BufferAccessStrategy for bulk insert
> Seems sane to me.  I don't see the point of the HEAP_INSERT_BULK flag bit
> --- providing or not providing bistate would cover that, and if you have
> a bit as well then you have to define what the inconsistent combinations
> mean.  I concur with making all-zeroes be the typical state of the flag
> bits, too.

Thanks for the design review.  I had thought to make the inconsistent
combinations fail an assertion, but I'm just as happy to leave it out
altogether.

> FWIW, we generally declare bitmask flag variables as int, unless there's
> some really good reason to do otherwise.

OK, thanks for the tip.

...Robert