Re: [PATCHES] A little COPY speedup
On Fri, 2007-03-02 at 10:09 +, Heikki Linnakangas wrote:
> Well, there's one big change: your patch to suppress WAL logging on
> tables created in the same transaction.

OK, just checking that's what you meant.

> All the page locking related functions account for ~10% in total,
> including the LWLockAcquire/Release, Pin/UnpinBuffer, hash_any and so
> on. And then there's all the memcpying...

I think it's a great spot that PageAddItem() was so bad. I realise I
didn't actually look at what it was doing, just looked at ways to avoid
doing it on each individual call to the block for each row.

-- 
 Simon Riggs
 EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] A little COPY speedup
Simon Riggs <[EMAIL PROTECTED]> writes:
> I'm slightly worried though since that seems to have changed from 8.2,
> which I oprofiled over Christmas.

If you were testing a case with wider rows than Heikki tested, you'd see
less impact --- the cost of the old way was O(N^2) in the number of
tuples that fit on a page, so the behavior gets rapidly worse as you get
down to smaller tuple sizes. (Come to think of it, the cmin/cmax
collapse would be a factor here too.)

			regards, tom lane
Re: [PATCHES] A little COPY speedup
Tom Lane <[EMAIL PROTECTED]> writes:
> If you were testing a case with wider rows than Heikki tested, you'd
> see less impact --- the cost of the old way was O(N^2) in the number of
> tuples that fit on a page, so the behavior gets rapidly worse as you
> get down to smaller tuple sizes. (Come to think of it, the cmin/cmax
> collapse would be a factor here too.)

Or larger block sizes of course. A 32kb block would be 16x as bad, which
starts to be pretty serious.

-- 
  Gregory Stark
  EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] A little COPY speedup
On Fri, 2007-03-02 at 16:25 +, Gregory Stark wrote:
> Or larger block sizes of course. A 32kb block would be 16x as bad which
> starts to be pretty serious.

Well, I was only using 8kb blocks. But I think the message is clear: we
need to profile lots of different combinations. I was using a 2-column
table with integer, char(100).

IIRC there are issues with delimiter handling when we have lots of
columns in the input on COPY FROM, and with the number of columns on
COPY TO. I've not looked at those recently though.

-- 
 Simon Riggs
 EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] A little COPY speedup
Simon Riggs wrote:
> IIRC there are issues with delimiter handling when we have lots of
> columns in the input on COPY FROM, and num of cols on COPY TO. I've
> not looked at those recently though.

What sort of issues? Anything that breaks on this has catastrophic
consequences.

cheers

andrew
Re: [PATCHES] A little COPY speedup
On Fri, 2007-03-02 at 11:58 -0500, Andrew Dunstan wrote:
> What sort of issues? Anything that breaks on this has catastrophic
> consequences.

We were talking about performance, not data integrity.

-- 
 Simon Riggs
 EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] A little COPY speedup
Simon Riggs wrote:
> We were talking about performance, not data integrity

OK. I'm still curious to know what the issues are with delimiter
handling.

cheers

andrew
Re: [PATCHES] A little COPY speedup
On Fri, 2007-03-02 at 12:09 -0500, Andrew Dunstan wrote:
> OK. I'm still curious to know what the issues are with delimiter
> handling.

Rumours only. Feedback from someone else looking into the problem last
year. IIRC there was a feeling that if we didn't have to search for
delimiters, the COPY FROM input parsing could be easier. Vague
recollection that COPY TO uses the slower API for getting heap
attributes, but that didn't seem to show up when I profiled a few months
back.

I'm not personally proposing to take those thoughts any further.

-- 
 Simon Riggs
 EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] A little COPY speedup
Simon Riggs <[EMAIL PROTECTED]> writes:
> Feedback from someone else looking to the problem last year. IIRC there
> was a feeling that if we didn't have to search for delimiters the COPY
> FROM input parsing could be easier.

Of what use is the above comment? You have to parse the input into
fields somehow. It can be arbitrarily fast ... if it doesn't have to get
the right answer.

			regards, tom lane
Re: [PATCHES] A little COPY speedup
On Fri, 2007-03-02 at 13:24 -0500, Tom Lane wrote:
> Of what use is the above comment?

To me, none. Andrew asked me why I was vague - I explained. If I'd
written it on IRC it would be wisdom, on email not at all.

-- 
 Simon Riggs
 EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] A little COPY speedup
Tom Lane <[EMAIL PROTECTED]> writes:
> Of what use is the above comment? You have to parse the input into
> fields somehow.

Well, those two requirements aren't inconsistent if you're using
fixed-width input text files. Not that I'm enamoured of such file
formats, just saying. The only file format I ever wanted when I was
doing this kind of stuff is tab separated; all the others (comma
separated and (egads) *pipe* separated?!) are just disasters.

-- 
  Gregory Stark
  EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] A little COPY speedup
Gregory Stark <[EMAIL PROTECTED]> writes:
> Well those two requirements aren't inconsistent if you're using
> fixed-width input text files. Not that I'm enamoured of such file
> formats, just saying.

[ shrug... ] And of course, shoving a large quantity of padding spaces
through the software has zero cost.

			regards, tom lane
Re: [PATCHES] A little COPY speedup
Gregory Stark wrote:
> The only file formats I ever wanted when I was doing this kind of stuff
> is tab separated, all the others (comma separated and (egads) *pipe*
> separated?!) are just disasters.

Others of us have to operate in a world where we don't get to choose the
format of data we receive. Having Postgres work with common data
interchange formats is a Good Thing (tm). (I also have my own opinion of
the disutility of tab as a separator.)

cheers

andrew
[PATCHES] A little COPY speedup
One complaint we've heard from clients trying out EDB or PostgreSQL is
that loading data is slower than on other DBMSs. I ran oprofile on a
COPY FROM to get an overview of where the CPU time is spent. To my
amazement, the function at the top of the list was PageAddItem, with 16%
of samples. On every row, PageAddItem will scan all the line pointers on
the target page, just to see that they're all in use, before creating a
new line pointer. That adds up, especially with narrow tuples like the
ones I used in the test.

Attached is a fix for that. It adds a flag to each heap page that
indicates that there aren't any free line pointers on the page, so don't
bother looking for one. Heap pages haven't had any heap-specific
per-page data before, so this patch adds a HeapPageOpaqueData struct
that's stored in the special space.

My simple test case of a COPY FROM of 1000 tuples took 19.6 s without
the patch, and 17.7 s with the patch applied. Your mileage may vary.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.228
diff -c -r1.228 heapam.c
*** src/backend/access/heap/heapam.c	9 Feb 2007 03:35:33 -	1.228
--- src/backend/access/heap/heapam.c	1 Mar 2007 16:27:38 -
***************
*** 3291,3296 ****
--- 3291,3297 ----
  	Relation	reln;
  	Buffer		buffer;
  	Page		page;
+ 	HeapPageOpaque opq;
  
  	if (record->xl_info & XLR_BKP_BLOCK_1)
  		return;
***************
*** 3300,3305 ****
--- 3301,3307 ----
  	if (!BufferIsValid(buffer))
  		return;
  	page = (Page) BufferGetPage(buffer);
+ 	opq = (HeapPageOpaque) PageGetSpecialPointer(page);
  
  	if (XLByteLE(lsn, PageGetLSN(page)))
  	{
***************
*** 3327,3332 ****
--- 3329,3337 ----
  
  	PageRepairFragmentation(page, NULL);
  
+ 	/* clear the hint flag since we just freed some line pointers */
+ 	opq->hpo_flags &= ~HP_NOFREELINEPOINTERS;
+ 
  	PageSetLSN(page, lsn);
  	PageSetTLI(page, ThisTimeLineID);
  	MarkBufferDirty(buffer);
Index: src/backend/access/heap/hio.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/hio.c,v
retrieving revision 1.65
diff -c -r1.65 hio.c
*** src/backend/access/heap/hio.c	5 Feb 2007 04:22:18 -	1.65
--- src/backend/access/heap/hio.c	1 Mar 2007 16:44:47 -
***************
*** 33,51 ****
  			   HeapTuple tuple)
  {
  	Page		pageHeader;
! 	OffsetNumber offnum;
  	ItemId		itemId;
  	Item		item;
  
- 	/* Add the tuple to the page */
  	pageHeader = BufferGetPage(buffer);
  
  	offnum = PageAddItem(pageHeader, (Item) tuple->t_data,
! 						 tuple->t_len, InvalidOffsetNumber, LP_USED);
  
  	if (offnum == InvalidOffsetNumber)
  		elog(PANIC, "failed to add tuple to page");
  
  	/* Update tuple->t_self to the actual position where it was stored */
  	ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);
--- 33,70 ----
  			   HeapTuple tuple)
  {
  	Page		pageHeader;
! 	OffsetNumber offnum, maxoff;
  	ItemId		itemId;
  	Item		item;
+ 	HeapPageOpaque opq;
  
  	pageHeader = BufferGetPage(buffer);
+ 	opq = (HeapPageOpaque) PageGetSpecialPointer(pageHeader);
+ 	maxoff = PageGetMaxOffsetNumber(pageHeader);
+ 
+ 	/* If we know there's no free line pointers, don't waste cycles
+ 	 * searching for one. The flag is set when there definitely isn't
+ 	 * any free line pointers on the page, but the absence of the flag
+ 	 * doesn't mean anything. There might still not be any free line
+ 	 * pointers left. We'll set the flag to save work for future inserts
+ 	 * when that happens.
+ 	 */
+ 	if (opq->hpo_flags & HP_NOFREELINEPOINTERS)
+ 		offnum = OffsetNumberNext(maxoff);
+ 	else
+ 		offnum = InvalidOffsetNumber;
+ 
+ 	/* Add the tuple to the page */
  	offnum = PageAddItem(pageHeader, (Item) tuple->t_data,
! 						 tuple->t_len, offnum, LP_USED);
  
  	if (offnum == InvalidOffsetNumber)
  		elog(PANIC, "failed to add tuple to page");
  
+ 	if (offnum > maxoff)
+ 		opq->hpo_flags |= HP_NOFREELINEPOINTERS;
+ 
  	/* Update tuple->t_self to the actual position where it was stored */
  	ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);
***************
*** 309,315 ****
  							 BufferGetBlockNumber(buffer),
  							 RelationGetRelationName(relation));
  
! 		PageInit(pageHeader, BufferGetPageSize(buffer), 0);
  
  		if (len > PageGetFreeSpace(pageHeader))
  		{
--- 328,335 ----
  							 BufferGetBlockNumber(buffer),
  							 RelationGetRelationName(relation));
  
! 		PageInit(pageHeader, BufferGetPageSize(buffer),
! 				 sizeof(HeapPageOpaqueData));
  
  		if (len > PageGetFreeSpace(pageHeader))
  		{
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.346
diff -c
Re: [PATCHES] A little COPY speedup
Heikki Linnakangas wrote:
> Attached is a fix for that. It adds a flag to each heap page that
> indicates that there isn't any free line pointers on this page, so
> don't bother trying. Heap pages haven't had any heap-specific per-page
> data before, so this patch adds a HeapPageOpaqueData-struct that's
> stored in the special space.

I would really like this change. I was thinking on similar lines to
optimize some of the HOT code paths.

Thanks,
Pavan
Re: [PATCHES] A little COPY speedup
Heikki Linnakangas wrote:
> My simple test case of a COPY FROM of 1000 tuples took 19.6 s without
> the patch, and 17.7 s with the patch applied. Your mileage may vary.

What is the speedup with less narrow tuples? 10% improvement is good but
not stellar.

cheers

andrew
Re: [PATCHES] A little COPY speedup
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> On every row, PageAddItem will scan all the line pointers on the target
> page, just to see that they're all in use, and create a new line
> pointer. That adds up, especially with narrow tuples like what I used
> in the test. Attached is a fix for that.

This has been proposed before, and rejected before. IIRC the previous
patch was quite a lot less invasive than this one (it didn't require
making special space on heap pages). I don't recall why it wasn't
accepted.

			regards, tom lane
Re: [PATCHES] A little COPY speedup
Tom Lane wrote:
> This has been proposed before, and rejected before. IIRC the previous
> patch was quite a lot less invasive than this one (it didn't require
> making special space on heap pages). I don't recall why it wasn't
> accepted.

Ahh, found that thread:
http://archives.postgresql.org/pgsql-hackers/2005-07/msg00609.php

The main differences between that patch and mine are that:
- the previous patch used an offset to the first free line pointer, and
  I used just a flag.
- the previous patch stored the offset in the page header, and I used
  the special space.

I think using the special space is a cleaner approach; the field is only
meaningful in heap pages. However, now that I think of it, if we could
squeeze the flag into one of the existing fields in the page header, we
could put it there without decreasing the amount of space available for
tuples. We could use the unused pd_tli field, as you suggested later in
that thread.

At the end of the thread, Bruce added the patch to his hold-queue, but I
couldn't find a trace of it after that, so I'm not clear why it was
rejected in the end. This comment (by you) seems most relevant:

> I tried making a million-row table with just two int4 columns and then
> duplicating it with CREATE TABLE AS SELECT. In this context gprof shows
> PageAddItem as taking 7% of the runtime, which your patch knocks down
> to 1.5%. This seems to be about the best possible real-world case,
> though (the wider the rows, the fewer times PageAddItem can loop), and
> so I'm still unconvinced that there's a generic gain here. Adding an
> additional word to page headers has a very definite cost --- we can
> assume about a .05% increase in net I/O demands across *every*
> application, whether they do a lot of inserts or not --- and so a patch
> that provides a noticeable improvement in only a very small set of
> circumstances is going to have to be rejected.

I believe the PageAddItem overhead has become more noticeable since then
because of other improvements to COPY. In 8.3, we're also going to
reduce the tuple length (combocids and the varvarlen thing), so we can
fit more tuples per page, again making it slightly more significant.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] A little COPY speedup
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> At the end of the thread, Bruce added the patch to his hold-queue, but
> I couldn't find a trace of it after that so I'm not clear why it was
> rejected in the end. This comment (by you) seems most relevant:

I believe we concluded that the distributed cost of enlarging the page
headers would probably outweigh a gain that seemed to have fairly narrow
application. I wouldn't be surprised if the tradeoffs have changed since
then, but I'm still loath to increase the overhead space. If we can do
it without that, the argument to reject gets much weaker.

As you say, pd_tli is not really pulling its weight, but I'm also loath
to remove it, as in a multi-timeline situation the page LSN is really
not well defined if you don't know which timeline it refers to. Now we'd
only need 16 bits to store the last-used offset, or a flags field if
you'd prefer that, so one possible compromise is to store only the 16
least significant bits of TLI (which ought to be enough to disambiguate
in any real-world situation), and insert the new field where the MSBs
had been.

I'm not sure whether I like your flag approach better than the
last-used-offset one. The previous patch probably buys some teeny amount
more performance, but the flag seems more robust (noting in passing that
neither patch attempts to WAL-log its changes, so we really need to
treat the values as hints). And a change as sketched here would leave us
15 free bits for future expansion, which might be nice.

			regards, tom lane
Re: [PATCHES] A little COPY speedup
Tom Lane wrote:
> As you say, pd_tli is not really pulling its weight, but I'm also loath
> to remove it, as in a multi-timeline situation the page LSN is really
> not well defined if you don't know which timeline it refers to. Now
> we'd only need 16 bits to store the last-used offset, or a flags field
> if you'd prefer that, so one possible compromise is to store only the
> 16 least significant bits of TLI (which ought to be enough to
> disambiguate in any real-world situation), and insert the new field
> where the MSBs had been.

Sounds good to me. It's nice to keep the TLI for debugging/forensics
purposes even if it's not used at the moment, but 16 bits is enough for
that. Another possibility would be to use the unused bits in
pd_upper/lower/special, but that requires more bit-trickery.

> I'm not sure whether I like your flag approach better than the
> last-used-offset one. The previous patch probably buys some teeny
> amount more performance, but the flag seems more robust (noting in
> passing that neither patch attempts to WAL-log its changes, so we
> really need to treat the values as hints). And a change as sketched
> here would leave us 15 free bits for future expansion, which might be
> nice.

I'll post a patch along those lines.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [PATCHES] A little COPY speedup
Heikki Linnakangas wrote:
> I'll post a patch along those lines.

Here it is. I'm not fond of the macro names for the flag, but couldn't
think of anything shorter yet descriptive.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Index: src/backend/access/heap/heapam.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v
retrieving revision 1.228
diff -c -r1.228 heapam.c
*** src/backend/access/heap/heapam.c	9 Feb 2007 03:35:33 -	1.228
--- src/backend/access/heap/heapam.c	1 Mar 2007 21:29:14 -
***************
*** 3327,3332 ****
--- 3327,3335 ----
  
  	PageRepairFragmentation(page, NULL);
  
+ 	/* clear the hint flag since we just freed some line pointers */
+ 	HeapPageClearNoFreeLinePointers(page);
+ 
  	PageSetLSN(page, lsn);
  	PageSetTLI(page, ThisTimeLineID);
  	MarkBufferDirty(buffer);
Index: src/backend/access/heap/hio.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/hio.c,v
retrieving revision 1.65
diff -c -r1.65 hio.c
*** src/backend/access/heap/hio.c	5 Feb 2007 04:22:18 -	1.65
--- src/backend/access/heap/hio.c	1 Mar 2007 21:33:14 -
***************
*** 33,51 ****
  			   HeapTuple tuple)
  {
  	Page		pageHeader;
! 	OffsetNumber offnum;
  	ItemId		itemId;
  	Item		item;
  
- 	/* Add the tuple to the page */
  	pageHeader = BufferGetPage(buffer);
  
  	offnum = PageAddItem(pageHeader, (Item) tuple->t_data,
! 						 tuple->t_len, InvalidOffsetNumber, LP_USED);
  
  	if (offnum == InvalidOffsetNumber)
  		elog(PANIC, "failed to add tuple to page");
  
  	/* Update tuple->t_self to the actual position where it was stored */
  	ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);
--- 33,65 ----
  			   HeapTuple tuple)
  {
  	Page		pageHeader;
! 	OffsetNumber offnum, maxoff;
  	ItemId		itemId;
  	Item		item;
  
  	pageHeader = BufferGetPage(buffer);
+ 	maxoff = PageGetMaxOffsetNumber(pageHeader);
+ 
+ 	/* If we know there's no free line pointers, don't waste cycles
+ 	 * searching for one. We'll set the flag to save work for future
+ 	 * inserts when that happens.
+ 	 */
+ 	if (HeapPageNoFreeLinePointers(pageHeader))
+ 		offnum = OffsetNumberNext(maxoff);
+ 	else
+ 		offnum = InvalidOffsetNumber;
+ 
+ 	/* Add the tuple to the page */
  	offnum = PageAddItem(pageHeader, (Item) tuple->t_data,
! 						 tuple->t_len, offnum, LP_USED);
  
  	if (offnum == InvalidOffsetNumber)
  		elog(PANIC, "failed to add tuple to page");
  
+ 	if (offnum > maxoff)
+ 		HeapPageSetNoFreeLinePointers(pageHeader);
+ 
  	/* Update tuple->t_self to the actual position where it was stored */
  	ItemPointerSet(&(tuple->t_self), BufferGetBlockNumber(buffer), offnum);
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.346
diff -c -r1.346 vacuum.c
*** src/backend/commands/vacuum.c	15 Feb 2007 23:23:22 -	1.346
--- src/backend/commands/vacuum.c	1 Mar 2007 21:41:20 -
***************
*** 2453,2458 ****
--- 2453,2460 ----
  			START_CRIT_SECTION();
  
  			uncnt = PageRepairFragmentation(page, unused);
+ 			if (uncnt > 0)
+ 				HeapPageClearNoFreeLinePointers(page);
  
  			MarkBufferDirty(buf);
***************
*** 2920,2925 ****
--- 2922,2929 ----
  	}
  
  	uncnt = PageRepairFragmentation(page, unused);
+ 	if (uncnt > 0)
+ 		HeapPageClearNoFreeLinePointers(page);
  
  	MarkBufferDirty(buffer);
Index: src/backend/commands/vacuumlazy.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/vacuumlazy.c,v
retrieving revision 1.85
diff -c -r1.85 vacuumlazy.c
*** src/backend/commands/vacuumlazy.c	21 Feb 2007 22:47:45 -	1.85
--- src/backend/commands/vacuumlazy.c	1 Mar 2007 21:27:15 -
***************
*** 605,610 ****
--- 605,612 ----
  	}
  
  	uncnt = PageRepairFragmentation(page, unused);
+ 	if (uncnt > 0)
+ 		HeapPageClearNoFreeLinePointers(page);
  
  	MarkBufferDirty(buffer);
Index: src/include/access/htup.h
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/htup.h,v
retrieving revision 1.92
diff -c -r1.92 htup.h
*** src/include/access/htup.h	27 Feb 2007 23:48:09 -	1.92
--- src/include/access/htup.h	1 Mar 2007 22:13:25 -
***************
Re: [PATCHES] A little COPY speedup
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> I'll post a patch along those lines.
>
> Here it is. I'm not fond of the macro names for the flag, but couldn't
> think of anything shorter yet descriptive.

Let's reverse the sense of the flag bit; this seems a good idea since
the initial state should be 0 = there are no free pointers. Also I'd go
with PD_ as the prefix, for consistency with the field names. This
brings us to PD_HAS_FREE_LINE_POINTERS or some abbreviation thereof
(maybe PD_HAS_FREE_LINES is sufficient). I'd also go with pd_flags as
the field name; amprivate seems to imply way too much about what we
might later use the flags for.

Barring objections, I'll tweak this as above and apply.

			regards, tom lane
Re: [PATCHES] A little COPY speedup
I wrote:
> Barring objections, I'll tweak this as above and apply.

I've applied the attached modified version of this patch. It seemed
better to me to centralize the handling of this flag bit in PageAddItem
and PageRepairFragmentation, instead of having it in the callers as you
did. This means that the bit applies to all pages, not only heap pages,
but at least for the moment that has no downside. I note that GIN
indexes do use PageAddItem with offsetNumber = InvalidOffsetNumber, so
they will get some performance benefit too.

			regards, tom lane

Index: doc/src/sgml/storage.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/storage.sgml,v
retrieving revision 1.14
diff -c -r1.14 storage.sgml
*** doc/src/sgml/storage.sgml	31 Jan 2007 20:56:19 -	1.14
--- doc/src/sgml/storage.sgml	2 Mar 2007 00:48:25 -
***************
*** 427,434 ****
  The first 20 bytes of each page consists of a page header
  (PageHeaderData). Its format is detailed in <xref
  linkend="pageheaderdata-table">. The first two fields track the most
! recent WAL entry related to this page. They are followed by three 2-byte
! integer fields
  (<structfield>pd_lower</structfield>, <structfield>pd_upper</structfield>,
  and <structfield>pd_special</structfield>). These contain byte offsets
  from the page start to the start
--- 427,434 ----
  The first 20 bytes of each page consists of a page header
  (PageHeaderData). Its format is detailed in <xref
  linkend="pageheaderdata-table">. The first two fields track the most
! recent WAL entry related to this page. Next is a 2-byte field
! containing flag bits. This is followed by three 2-byte integer fields
  (<structfield>pd_lower</structfield>, <structfield>pd_upper</structfield>,
  and <structfield>pd_special</structfield>). These contain byte offsets
  from the page start to the start
***************
*** 437,448 ****
  The last 2 bytes of the page header,
  <structfield>pd_pagesize_version</structfield>, store both the page size
  and a version indicator.  Beginning with
! <productname>PostgreSQL</productname> 8.1 the version number is 3;
  <productname>PostgreSQL</productname> 8.0 used version number 2;
  <productname>PostgreSQL</productname> 7.3 and 7.4 used version number 1;
  prior releases used version number 0.
! (The basic page layout and header format has not changed in these versions,
! but the layout of heap row headers has.)
  The page size is basically only present as a cross-check; there is no
  support for having more than one page size in an installation.
--- 437,449 ----
  The last 2 bytes of the page header,
  <structfield>pd_pagesize_version</structfield>, store both the page size
  and a version indicator.  Beginning with
! <productname>PostgreSQL</productname> 8.3 the version number is 4;
! <productname>PostgreSQL</productname> 8.1 and 8.2 used version number 3;
  <productname>PostgreSQL</productname> 8.0 used version number 2;
  <productname>PostgreSQL</productname> 7.3 and 7.4 used version number 1;
  prior releases used version number 0.
! (The basic page layout and header format has not changed in most of these
! versions, but the layout of heap row headers has.)
  The page size is basically only present as a cross-check; there is no
  support for having more than one page size in an installation.
***************
*** 470,478 ****
      </row>
      <row>
       <entry>pd_tli</entry>
!      <entry>TimeLineID</entry>
!      <entry>4 bytes</entry>
!      <entry>TLI of last change</entry>
      </row>
      <row>
       <entry>pd_lower</entry>
--- 471,485 ----
      </row>
      <row>
       <entry>pd_tli</entry>
!      <entry>uint16</entry>
!      <entry>2 bytes</entry>
!      <entry>TimeLineID of last change (only its lowest 16 bits)</entry>
!     </row>
!     <row>
!      <entry>pd_flags</entry>
!      <entry>uint16</entry>
!      <entry>2 bytes</entry>
!      <entry>Flag bits</entry>
      </row>
      <row>
       <entry>pd_lower</entry>
Index: src/backend/storage/page/bufpage.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/storage/page/bufpage.c,v
retrieving revision 1.71
diff -c -r1.71 bufpage.c
*** src/backend/storage/page/bufpage.c	21 Feb 2007 20:02:17 -	1.71
--- src/backend/storage/page/bufpage.c	2 Mar 2007 00:48:26 -
***************
*** 39,44 ****
--- 39,45 ----
  	/* Make sure all fields of page are zero, as well as unused space */
  	MemSet(p, 0, pageSize);
  
+ 	/* p->pd_flags = 0;				done by above MemSet */
  	p->pd_lower = SizeOfPageHeaderData;
  	p->pd_upper = pageSize - specialSize;
  	p->pd_special = pageSize - specialSize;
***************
*** 73,78 ****
--- 74,80 ----
  	/* Check normal case */
  	if (PageGetPageSize(page) == BLCKSZ &&
  		PageGetPageLayoutVersion(page) == PG_PAGE_LAYOUT_VERSION &&
+ 		(page->pd_flags & ~PD_VALID_FLAG_BITS) == 0 &&
  		page->pd_lower >= SizeOfPageHeaderData