Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
I wrote: > ... BTW, a quick grep for other calls of PageIndexTupleDelete suggests > that SPGiST could make very good use of PageIndexTupleOverwrite, and > there may be use for it in GIN too. I've pushed this patch with some editorializing and some followup work. The above remains as possible material for a future patch, but I'm going to mark this commitfest entry as closed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
... BTW, a quick grep for other calls of PageIndexTupleDelete suggests that SPGiST could make very good use of PageIndexTupleOverwrite, and there may be use for it in GIN too. I see one place in btree where there's a PageIndexTupleDelete and then an insert, but it's unlikely to be worth touching because the page is expected to be empty after the PageIndexTupleDelete; so there's no data movement we could avoid in that case. I don't plan to investigate these cases myself, but somebody should. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Here's a draft patch that replaces BRIN's use of PageIndexDeleteNoCompact and an immediately following PageAddItemExtended with PageIndexTupleOverwrite. It turns out that there are two use-cases in BRIN for PageIndexDeleteNoCompact, and the other one can't be replaced because while there is a PageAddItem call beside it, that one is for a different page. But this is still enough to let us get rid of PAI_ALLOW_FAR_OFFSET, which seems like a good thing. Actually, with this patch, PageAddItemExtended isn't directly referenced anywhere, so we could revert that API and fold it back into PageAddItem, saving a nanosecond or two per call. I'm inclined to leave that alone, though, since I agree with the underlying thought that any future extensions to PageAddItem's functionality would be better handled by more bits in a flags bitmask. Maybe a better idea is to get rid of the call overhead by turning PageAddItem into a macro. In any case, doing something about that could be a separate patch. It's also kind of tempting to rewrite PageIndexDeleteNoCompact into a function that only deletes one tuple, and presumably is simpler and faster than what's there. But maybe there's something coming down the pike that needs the extra generality? That should also be a separate patch, anyway. This passes make check-world but I'm not sure how heavily that stresses BRIN ... are there any other tests you'd want to see run before committing? regards, tom lane diff --git a/src/backend/access/brin/brin_pageops.c b/src/backend/access/brin/brin_pageops.c index 6ebfedd..24ae38b 100644 *** a/src/backend/access/brin/brin_pageops.c --- b/src/backend/access/brin/brin_pageops.c *** brin_doupdate(Relation idxrel, BlockNumb *** 178,187 } START_CRIT_SECTION(); ! PageIndexDeleteNoCompact(oldpage, , 1); ! if (PageAddItemExtended(oldpage, (Item) newtup, newsz, oldoff, ! PAI_OVERWRITE | PAI_ALLOW_FAR_OFFSET) == InvalidOffsetNumber) ! elog(ERROR, "failed to add BRIN tuple"); MarkBufferDirty(oldbuf); /* XLOG stuff */ --- 178,185 } START_CRIT_SECTION(); ! if (!PageIndexTupleOverwrite(oldpage, oldoff, (Item) newtup, newsz)) ! elog(ERROR, "failed to replace BRIN tuple"); MarkBufferDirty(oldbuf); /* XLOG stuff */ diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c index 27ba0a9..ed14729 100644 *** a/src/backend/access/brin/brin_xlog.c --- b/src/backend/access/brin/brin_xlog.c *** brin_xlog_samepage_update(XLogReaderStat *** 189,202 page = (Page) BufferGetPage(buffer); offnum = xlrec->offnum; - if (PageGetMaxOffsetNumber(page) + 1 < offnum) - elog(PANIC, "brin_xlog_samepage_update: invalid max offset number"); ! PageIndexDeleteNoCompact(page, , 1); ! offnum = PageAddItemExtended(page, (Item) brintuple, tuplen, offnum, ! PAI_OVERWRITE | PAI_ALLOW_FAR_OFFSET); ! if (offnum == InvalidOffsetNumber) ! elog(PANIC, "brin_xlog_samepage_update: failed to add tuple"); PageSetLSN(page, lsn); MarkBufferDirty(buffer); --- 189,197 page = (Page) BufferGetPage(buffer); offnum = xlrec->offnum; ! if (!PageIndexTupleOverwrite(page, offnum, (Item) brintuple, tuplen)) ! elog(PANIC, "brin_xlog_samepage_update: failed to replace tuple"); PageSetLSN(page, lsn); MarkBufferDirty(buffer); diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c index bce0d53..08e222e 100644 *** a/src/backend/storage/page/bufpage.c --- b/src/backend/storage/page/bufpage.c *** PageIsVerified(Page page, BlockNumber bl *** 166,186 * inserted, or InvalidOffsetNumber if the item is not inserted for any * reason. A WARNING is issued indicating the reason for the refusal. * ! * If flag PAI_OVERWRITE is set, we just store the item at the specified ! * offsetNumber (which must be either a currently-unused item pointer, ! * or one past the last existing item). Otherwise, ! * if offsetNumber is valid and <= current max offset in the page, ! * insert item into the array at that position by shuffling ItemId's ! * down to make room. ! * If offsetNumber is not valid, then assign one by finding the first * one that is both unused and deallocated. * * If flag PAI_IS_HEAP is set, we enforce that there can't be more than * MaxHeapTuplesPerPage line pointers on the page. * - * If flag PAI_ALLOW_FAR_OFFSET is not set, we disallow placing items - * beyond one past the last existing item. - * * !!! EREPORT(ERROR) IS DISALLOWED HERE !!! */ OffsetNumber --- 166,189 * inserted, or InvalidOffsetNumber if the item is not inserted for any * reason. A WARNING is issued indicating the reason for the refusal. * ! * offsetNumber must be either InvalidOffsetNumber to specify finding a ! * free item pointer, or a value between FirstOffsetNumber and one past ! * the last existing
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Andrey Borodinwrites: > That storage assertion fired during usual update table set x=random() without > conditions. Also Make check fails without it (for brin usage, gist is ok with > it). I'm confused by that assertion, because the patch-as-submitted didn't touch BRIN. Based on Alvaro's comments, yes it would fail if we'd modified BRIN to use this function ... but we hadn't yet. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
That storage assertion fired during usual update table set x=random() without conditions. Also Make check fails without it (for brin usage, gist is ok with it). Cannot type quotation properly, I'm on a phone for a long time. Regards, Andrey Borodin. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Tom Lane wrote: > Alvaro Herrerawrites: > > Tom Lane wrote: > >> That comment seems utterly wrong to me, because both PageIndexTupleDelete > >> and PageIndexMultiDelete certainly contain assertions that every item on > >> the page has storage. Are you expecting that any BRIN index items > >> wouldn't? If they wouldn't, is adjusting their lp_off as if they did > >> have storage sensible? > > > It is possible in BRIN to have empty intermediate tuples; for example it > > is possible for lp 1 and 3 to contain index tuples, while lp 2 does not. > > Hm. So apparently, the only reason this stuff works at all is that > BRIN isn't using either PageIndexTupleDelete or PageIndexMultiDelete. Yes, this is why the NoCompact variant was introduced in the first place. > > Now if this loop is concerned only with live lps and does not move lps, > > then it should be fine to add the assertion. > > No, it iterates over all lps on the page. I'm inclined to think it > should be written like > > if (ItemIdHasStorage(ii) && ItemIdGetOffset(ii) <= offset) > ii->lp_off += size_diff; > > because futzing with the lp_off field in an item that isn't really > pointing at storage feels wrong. We might need to do that to > PageIndexTupleDelete and/or PageIndexMultiDelete someday, too. I suppose it is conceivable that we start using lp_off for other purposes in the future, so I don't disagree. I don't think index pages currently do any funny business with it. > I notice that PageIndexDeleteNoCompact, which seems to express what > BRIN is expecting in a rather underdocumented way, forces any > items without storage into "unused" state. I don't really think > it's bufpage.c's place to do that, though. Do you think that code > is actually supposed to fire, or is it just there for lack of a > better idea? I just put it there only because I didn't see any reason not to, really. I don't think BRIN relies on it. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Alvaro Herrerawrites: > Tom Lane wrote: >> That comment seems utterly wrong to me, because both PageIndexTupleDelete >> and PageIndexMultiDelete certainly contain assertions that every item on >> the page has storage. Are you expecting that any BRIN index items >> wouldn't? If they wouldn't, is adjusting their lp_off as if they did >> have storage sensible? > It is possible in BRIN to have empty intermediate tuples; for example it > is possible for lp 1 and 3 to contain index tuples, while lp 2 does not. Hm. So apparently, the only reason this stuff works at all is that BRIN isn't using either PageIndexTupleDelete or PageIndexMultiDelete. > Now if this loop is concerned only with live lps and does not move lps, > then it should be fine to add the assertion. No, it iterates over all lps on the page. I'm inclined to think it should be written like if (ItemIdHasStorage(ii) && ItemIdGetOffset(ii) <= offset) ii->lp_off += size_diff; because futzing with the lp_off field in an item that isn't really pointing at storage feels wrong. We might need to do that to PageIndexTupleDelete and/or PageIndexMultiDelete someday, too. I notice that PageIndexDeleteNoCompact, which seems to express what BRIN is expecting in a rather underdocumented way, forces any items without storage into "unused" state. I don't really think it's bufpage.c's place to do that, though. Do you think that code is actually supposed to fire, or is it just there for lack of a better idea? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Tom Lane wrote: > Hey Alvaro, can you comment on this bit in the proposed patch? > > + for (i = FirstOffsetNumber; i <= itemcount; i++) > + { > + ItemId ii = PageGetItemId(phdr, i); > + > + /* Normally here was following assertion > + * Assert(ItemIdHasStorage(ii)); > + * This assertion was introduced in PageIndexTupleDelete > + * But if this function will be used from BRIN index > + * this assertion will fail. Thus, here we do not check > that > + * item has the storage. > + */ > + if (ItemIdGetOffset(ii) <= offset) > + ii->lp_off += size_diff; > + } > + } > > That comment seems utterly wrong to me, because both PageIndexTupleDelete > and PageIndexMultiDelete certainly contain assertions that every item on > the page has storage. Are you expecting that any BRIN index items > wouldn't? If they wouldn't, is adjusting their lp_off as if they did > have storage sensible? It is possible in BRIN to have empty intermediate tuples; for example it is possible for lp 1 and 3 to contain index tuples, while lp 2 does not. This is a tricky case to reproduce, but I think this should do it: consider a table with pages_per_range=1. Page 1 is summarized by index tuple 1, page 2 is summarized by itup 2, page 3 is summarized by index tuple 3. On heap page 2 a new tuple is inserted and the summary data is now much larger, such that the summary tuple no longer fits in the index page, so it is moved to a new index page. Then index tuple 2 in the first index page becomes unused. Revmap still points to lp 3, so it can't be moved to lp 2. The lp_offs can be adjusted, as long as the lp itself is not moved from its original position. Now if this loop is concerned only with live lps and does not move lps, then it should be fine to add the assertion. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Hey Alvaro, can you comment on this bit in the proposed patch? + for (i = FirstOffsetNumber; i <= itemcount; i++) + { + ItemId ii = PageGetItemId(phdr, i); + + /* Normally here was following assertion +* Assert(ItemIdHasStorage(ii)); +* This assertion was introduced in PageIndexTupleDelete +* But if this function will be used from BRIN index +* this assertion will fail. Thus, here we do not check that +* item has the storage. +*/ + if (ItemIdGetOffset(ii) <= offset) + ii->lp_off += size_diff; + } + } That comment seems utterly wrong to me, because both PageIndexTupleDelete and PageIndexMultiDelete certainly contain assertions that every item on the page has storage. Are you expecting that any BRIN index items wouldn't? If they wouldn't, is adjusting their lp_off as if they did have storage sensible? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Tom Lane wrote: > Andrew Borodinwrites: > > Thank you for your corrections. > > Here is the patch with suggestions taken into account, except 6th. > > I'll pick this up, unless some other committer is already on it. > > I think that we should buy back the addition of PageIndexTupleOverwrite > to bufpage.c by getting rid of PageIndexDeleteNoCompact, as suggested > earlier by Alvaro. The latter seems a lot messier in concept, and it's > more general than BRIN needs. Yeah, BRIN only wants to remove one item nowadays; that function was written when the BRIN code wanted to remove multiple items in one go. > It also kinda looks like we could get rid of PAI_ALLOW_FAR_OFFSET, > which is a messy kluge in itself. That'd be nice. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Andrew Borodinwrites: > Thank you for your corrections. > Here is the patch with suggestions taken into account, except 6th. I'll pick this up, unless some other committer is already on it. I think that we should buy back the addition of PageIndexTupleOverwrite to bufpage.c by getting rid of PageIndexDeleteNoCompact, as suggested earlier by Alvaro. The latter seems a lot messier in concept, and it's more general than BRIN needs. It also kinda looks like we could get rid of PAI_ALLOW_FAR_OFFSET, which is a messy kluge in itself. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Thank you for your corrections. Here is the patch with suggestions taken into account, except 6th. >6) I'd rather use alignednewsize here. > +ItemIdSetNormal(tupid, offset + size_diff, newsize); This behavior is accroding to ubiquitous PageAddItem. Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-08-16 20:23 GMT+05:00 Anastasia Lubennikova: > 09.08.2016 19:45, Andrew Borodin: >> >> Here is new version of the patch, now it includes recommendations from >> Anastasia Lubennikova. >> >> >> I've investigated anamalous index size decrease. Most probable version >> appeared to be true. >> Cube extension, as some others, use Guttman's polynomial time split >> algorithm. It is known to generate "needle-like" MBBs (MBRs) for >> sorted data due to imbalanced splits (splitting 100 tuples as 98 to >> 2). Due to previous throw-to-the-end behavior of GiST this imbalance >> was further amplificated (most of inserts were going to bigger part >> after split). So GiST inserts were extremely slow for sorted data. >> There is no need to do exact sorting to trigger this behavior. >> This behavior can be fused by implementation of small-m restriction in >> split (AFAIR this is described in original R-tree paper from 84), >> which prescribes to do a split in a parts no smaller than m, where m >> is around 20% of a page capacity (in tuples number). After application >> of this patch performance for sorted data is roughly the same as >> performance for randomized data. > > > Thank you for explanation. Now it's clear to me. I did some more testing and > found nothing special. The declared feature is implemented correctly. > It passes all regression tests and improves performance. > > I still have a few minor nitpicks about the patch style. > You can find a style guide on > https://www.postgresql.org/docs/9.6/static/source.html > > 1) remove extra whitespace in README > > 2) add whitespace: if(ntup == 1) > > 3) fix comments in accordance with coding conventions > > /* In case of single tuple update GiST calls overwrite > * In all other cases function gistplacetopage deletes > * old tuples and place updated at the end > * */ > > > +/* Normally here was following assertion > + * Assert(ItemIdHasStorage(ii)); > + * This assertion was introduced in PageIndexTupleDelete > + * But if this function will be used from BRIN index > + * this assertion will fail. Thus, here we do not check that > + * item has the storage. > + * */ > > 4) remove unrelated changes > -data += sizeof(OffsetNumber) * xldata->ntodelete; > +data += sizeof(OffsetNumber) *xldata->ntodelete; > > 5) If the comment is correct, maxalignment is not necessary. > +/* tuples on a page are always maxaligned */ > +oldsize = MAXALIGN(oldsize); > > 6) I'd rather use alignednewsize here. > +ItemIdSetNormal(tupid, offset + size_diff, newsize); > > > After the cleanup you can change status to "Ready for Committer" > without waiting for the response. > >> If data is randomized this effect of Guttman poly-time split is not in >> effect; test from the start of the thread will show no statistically >> confident improvement by the patch. >> To prove performance increase in randomized case I've composed >> modified test https://gist.github.com/x4m/856b2e1a5db745f8265c76b9c195f2e1 >> This test with five runs shows following: >> Before patch >> Insert Time AVG 78 seconds STD 9.5 >> Afer patch >> Insert Time AVG 68 seconds STD 7.7 >> This is marginal but statistically viable performance improvement. >> >> For sorted data performance is improved by a factor of 3. >> Best regards, Andrey Borodin, Octonica & Ural Federal University. >> > > -- > Anastasia Lubennikova > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company > PageIndexTupleOverwrite v8.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
09.08.2016 19:45, Andrew Borodin: Here is new version of the patch, now it includes recommendations from Anastasia Lubennikova. I've investigated anamalous index size decrease. Most probable version appeared to be true. Cube extension, as some others, use Guttman's polynomial time split algorithm. It is known to generate "needle-like" MBBs (MBRs) for sorted data due to imbalanced splits (splitting 100 tuples as 98 to 2). Due to previous throw-to-the-end behavior of GiST this imbalance was further amplificated (most of inserts were going to bigger part after split). So GiST inserts were extremely slow for sorted data. There is no need to do exact sorting to trigger this behavior. This behavior can be fused by implementation of small-m restriction in split (AFAIR this is described in original R-tree paper from 84), which prescribes to do a split in a parts no smaller than m, where m is around 20% of a page capacity (in tuples number). After application of this patch performance for sorted data is roughly the same as performance for randomized data. Thank you for explanation. Now it's clear to me. I did some more testing and found nothing special. The declared feature is implemented correctly. It passes all regression tests and improves performance. I still have a few minor nitpicks about the patch style. You can find a style guide on https://www.postgresql.org/docs/9.6/static/source.html 1) remove extra whitespace in README 2) add whitespace: if(ntup == 1) 3) fix comments in accordance with coding conventions /* In case of single tuple update GiST calls overwrite * In all other cases function gistplacetopage deletes * old tuples and place updated at the end * */ +/* Normally here was following assertion + * Assert(ItemIdHasStorage(ii)); + * This assertion was introduced in PageIndexTupleDelete + * But if this function will be used from BRIN index + * this assertion will fail. Thus, here we do not check that + * item has the storage. + * */ 4) remove unrelated changes -data += sizeof(OffsetNumber) * xldata->ntodelete; +data += sizeof(OffsetNumber) *xldata->ntodelete; 5) If the comment is correct, maxalignment is not necessary. +/* tuples on a page are always maxaligned */ +oldsize = MAXALIGN(oldsize); 6) I'd rather use alignednewsize here. +ItemIdSetNormal(tupid, offset + size_diff, newsize); After the cleanup you can change status to "Ready for Committer" without waiting for the response. If data is randomized this effect of Guttman poly-time split is not in effect; test from the start of the thread will show no statistically confident improvement by the patch. To prove performance increase in randomized case I've composed modified test https://gist.github.com/x4m/856b2e1a5db745f8265c76b9c195f2e1 This test with five runs shows following: Before patch Insert Time AVG 78 seconds STD 9.5 Afer patch Insert Time AVG 68 seconds STD 7.7 This is marginal but statistically viable performance improvement. For sorted data performance is improved by a factor of 3. Best regards, Andrey Borodin, Octonica & Ural Federal University. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Here is new version of the patch, now it includes recommendations from Anastasia Lubennikova. I've investigated anamalous index size decrease. Most probable version appeared to be true. Cube extension, as some others, use Guttman's polynomial time split algorithm. It is known to generate "needle-like" MBBs (MBRs) for sorted data due to imbalanced splits (splitting 100 tuples as 98 to 2). Due to previous throw-to-the-end behavior of GiST this imbalance was further amplificated (most of inserts were going to bigger part after split). So GiST inserts were extremely slow for sorted data. There is no need to do exact sorting to trigger this behavior. This behavior can be fused by implementation of small-m restriction in split (AFAIR this is described in original R-tree paper from 84), which prescribes to do a split in a parts no smaller than m, where m is around 20% of a page capacity (in tuples number). After application of this patch performance for sorted data is roughly the same as performance for randomized data. If data is randomized this effect of Guttman poly-time split is not in effect; test from the start of the thread will show no statistically confident improvement by the patch. To prove performance increase in randomized case I've composed modified test https://gist.github.com/x4m/856b2e1a5db745f8265c76b9c195f2e1 This test with five runs shows following: Before patch Insert Time AVG 78 seconds STD 9.5 Afer patch Insert Time AVG 68 seconds STD 7.7 This is marginal but statistically viable performance improvement. For sorted data performance is improved by a factor of 3. Best regards, Andrey Borodin, Octonica & Ural Federal University. PageIndexTupleOverwrite v7.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
06.08.2016 19:51, Andrew Borodin: Anastasia , thank you for your attentive code examine. 2016-08-05 21:19 GMT+05:00 Anastasia Lubennikova: First of all, shouldn't we use MAXALIGN(oldsize) instead of oldsize? Although, I'm quite sure that it was already aligned somewhere before. I doubt that the check (size_diff != MAXALIGN(size_diff)) is necessary. I'd rather add the check: (offset+size_diff < pd_lower). Actually, that's a tricky question. There may be different code expectations. 1. If we expect that not-maxaligned tuple may be placed by any other routine, we should remove check (size_diff != MAXALIGN(size_diff)), since it will fail for not-maxaligned tuple. 2. If we expect that noone should use PageIndexTupleOverwrite with not-maxaligned tuples, than current checks are OK: we will break execution if we find non-maxaligned tuples. With an almost correct message. I suggest that this check may be debug-only assertion: in a production code routine will work with not-maxaligned tuples if they already reside on the page, in a dev build it will inform dev that something is going wrong. Is this behavior Postgres-style compliant? I agree that pd_lower check makes sense. Pointing out to this comparison I worried about possible call of MAXALIGN for negative size_diff value. Although I don't sure if it can cause any problem. Tuples on a page are always maxaligned. But, as far as I recall, itemid->lp_len can contain non-maxaligned value. So, I'd suggest you to perform MAXALIGN(oldsize) before computing size_diff: size_diff = oldsize - alignednewsize; Since both arguments are maxaligned the check of size_diff is not needed. BTW, I'm very surprised that it improves performance so much. And also size is reduced significantly. 89MB against 289MB without patch. Could you explain in details, why does it happen? Size reduction is unexpected for me. There might be 4 plausible explanations. I'll list them ordered by descending of probability: 1. Before this patch every update was throwing recently updated tuple to the end of a page. Thus, in some-how ordered data, recent best-fit will be discovered last. This can cause increase of MBB's overlap in spatial index and slightly higher tree. But 3 times size decrease is unlikely. How did you obtained those results? Can I look at a test case? Nothing special, I've just run the test you provided in this thread. And check index size via select pg_size_pretty(pg_relation_size('idx')); 2. Bug in PageIndexTupleDelete causing unused space emersion. I've searched for it, unsuccessfully. 3. Bug in PageIndexTupleOVerwrite. I cannot imagine nature of such a bug. May be we are not placing something not very important and big on a page? 4. Magic. I do not see what one should do with the R-tree to change it's size by a factor of 3. First three explanations are not better that forth, actually. Those 89 MB, they do not include WAL, right? -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Anastasia , thank you for your attentive code examine. 2016-08-05 21:19 GMT+05:00 Anastasia Lubennikova: > First of all, shouldn't we use MAXALIGN(oldsize) instead of oldsize? > Although, I'm quite sure that it was already aligned somewhere before. > I doubt that the check (size_diff != MAXALIGN(size_diff)) is necessary. > I'd rather add the check: (offset+size_diff < pd_lower). Actually, that's a tricky question. There may be different code expectations. 1. If we expect that not-maxaligned tuple may be placed by any other routine, we should remove check (size_diff != MAXALIGN(size_diff)), since it will fail for not-maxaligned tuple. 2. If we expect that noone should use PageIndexTupleOverwrite with not-maxaligned tuples, than current checks are OK: we will break execution if we find non-maxaligned tuples. With an almost correct message. I suggest that this check may be debug-only assertion: in a production code routine will work with not-maxaligned tuples if they already reside on the page, in a dev build it will inform dev that something is going wrong. Is this behavior Postgres-style compliant? I agree that pd_lower check makes sense. > BTW, I'm very surprised that it improves performance so much. > And also size is reduced significantly. 89MB against 289MB without patch. > Could you explain in details, why does it happen? Size reduction is unexpected for me. There might be 4 plausible explanations. I'll list them ordered by descending of probability: 1. Before this patch every update was throwing recently updated tuple to the end of a page. Thus, in some-how ordered data, recent best-fit will be discovered last. This can cause increase of MBB's overlap in spatial index and slightly higher tree. But 3 times size decrease is unlikely. How did you obtained those results? Can I look at a test case? 2. Bug in PageIndexTupleDelete causing unused space emersion. I've searched for it, unsuccessfully. 3. Bug in PageIndexTupleOVerwrite. I cannot imagine nature of such a bug. May be we are not placing something not very important and big on a page? 4. Magic. I do not see what one should do with the R-tree to change it's size by a factor of 3. First three explanations are not better that forth, actually. Those 89 MB, they do not include WAL, right? Thank you for the review. Best regards, Andrey Borodin, Octonica & Ural Federal University. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
30.07.2016 14:00, Andrew Borodin: Here is BRIN-compatible version of patch. Now function PageIndexTupleOverwrite consumes tuple size as a parameter, this behavior is coherent with PageAddItem. Also, i had to remove asserstion that item has a storage in the loop that adjusts line pointers. It would be great if someone could check that place (commented Assert(ItemIdHasStorage(ii)); in PageIndexTupleOverwrite). I suspect that there might be a case when linp's should not be adjusted. Hi, I reviewed the code and I have couple of questions about following code: /* may have negative size here if new tuple is larger */ size_diff = oldsize - alignednewsize; offset = ItemIdGetOffset(tupid); if (offset < phdr->pd_upper || (offset + size_diff) > phdr->pd_special || offset != MAXALIGN(offset) || size_diff != MAXALIGN(size_diff)) ereport(ERROR, (errcode(ERRCODE_DATA_CORRUPTED), errmsg("corrupted item offset: offset = %u, size = %u", offset, (unsigned int) size_diff))); First of all, shouldn't we use MAXALIGN(oldsize) instead of oldsize? Although, I'm quite sure that it was already aligned somewhere before. I doubt that the check (size_diff != MAXALIGN(size_diff)) is necessary. I'd rather add the check: (offset+size_diff < pd_lower). Besides that moment, the patch seems pretty good. BTW, I'm very surprised that it improves performance so much. And also size is reduced significantly. 89MB against 289MB without patch. Could you explain in details, why does it happen? -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Here is BRIN-compatible version of patch. Now function PageIndexTupleOverwrite consumes tuple size as a parameter, this behavior is coherent with PageAddItem. Also, i had to remove asserstion that item has a storage in the loop that adjusts line pointers. It would be great if someone could check that place (commented Assert(ItemIdHasStorage(ii)); in PageIndexTupleOverwrite). I suspect that there might be a case when linp's should not be adjusted. Best regards, Andrey Borodin, Octonica & Ural Federal University. PageIndexTupleOverwrite v6.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
>Can you please patch BRIN to use this new function too? On my machine replacement of both BRIN update cases (see https://github.com/x4m/pggistopt/commit/a6d301ff79104b977619339d53aebf748045418a ) showed no performance changes on folowing update query (6 seconds of updates avg): create table dataTable(x int, y int); insert into dataTable(x,y) select x,y from generate_series(1,1e3,1) x,generate_series(1,1e3,1) y; create index idx on dataTable using brin(x,y); update datatable set x = random()*1024, y = random()*1024; https://gist.github.com/x4m/7e69fd924b9ffd2fdc9c6100e741171d Probably I was looking in a wrong place. I do not see other cases when PageIndexTupleOverwrite can improve performance of BRIN. Though I'll make PageIndexTupleOverwrite BRIN-compatible in forthcoming patch version: BRIN tuples have no length in header and it must be taken as a parameter. Just as the PageAddItem do. Best regards, Andrey Borodin, Octonica & Ural Federal University. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
>Can you please patch BRIN to use this new function too? AFAIK Sergey Mirvoda was going to implement and test it. It is easy to do this optimization for brin_samepage_update (see patch draft in attachment, make check passes), but WAL of regular BRIN update is not clear for me. Best regards, Andrey Borodin, Octonica & Ural Federal University. Brin_PageIndexTupleOverwrite.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Andrew Borodin wrote: > >If a feature changes the shape of WAL records, XLOG_PAGE_MAGIC is bumped to > >prevent any problems. > I've attached patch with a bump, but, obviously, it'll be irrelevant > after any other commit changing WAL shape. > > Here is a patch with updated GiST README. > I haven't found apropriate place to describe PageIndexTupleOverwrite > function in docs, since all it's siblings are described only in the > code. > Code comments describing this function are coherent with others > (PageAddItem, PageIndexTupleDelete). Can you please patch BRIN to use this new function too? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Michael Paquierwrites: > On Sun, Jul 24, 2016 at 7:18 PM, Andrew Borodin wrote: >> I've attached patch with a bump, but, obviously, it'll be irrelevant >> after any other commit changing WAL shape. > Usually the committer in charge of reviewing such a patch would bump > it. There is no need for the patch submitter to do so. I should have > been more precise previously, sorry for my twisted words. It's good to remind the committer that such a bump is needed, of course. But yeah, casting the reminder in the form of a hunk of the patch is more likely to cause trouble than be helpful. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
On Sun, Jul 24, 2016 at 7:18 PM, Andrew Borodinwrote: >>If a feature changes the shape of WAL records, XLOG_PAGE_MAGIC is bumped to >>prevent any problems. > > I've attached patch with a bump, but, obviously, it'll be irrelevant > after any other commit changing WAL shape. Usually the committer in charge of reviewing such a patch would bump it. There is no need for the patch submitter to do so. I should have been more precise previously, sorry for my twisted words. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
>If a feature changes the shape of WAL records, XLOG_PAGE_MAGIC is bumped to >prevent any problems. I've attached patch with a bump, but, obviously, it'll be irrelevant after any other commit changing WAL shape. Here is a patch with updated GiST README. I haven't found apropriate place to describe PageIndexTupleOverwrite function in docs, since all it's siblings are described only in the code. Code comments describing this function are coherent with others (PageAddItem, PageIndexTupleDelete). Best regards, Andrey Borodin, Octonica & Ural Federal University. XLog-magic-bump.patch Description: Binary data PageIndexTupleOverwrite v5.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
On Fri, Jul 8, 2016 at 2:00 PM, Andrew Borodinwrote: (Please top-post on threads, this is annoying) > One more thing came to my mind: > WAL records done by the GiST before patch will corrupt GiST after > patch if replayed. > Is it a problem? May be it's prevented on some higher level? If a feature changes the shape of WAL records, XLOG_PAGE_MAGIC is bumped to prevent any problems. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
One more thing came to my mind: WAL records done by the GiST before patch will corrupt GiST after patch if replayed. Is it a problem? May be it's prevented on some higher level? Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-06 22:11 GMT+05:00 Andrew Borodin: > Here is a new patch. Updates: > 1. Uses MAXALIGN to allocate space on page > 2. Uses ItemIdSetNormal in case when ItemId was not normal for some > reasons before call > 3. Improved comments and var names > > Best regards, Andrey Borodin, Octonica & Ural Federal University. > > 2016-07-05 17:54 GMT+05:00 Andrew Borodin : >> Here is a patch with updated WAL behavior. >> >> I'm not certain about MAXALIGN for size of appended tuple. Currently >> if anyone calls PageIndexTupleOverwrite() with unalligned tuple it >> will ereport(ERROR). >> I suspect that it should allign size instead. >> >> Also I suspect that PageIndexTupleOverwrite() should make a call to >> ItemIdSetNormal() to pretend it is generic function and not just a >> private part of GiST. >> >> Best regards, Andrey Borodin, Octonica & Ural Federal University. >> >> 2016-07-04 22:59 GMT+05:00 Andrew Borodin : >>> Thank you, Alvaro. >>> >>> Yes, now I see. I'll update gistRedoPageUpdateRecord() to work >>> accordingly with patched gistplacetopage(). >>> Also, i've noticed that PageAddItem uses MAXALIGN() macro to calculate >>> tuple size. So, PageIndexTupleOverwrite should behave the same. >>> >>> About PageIndexDeleteNoCompact() in BRIN. I do not see why >>> PageIndexDeleteNoCompact() is not a good solution instead of >>> PageIndexTupleOverwrite? Will it mark tuple as unused until next >>> PageAddItem will use it's place? >>> >>> Best regards, Andrey Borodin, Octonica & Ural Federal University. >>> >>> 2016-07-04 22:16 GMT+05:00 Alvaro Herrera : Andrew Borodin wrote: > Thank you, Amit. > > Currently, if WAL reconstruct page in another order it won't be a problem. We require that replay of WAL leads to identical bytes than the original. Among other things, this enables use of a tool that verifies that WAL is working correctly simply by comparing page images. So please fix it instead of just verifying that this works for GiST. By the way, BRIN indexes have a need of this operation too. The current approach is to call PageIndexDeleteNoCompact followed by PageAddItem. I suppose it would be beneficial to use your new routine there too. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Here is a new patch. Updates: 1. Uses MAXALIGN to allocate space on page 2. Uses ItemIdSetNormal in case when ItemId was not normal for some reasons before call 3. Improved comments and var names Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-05 17:54 GMT+05:00 Andrew Borodin: > Here is a patch with updated WAL behavior. > > I'm not certain about MAXALIGN for size of appended tuple. Currently > if anyone calls PageIndexTupleOverwrite() with unalligned tuple it > will ereport(ERROR). > I suspect that it should allign size instead. > > Also I suspect that PageIndexTupleOverwrite() should make a call to > ItemIdSetNormal() to pretend it is generic function and not just a > private part of GiST. > > Best regards, Andrey Borodin, Octonica & Ural Federal University. > > 2016-07-04 22:59 GMT+05:00 Andrew Borodin : >> Thank you, Alvaro. >> >> Yes, now I see. I'll update gistRedoPageUpdateRecord() to work >> accordingly with patched gistplacetopage(). >> Also, i've noticed that PageAddItem uses MAXALIGN() macro to calculate >> tuple size. So, PageIndexTupleOverwrite should behave the same. >> >> About PageIndexDeleteNoCompact() in BRIN. I do not see why >> PageIndexDeleteNoCompact() is not a good solution instead of >> PageIndexTupleOverwrite? Will it mark tuple as unused until next >> PageAddItem will use it's place? >> >> Best regards, Andrey Borodin, Octonica & Ural Federal University. >> >> 2016-07-04 22:16 GMT+05:00 Alvaro Herrera : >>> Andrew Borodin wrote: Thank you, Amit. Currently, if WAL reconstruct page in another order it won't be a problem. >>> >>> We require that replay of WAL leads to identical bytes than the >>> original. Among other things, this enables use of a tool that verifies >>> that WAL is working correctly simply by comparing page images. So >>> please fix it instead of just verifying that this works for GiST. >>> >>> By the way, BRIN indexes have a need of this operation too. The current >>> approach is to call PageIndexDeleteNoCompact followed by PageAddItem. >>> I suppose it would be beneficial to use your new routine there too. >>> >>> -- >>> Álvaro Herrerahttp://www.2ndQuadrant.com/ >>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services PageIndexTupleOverwrite v4.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Here is a patch with updated WAL behavior. I'm not certain about MAXALIGN for size of appended tuple. Currently if anyone calls PageIndexTupleOverwrite() with unalligned tuple it will ereport(ERROR). I suspect that it should allign size instead. Also I suspect that PageIndexTupleOverwrite() should make a call to ItemIdSetNormal() to pretend it is generic function and not just a private part of GiST. Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-04 22:59 GMT+05:00 Andrew Borodin: > Thank you, Alvaro. > > Yes, now I see. I'll update gistRedoPageUpdateRecord() to work > accordingly with patched gistplacetopage(). > Also, i've noticed that PageAddItem uses MAXALIGN() macro to calculate > tuple size. So, PageIndexTupleOverwrite should behave the same. > > About PageIndexDeleteNoCompact() in BRIN. I do not see why > PageIndexDeleteNoCompact() is not a good solution instead of > PageIndexTupleOverwrite? Will it mark tuple as unused until next > PageAddItem will use it's place? > > Best regards, Andrey Borodin, Octonica & Ural Federal University. > > 2016-07-04 22:16 GMT+05:00 Alvaro Herrera : >> Andrew Borodin wrote: >>> Thank you, Amit. >>> >>> Currently, if WAL reconstruct page in another order it won't be a problem. >> >> We require that replay of WAL leads to identical bytes than the >> original. Among other things, this enables use of a tool that verifies >> that WAL is working correctly simply by comparing page images. So >> please fix it instead of just verifying that this works for GiST. >> >> By the way, BRIN indexes have a need of this operation too. The current >> approach is to call PageIndexDeleteNoCompact followed by PageAddItem. >> I suppose it would be beneficial to use your new routine there too. >> >> -- >> Álvaro Herrerahttp://www.2ndQuadrant.com/ >> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services PageIndexTupleOverwrite v3.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Thank you, Alvaro. Yes, now I see. I'll update gistRedoPageUpdateRecord() to work accordingly with patched gistplacetopage(). Also, i've noticed that PageAddItem uses MAXALIGN() macro to calculate tuple size. So, PageIndexTupleOverwrite should behave the same. About PageIndexDeleteNoCompact() in BRIN. I do not see why PageIndexDeleteNoCompact() is not a good solution instead of PageIndexTupleOverwrite? Will it mark tuple as unused until next PageAddItem will use it's place? Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-04 22:16 GMT+05:00 Alvaro Herrera: > Andrew Borodin wrote: >> Thank you, Amit. >> >> Currently, if WAL reconstruct page in another order it won't be a problem. > > We require that replay of WAL leads to identical bytes than the > original. Among other things, this enables use of a tool that verifies > that WAL is working correctly simply by comparing page images. So > please fix it instead of just verifying that this works for GiST. > > By the way, BRIN indexes have a need of this operation too. The current > approach is to call PageIndexDeleteNoCompact followed by PageAddItem. > I suppose it would be beneficial to use your new routine there too. > > -- > Álvaro Herrerahttp://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Andrew Borodin wrote: > Thank you, Amit. > > Currently, if WAL reconstruct page in another order it won't be a problem. We require that replay of WAL leads to identical bytes than the original. Among other things, this enables use of a tool that verifies that WAL is working correctly simply by comparing page images. So please fix it instead of just verifying that this works for GiST. By the way, BRIN indexes have a need of this operation too. The current approach is to call PageIndexDeleteNoCompact followed by PageAddItem. I suppose it would be beneficial to use your new routine there too. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
Thank you, Amit. Currently, if WAL reconstruct page in another order it won't be a problem. How can I check that it works? Would it be sufficient if I kill 9 backend and postmaster after commit and it will start normally executing select queries after restart? I'll make a patch with missing spaces later. I also found some more missing spaces in PageIndexTupleOverwrite call and typo in comments ("on-place"). Best regards, Andrey Borodin, Octonica & Ural Federal University. 2016-07-04 18:58 GMT+05:00 Amit Kapila: > On Mon, Jul 4, 2016 at 4:52 PM, Andrew Borodin > wrote: > > Here is a patch with corrections from Alexander Korotkov. > > My tests, check and check-world show no problems against Ubuntu LTS 14 > x64 VM. > > > > > > - PageIndexTupleDelete(page, oldoffnum); > - gistfillbuffer(page, itup, ntup, InvalidOffsetNumber); > + { > + /*if we have just one tuple to update we replace it on-place on page*/ > + if(ntup == 1) > + { > + PageIndexTupleOverwrite(page,oldoffnum,*itup); > + } > + else > + { > + /*this corner case is here to support mix calls case (see comment > above)*/ > + PageIndexTupleDelete(page, oldoffnum); > + gistfillbuffer(page, itup, ntup, InvalidOffsetNumber); > + } > > > Here, you have changed the way tuple is added on a page, but still the > WAL record is same as before. I think WAL replay should perform the > action in a same way as we have done when original operation is > performed. If not, then it can change the organization of page which > might lead to failure in replay of consecutive WAL records. > > + /*if we have just one tuple to update we replace it on-place on page*/ > > In comments, we always leave a space both in the beginning and at the > end of a comment. See comments in nearby code. > > > -- > With Regards, > Amit Kapila. > EnterpriseDB: http://www.enterprisedb.com >
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
On Mon, Jul 4, 2016 at 4:52 PM, Andrew Borodinwrote: > Here is a patch with corrections from Alexander Korotkov. > My tests, check and check-world show no problems against Ubuntu LTS 14 x64 VM. > > - PageIndexTupleDelete(page, oldoffnum); - gistfillbuffer(page, itup, ntup, InvalidOffsetNumber); + { + /*if we have just one tuple to update we replace it on-place on page*/ + if(ntup == 1) + { + PageIndexTupleOverwrite(page,oldoffnum,*itup); + } + else + { + /*this corner case is here to support mix calls case (see comment above)*/ + PageIndexTupleDelete(page, oldoffnum); + gistfillbuffer(page, itup, ntup, InvalidOffsetNumber); + } Here, you have changed the way tuple is added on a page, but still the WAL record is same as before. I think WAL replay should perform the action in a same way as we have done when original operation is performed. If not, then it can change the organization of page which might lead to failure in replay of consecutive WAL records. + /*if we have just one tuple to update we replace it on-place on page*/ In comments, we always leave a space both in the beginning and at the end of a comment. See comments in nearby code. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers