On Sun, Mar 12, 2017 at 8:06 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Sat, Mar 11, 2017 at 12:20 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: >>> /* >>> + * Change the shared buffer state in critical section, >>> + * otherwise any error could make it unrecoverable after >>> + * recovery. >>> + */ >>> + START_CRIT_SECTION(); >>> + >>> + /* >>> * Insert tuple on new page, using _hash_pgaddtup to ensure >>> * correct ordering by hashkey. This is a tad inefficient >>> * since we may have to shuffle itempointers repeatedly. >>> * Possible future improvement: accumulate all the items >>> for >>> * the new page and qsort them before insertion. >>> */ >>> (void) _hash_pgaddtup(rel, nbuf, itemsz, new_itup); >>> >>> + END_CRIT_SECTION(); >>> >>> No way. You have to start the critical section before making any page >>> modifications and keep it alive until all changes have been logged. >>> >> >> I think what we need to do here is to accumulate all the tuples that >> need to be added to new bucket page till either that page has no more >> space or there are no more tuples remaining in an old bucket. Then in >> a critical section, add them to the page using _hash_pgaddmultitup and >> log the entire new bucket page contents as is currently done in patch >> log_split_page(). > > I agree. >
Okay, I have changed like that in the patch. >> Now, here we can choose to log the individual >> tuples as well instead of a complete page, however not sure if there >> is any benefit for doing the same because XLogRecordAssemble() will >> anyway remove the empty space from the page. Let me know if you have >> something else in mind. > > Well, if you have two pages that are 75% full, and you move a third of > the tuples from one of them into the other, it's going to be about > four times more efficient to log only the moved tuples than the whole > page. > I don't see how this could happen during split? I mean if you are moving 25% tuples from old bucket page to a new bucket page which is 75% full, it will log the new bucket page only after it gets full. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers