On 05/03/2022 00:03, Melanie Plageman wrote:
On Wed, Mar 2, 2022 at 8:09 PM Justin Pryzby <pry...@telsasoft.com> wrote:

Rebased to appease cfbot.

I ran these paches under a branch which shows code coverage in cirrus.  It
looks good to my eyes.
https://api.cirrus-ci.com/v1/artifact/task/5212346552418304/coverage/coverage/00-index.html

thanks for doing that and for the rebase! since I made updates, the
attached version 6 is also rebased.

I'm surprised by all the changes in nbtsort.c to choose between using the buffer manager or the new API. I would've expected the new API to abstract that away. Otherwise we need to copy that logic to all the index AMs.

I'd suggest an API along the lines of:

/*
 * Start building a relation in bulk.
 *
 * If the relation is going to be small, we will use the buffer manager,
 * but if it's going to be large, this will skip the buffer manager and
 * write the pages directly to disk.
 */
bulk_init(SmgrRelation smgr, BlockNumber estimated_size)

/*
 * Extend relation by one page
 */
bulk_extend(SmgrRelation, BlockNumber, Page)

/*
 * Finish building the relation
 *
 * This will fsync() the data to disk, if required.
 */
bulk_finish()


Behind this interface, you can encapsulate the logic to choose whether to use the buffer manager or not. And I think bulk_extend() could do the WAL-logging too.

Or you could make the interface look more like the buffer manager:

/* as above */
bulk_init(SmgrRelation smgr, BlockNumber estimated_size)
bulk_finish()

/*
 * Get a buffer for writing out a page.
 *
 * The contents of the buffer are uninitialized. The caller
 * must fill it in before releasing it.
 */
BulkBuffer
bulk_getbuf(SmgrRelation smgr, BlockNumber blkno)

/*
 * Release buffer. It will be WAL-logged and written out to disk.
 * Not necessarily immediately, but at bulk_finish() at latest.
 *
 * NOTE: There is no way to read back the page after you release
 * it, until you finish the build with bulk_finish().
 */
void
bulk_releasebuf(SmgrRelation smgr, BulkBuffer buf)


With this interface, you could also batch multiple pages and WAL-log them all in one WAL record with log_newpage_range(), for example.

- Heikki


Reply via email to