Re: New Table Access Methods for Multi and Single Inserts

Luc Vlaming Sun, 03 Jan 2021 23:59:24 -0800

On 28-12-2020 13:48, Bharath Rupireddy wrote:

On Fri, Dec 25, 2020 at 8:10 AM Justin Pryzby <pry...@telsasoft.com> wrote:

On Thu, Dec 24, 2020 at 05:48:42AM +0530, Bharath Rupireddy wrote:

I'm not posting the updated 0002 to 0004 patches, I plan to do so
after a couple of reviews happen on the design of the APIs in 0001.


Thoughts?


Are you familiar with this work ?

https://commitfest.postgresql.org/31/2717/
Reloptions for table access methods

It seems like that can be relevant for your patch, and I think some of what
your patch needs might be provided by AM opts.

It's difficult to generalize AMs when we have only one, but your use-case might
be a concrete example which would help to answer some questions on the other
thread.

@Jeff: https://commitfest.postgresql.org/31/2871/


Note that I have not gone through the entire thread at [1]. On some
initial study, that patch is proposing to allow different table AMs to
have custom rel options.

In the v2 patch that I sent upthread [2] for new table AMs has heap AM
multi insert code moved inside the new heap AM implementation and I
don't see any need of having rel options. In case, any other AMs want
to have the control for their multi insert API implementation via rel
options, I think the proposal at [1] can be useful.


Thoughts?

[1] - https://commitfest.postgresql.org/31/2717/
[2] - 
https://www.postgresql.org/message-id/CALj2ACWMnZZCu%3DG0PJkEeYYicKeuJ-X%3DSU19i6vQ1%2B%3DuXz8u0Q%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Hi,

> IIUC, there's no dependency or anything as such for the new table AM
> patch with the rel options thread [1]. If I'm right, can this new
> table AM patch [2] be reviewed further?

To me this seems good enough. Reason is that I anticipate that therewould not necessarily be per-table options for now but rather globaloptions, if any. Moreover, if we want to make these kind of tradeoffsuser-controllable I would argue this should be done in a differentpatch-set either way. Reason is that there are parameters in heapalready that are computed / hardcoded as well (see e.g.RelationAddExtraBlocks).


===

As to the patches themselves:

I think the API is a huge step forward! I assume that we want to have asingle-insert API like heap_insert_v2 so that we can encode theknowledge that there will just be a single insert coming and likely acommit afterwards?

Reason I'm asking is that I quite liked the heap_insert_begin parameteris_multi, which could even be turned into a "expected_rowcount" of theamount of rows expected to be commited in the transaction (e.g. single,several, thousands/stream).If we were to make the API based on expected rowcounts, the wholeheap_insert_v2, heap_insert and heap_multi_insert could be turned into asingle function heap_insert, as the knowledge about buffering of theslots is then already stored in the TableInsertState, creating an API like:


// expectedRows: -1 = streaming, otherwise expected rowcount.

TableInsertState* heap_insert_begin(Relation rel, CommandId cid, intoptions, int expectedRows);

heap_insert(TableInsertState *state, TupleTableSlot *slot);

Do you think that's a good idea?

Two smaller things I'm wondering:

- the clear_mi_slots; why is this not in the HeapMultiInsertState? theslots themselves are declared there? also, the boolean themselves issomewhat problematic I think because it would only work if you specifiedis_multi=true which would depend on the actual tableam to implement thisthen in a way that copy/ctas/etc can also use the slot properly, which Ithink would severely limit their freedom to store the slots moreefficiently? Also, why do we want to do ExecClearTuple() anyway? Isn'tit good enough that the next call to ExecCopySlot will effectively clearit out?- flushed -> why is this a stored boolean? isn't this indirectly encodedby cur_slots/cur_size == 0?

For patches 02-04 I quickly skimmed through them as I assume we firstwant the API agreed upon. Generally they look nice and like a big stepforward. What I'm just wondering about is the usage of theimplementation details like mistate->slots[X]. It makes a lot of senseto do so but also makes for a difficult compromise, because now thetableam has to guarantee a copy of the slot, and hopefully even one in asomewhat efficient form.


Kind regards,
Luc

Re: New Table Access Methods for Multi and Single Inserts

Reply via email to