On Mon, Nov 02, 2020 at 07:53:45AM +0100, Luc Vlaming wrote:
> On 30.10.20 05:51, Justin Pryzby wrote:
> > On Thu, Oct 22, 2020 at 01:29:53PM +0100, Simon Riggs wrote:
> > > On Fri, 16 Oct 2020 at 22:05, Justin Pryzby <pry...@telsasoft.com> wrote:
> > > 
> > > > > > I made this conditional on BEGIN BULK/SET bulk, so I'll solicit 
> > > > > > comments on that.
> > > 
> > > I think it would be better if this was self-tuning, so that we don't
> > > allocate a bulkinsert state until we've inserted, say, 100 (?) rows.
> > 
> > I made it an optional, non-default behavior in response to the legitimate
> > concern about a performance regression for the cases where a loader needs
> > to be as fast as possible - as compared with our case, where we instead
> > want to optimize for our reports by making the loaders responsible for
> > their own writes, rather than leaving behind many dirty pages and
> > clobbering the cache, too.
> > 
> > Also, INSERT SELECT doesn't immediately help us (telsasoft), since we use
> > INSERT .. VALUES () .. ON CONFLICT.  This would handle that case, which is
> > great, even though that wasn't a design goal.  It could also be an integer
> > GUC to allow configuring the size of the ring buffer.
> > 
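A minimal sketch of the insert shape described above (the table and column
names here are hypothetical):

    -- hypothetical schema; ON CONFLICT needs a unique constraint
    -- on (site_id, metric)
    INSERT INTO stats (site_id, metric, value)
    VALUES (1, 'cpu', 0.7)
    ON CONFLICT (site_id, metric)
        DO UPDATE SET value = EXCLUDED.value;

Each such statement inserts a single row, so an INSERT SELECT-only
optimization doesn't help it, but a bulk-insert ring buffer would.
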
> > > You should also use table_multi_insert() since that will give further
> > > performance gains by reducing block access overheads. Switching from
> > > single row to multi-row should also only happen once we've loaded a
> > > few rows, so we don't introduce overheads for smaller SQL statements.
> > 
> > Good idea...multi_insert (which reduces the overhead of individual inserts)
> > is mostly independent from BulkInsert state (which uses a ring buffer to
> > avoid dirtying the cache).  I made this 0002.
> > 
> > This makes INSERT SELECT several times faster, and it avoids clobbering
> > the cache, too.
> > 
> > Time: 4700.606 ms (00:04.701)
> >     123 |          1
> >      37 |          2
> >      20 |          3
> >      11 |          4
> >    4537 |          5
> >   11656 |
> > 
> > Time: 1125.302 ms (00:01.125)
> >    2171 |          1
> >      37 |          2
> >      20 |          3
> >      11 |          4
> >     111 |          5
> >   14034 |
> > 
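The two-column output above reads like a shared-buffer usage-count
histogram.  Assuming it came from the pg_buffercache extension, a query
along these lines would produce it (the rows with an empty second column
are buffers with a NULL usagecount, i.e. not in use):

    -- count of shared buffers at each usage count
    SELECT count(*), usagecount
    FROM pg_buffercache
    GROUP BY usagecount
    ORDER BY usagecount;

On that reading, the second run leaving far fewer usagecount-5 buffers
behind is the "not clobbering the cache" part.
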
> > When enabled, this passes nearly all regression tests, and all but 2 of the
> > changes are easily understood.  The 2nd patch still needs work.
> > 
> 
> Hi,
> 
> Came across this thread because I'm working on an improvement to relation
> extension, to improve the speed of the bulk insert itself in (highly)
> parallel cases, and would like to make sure that our approaches work nicely

Thanks for looking.

Since this is a GUC, I thought it would accommodate users optimizing for
either inserts or selects, as well as users who don't want to change their
application (they can "ALTER SYSTEM SET bulk_insert=on").  I'm not thrilled
about adding a new GUC, but that seems to be required for "BEGIN BULK", which
was the obvious way to make it an opt-in feature.
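
For concreteness, the opt-in described here looks roughly like this (a
sketch only; the exact syntax and scoping are whatever the patch
implements):

    -- per-session opt-in via the GUC
    SET bulk_insert = on;

    -- or per-transaction, via the new statement
    BEGIN BULK;
    -- ... inserts ...
    COMMIT;

    -- or system-wide, without changing the application
    ALTER SYSTEM SET bulk_insert = on;
    SELECT pg_reload_conf();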

I guess it'd be easy to add a counter to ModifyTableState, although it makes
the code a bit less clean, and it conceivably performs "discontinuously" -
inserting 100 rows/sec for the first 999 rows and then 200 rows/sec afterwards.

If you "mix" small inserts and big inserts, it would be a bad strategy to
optimize for the small ones.  Anyway, in a quick test, small inserts were not
slower.
https://www.postgresql.org/message-id/20200713015700.GA23581%40telsasoft.com

Do you have an example that regresses with bulk insert?

The two patches are separate, and it's possible they should be enabled
differently or independently.

-- 
Justin

> Given what I've seen and tried so far with various benchmarks, I would also
> really like to see a different approach here. The "BEGIN BULK" approach can
> be problematic, for example, if you mix small and large numbers of inserts
> in the same transaction, or if your application sometimes does a bulk insert
> but otherwise runs mostly OLTP transactions.

> To me, the idea from Simon sounds good: only use a bulk insert state after
> inserting, say, 1000 rows. This also seems more applicable to most
> applications, compared to requiring a change to any application that wishes
> to have faster ingest.
> 
> Another approach could be to combine this with, for example, a few extra
> requirements to limit the risk of regressions, and first learn more about
> how this behaves in the field.
> We could, for example (just throwing out some ideas), require that:
> - the relation has a certain minimum size
> - a BufferStrategy of at most a certain size is used
> - there is a certain number of lock waiters on the relation extension lock
> (like we do with bulk extend)
> - we have extended the relation by at least e.g. 4 MB without using the FSM,
> thereby proving that we are doing bulk operations instead of random small
> extensions into the relation that use the FSM.
> 
> Another option is to first improve the bulk operation facilities in general
> and then have another shot at this? I'm not sure if there is a benchmark /
> query that shows where such a 10x slowdown would appear, but maybe that
> would be worth a look as well.
> 
> Regards,
> Luc

