On Tue, 2008-02-26 at 15:19 +0900, ITAGAKI Takahiro wrote:
> Simon Riggs <[EMAIL PROTECTED]> wrote:
> 
> > One of the reasons why I hadn't wanted to pursue earlier ideas to use
> > LOCK was that applying a lock will prevent running in parallel, which
> > ultimately may prevent further performance gains.
> > 
> > Is there a way of doing this that will allow multiple concurrent COPYs?
> 
> I think there is same difficulty as parallel queries. It requires tighter
> communication among COPY threads whether we will use multi-process model
> or multi-thread model.
> 
> We have independent concurrent COPYs now; COPYs are not aware of each
> other because no intermediate status during COPY. However, COPY will
> have "phases" if we use bulkbuild. Therefore, we will need joining
> COPY threads and passing each working memories between threads.

The use case for this seems to be
* an existing table - since we have indexes
* large indexes - larger than shared_buffers since we are worried about
the effects of index inserts causing random I/O
* a table that can be LOCKed for periods of time
* load performance is critical

I'm worried that the last two items are often mutually exclusive for
many people, making this look like a very narrow use case. My experience
at Teradata with complicated load mechanisms is that they take a long
time to write, are bug-prone and have serious restrictions on how and
when we can use them. Even very large data warehouses frequently use a
trickle loader or "normal SQL" mechanism because the business
requirement for immediate access to data is just as high as the need for
load speed.

Faster loading is a requirement for most people however. (Dimitri
Fontaine is working on parallel COPY statements from pgloader).

So I feel we must try very hard to avoid the LOCK.

The LOCK is only required because we defer the inserts into unique
indexes, yes? Perhaps we might get good performance by making the
inserts into the unique index immediately, but deferring the insert of
other indexes?

Unique index inserts tend to go into the rightmost blocks, which are
almost always in memory as a result. So all the random I/O is caused by
non-unique indexes.

I very much like the idea of index merging, or put another way: batch
index inserts. How big do the batch of index inserts have to be for us
to gain benefit from this technique? Would it be possible to just buffer
the index inserts inside the indexam module so that we perform a batch
of index inserts every N rows? Maybe use work_mem? Or specify a batch
size as a parameter on COPY? Do we really need to pass data between COPY
sessions to gain maximum benefit? Feels like there is a way that is
faster in many cases but simpler than the fully parallel route
described.

The SQL Standard allows us to define a table as GENERATED ALWAYS AS
IDENTITY, so in that case we would be able to defer unique index inserts
also, since we know they are already unique.

Unique index values arriving from outside the database might be able to
be checked against each other as a batch first before adding to the
index. For example, deferring unique index checks until we have filled a
whole block, then checking that all values on the block are unique with
respect to each other and then inserting into the unique index.

Some other aspects to this, slightly OT

* variable load speed - it is often a requirement to have the data load
slow down at busy times and speed up again when less busy. Maybe
automatically, but definitely manually. If we avoid locks and large
batches then we should be able to do this eventually.

* driving UPDATEs and DELETEs - we often want to perform more than just
INSERTs. If we over-specialise code for INSERTing data then we may miss
out on opportunities to improve the overall data maintenance workload.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

Reply via email to