> On 19 Sep 2017, at 15:47, Job <j...@colliniconsulting.it> wrote:
> 
> Hi guys,
> 
> we need to insert a large amount of data (about 10-20 million rows) from one 
> table into another (PostgreSQL 9.6.1) without locking the destination table.
> pg_bulkload is the fastest way, but it locks the table.
> 
> Are there other ways?
> Classic "COPY" from?

We do something like that using a staging table that we load into initially 
(although not in bulk; data arrives in our staging table in batches of 5k to 
100k rows) and then transfer the data using insert/select with "on conflict do".

That data transfer within PG takes a couple of minutes on our rather limited VM 
for a wide 37M-row table (~37GB on disk). It only locks the staging table 
(during the initial bulk load) and the rows in the master table that are 
currently being altered (during the insert/select).
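
Roughly, the transfer statement looks like this (table and column names are 
made up here, and the conflict action depends on what your key is):

INSERT INTO master (id, name, created_at)
SELECT id, name, created_at
FROM   staging
ON CONFLICT (id) DO UPDATE
   SET name       = EXCLUDED.name,
       created_at = EXCLUDED.created_at;

-- empty the staging table for the next batch
TRUNCATE staging;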

If your data-source is a file in a format supported by COPY, then you can use 
COPY to do the initial bulk load into the staging table.
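
Something along these lines (assuming a CSV file with a header line; the table, 
path and options are just placeholders):

COPY staging (id, name, created_at)
FROM '/path/to/data.csv'
WITH (FORMAT csv, HEADER true);

-- or, run from psql so the file is read on the client side:
-- \copy staging FROM 'data.csv' WITH (FORMAT csv, HEADER true)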

Some benefits of this 2-stage approach are that it leaves room to manipulate 
the data (type conversions, for example) and that it can handle the scenario 
where a matching target record in the master table already exists. In our case, 
we convert character fields to varchar (which saves a lot of space(s)).
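
To sketch that (made-up columns again), the conversion is just a cast in the 
select list of the transfer statement:

INSERT INTO master (id, description)
SELECT id,
       description::varchar  -- the cast should drop the char(n) trailing padding
FROM   staging
ON CONFLICT (id) DO NOTHING;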

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.


