Julien, I have the following setting for WAL level: #wal_level = minimal (which defaults to minimal anyway)
On Tue, Jul 14, 2015 at 6:19 PM Julien Rouhaud <julien.rouh...@dalibo.com> wrote: > On 14/07/2015 11:12, Igor Stassiy wrote: > > Hello, > > > > I am benchmarking different ways of putting data into table on table > > creation: > > > > 1. INSERT INTO c SELECT * FROM a JOIN b on a.id <http://a.id> = b.id > > <http://b.id>; > > 2. CREATE TABLE c AS SELECT * FROM a JOIN b on a.id <http://a.id> = b.id > > <http://b.id>; > > 3. psql -c "COPY (SELECT * FROM a JOIN b on a.id <http://a.id> = b.id > > <http://b.id>) TO STDOUT" | > > parallel --block 128M --jobs 4 --pipe psql -c "COPY c FROM STDIN"; > > > > (the parallel command is available as part of parallel deb package in > > Ubuntu for example, it splits the stdin by newline character and feeds > > it to the corresponding command) > > > > Both tables a and b have ~16M records and one of the columns in a is > > geometry (ranging from several KB in size to several MB). Columns in b > > are mostly integers. > > > > The machine that I am running these commands on has the following > > parameters: > > > > default_statistics_target = 50 # pgtune wizard 2012-06-06 > > maintenance_work_mem = 1GB # pgtune wizard 2012-06-06 > > constraint_exclusion = on # pgtune wizard 2012-06-06 > > checkpoint_completion_target = 0.9 # pgtune wizard 2012-06-06 > > effective_cache_size = 48GB # pgtune wizard 2012-06-06 > > work_mem = 80MB # pgtune wizard 2012-06-06 > > wal_buffers = 8MB # pgtune wizard 2012-06-06 > > checkpoint_segments = 16 # pgtune wizard 2012-06-06 > > shared_buffers = 16GB # pgtune wizard 2012-06-06 > > max_connections = 400 # pgtune wizard 2012-06-06 > > > > One would expect the 3rd option to be faster than 1 and 2, however 2 > > outperforms both by a large margin (sometimes x2). This is especially > > surprising taking into account that COPY doesn't acquire a global lock > > on the table, only a RowExclusiveLock > > (according > > to http://www.postgresql.org/message-id/10611.1014867...@sss.pgh.pa.us) > > > > What is wal_level value? I think this is because of an optimisation > happening with wal_level = minimal: > > "In minimal level, WAL-logging of some bulk operations can be safely > skipped, which can make those operations much faster" > > see http://www.postgresql.org/docs/current/static/runtime-config-wal.html > > > So is option 2 a winner by design? Could you please suggest other > > alternatives to try (if there are any)? And what might be the reason > > that 3 is not outperforming the other 2? > > > > Thank you, > > Igor > > > > > > > -- > Julien Rouhaud > http://dalibo.com - http://dalibo.org >