David, I did something like this:

    psql -f /dev/fd/3 3<<IN1 & psql -f /dev/fd/4 4<<IN2 & ...
    INSERT INTO c SELECT * FROM a JOIN b ON a.id = b.id
        WHERE a.id < <0.25 quantile>;
    IN1
    INSERT INTO c SELECT * FROM a JOIN b ON a.id = b.id
        WHERE a.id >= <0.25 quantile> AND a.id < <0.5 quantile>;
    IN2
    ...
    IN3
    ...
    IN4
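For reference, here is a minimal, runnable sketch of that pattern. `cat` stands in for `psql -f` so the file-descriptor/heredoc mechanics can be checked without a database, and the WHERE boundaries are illustrative placeholders rather than the quantiles actually computed:

```shell
# Each background job reads its own SQL from a dedicated file descriptor:
# the heredoc is attached to fd 3 (resp. 4), and the command reopens it
# via /dev/fd/N. `cat` is a stand-in here for `psql -f`.
cat /dev/fd/3 3<<'SQL1' > part1.out &
INSERT INTO c SELECT * FROM a JOIN b ON a.id = b.id WHERE a.id < 4000000;
SQL1

cat /dev/fd/4 4<<'SQL2' > part2.out &
INSERT INTO c SELECT * FROM a JOIN b ON a.id = b.id WHERE a.id >= 4000000;
SQL2

wait                     # block until both background jobs finish
cat part1.out part2.out  # each job saw only its own statement
```

With psql substituted back in, each heredoc carries one non-overlapping INSERT ... SELECT slice, and the `&`/`wait` pair is what makes them run concurrently.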
And the quantiles were computed with: SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY id) FROM a, and then once more for the lower and upper halves. But unfortunately, I got only about a 16% improvement over the non-parallelized version of INSERT INTO .. SELECT ..

Marc, I am using Postgres 9.4. I didn't benchmark, but intuitively the modulo operator would force traversing every record in table "a" 4 times, as it can't use an index.

Julien, my concern was why option 3 (with parallel) is not the fastest. And now, even with a parallelized INSERT INTO .. SELECT, it's not the fastest. I can't really use an UNLOGGED table in this case. The following document (section 14.4.7) summarises why CREATE TABLE AS .. is the fastest:
http://www.postgresql.org/docs/current/static/populate.html#POPULATE-PITR
Basically, CREATE TABLE AS .. just doesn't write to the WAL if wal_level is minimal, and hence cuts I/O roughly in half.

On Tue, Jul 14, 2015 at 1:42 PM David Rowley <david.row...@2ndquadrant.com> wrote:

> On 14 July 2015 at 21:12, Igor Stassiy <istas...@gmail.com> wrote:
>
>> Hello,
>>
>> I am benchmarking different ways of putting data into a table on table
>> creation:
>>
>> 1. INSERT INTO c SELECT * FROM a JOIN b on a.id = b.id;
>> 2. CREATE TABLE c AS SELECT * FROM a JOIN b on a.id = b.id;
>> 3. psql -c "COPY (SELECT * FROM a JOIN b on a.id = b.id) TO STDOUT" |
>>    parallel --block 128M --jobs 4 --pipe psql -c "COPY c FROM STDIN";
>>
>> (the parallel command is available as part of the parallel deb package in
>> Ubuntu, for example; it splits stdin by the newline character and feeds it
>> to the corresponding command)
>>
>> Both tables a and b have ~16M records, and one of the columns in a is
>> geometry (ranging from several KB to several MB in size). Columns in b are
>> mostly integers.
>>
>> The machine that I am running these commands on has the following
>> parameters:
>>
>> default_statistics_target = 50 # pgtune wizard 2012-06-06
>> maintenance_work_mem = 1GB # pgtune wizard 2012-06-06
>> constraint_exclusion = on # pgtune wizard 2012-06-06
>> checkpoint_completion_target = 0.9 # pgtune wizard 2012-06-06
>> effective_cache_size = 48GB # pgtune wizard 2012-06-06
>> work_mem = 80MB # pgtune wizard 2012-06-06
>> wal_buffers = 8MB # pgtune wizard 2012-06-06
>> checkpoint_segments = 16 # pgtune wizard 2012-06-06
>> shared_buffers = 16GB # pgtune wizard 2012-06-06
>> max_connections = 400 # pgtune wizard 2012-06-06
>>
>> One would expect the 3rd option to be faster than 1 and 2; however, 2
>> outperforms both by a large margin (sometimes 2x). This is especially
>> surprising taking into account that COPY doesn't acquire a global lock on
>> the table, only a RowExclusiveLock
>> (according to
>> http://www.postgresql.org/message-id/10611.1014867...@sss.pgh.pa.us)
>>
>> So is option 2 a winner by design? Could you please suggest other
>> alternatives to try (if there are any)? And what might be the reason that
>> 3 is not outperforming the other 2?
>
> I would imagine that the calling of the output function to translate
> each value's internal representation to its user-visible/external
> representation, plus all the overhead of sending results to the client,
> would be a likely candidate for the slowdown. In any case, 3 would only
> be as fast as the query generating the output. With 1 and 2, the tuple
> representation of each record stays in the internal format.
>
> If you have some logical way to break the query down into parts, then
> maybe that would be a place to look.
> For example:
>
> INSERT INTO c SELECT * FROM a JOIN b on a.id = b.id WHERE a.id < 8000000;
> INSERT INTO c SELECT * FROM a JOIN b on a.id = b.id WHERE a.id >= 8000000;
>
> Of course, you'd need to be very careful to ensure that the results of
> each SELECT never overlap.
> It would be nice to invent some better way than this that divided the
> workload evenly even as the tables grow.
>
> Then you could run these concurrently.
>
> Regards
>
> David Rowley
>
> --
> David Rowley   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services