On Thu, Jan 28, 2016 at 10:50 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
>> If I would make a proof-of-concept patch with interface itself, it
>> seems to me file_fdw may be a good candidate for this enhancement.
>> It is not a field for postgres_fdw.
>>
> The attached patch is enhancement of FDW/CSP interface and PoC feature
> of file_fdw to scan source file partially. It was smaller enhancement
> than my expectations.
>
> It works as follows. This query tried to read 20M rows from a CSV file,
> using 3 background worker processes.
>
> postgres=# set max_parallel_degree = 3;
> SET
> postgres=# explain analyze select * from test_csv where id % 20 = 6;
>                                   QUERY PLAN
> --------------------------------------------------------------------------------
>  Gather  (cost=1000.00..194108.60 rows=94056 width=52)
>          (actual time=0.570..19268.010 rows=2000000 loops=1)
>    Number of Workers: 3
>    ->  Parallel Foreign Scan on test_csv  (cost=0.00..183703.00 rows=94056 width=52)
>                                   (actual time=0.180..12744.655 rows=500000 loops=4)
>          Filter: ((id % 20) = 6)
>          Rows Removed by Filter: 9500000
>          Foreign File: /tmp/testdata.csv
>          Foreign File Size: 1504892535
>  Planning time: 0.147 ms
>  Execution time: 19330.201 ms
> (9 rows)

Could you try it not in parallel and then with 1, 2, 3, and 4 workers
and post the times for all?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
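
For reference, the requested comparison could be run from psql roughly as
follows. This is only a sketch: it reuses the table and query from the quoted
plan, and it assumes that setting max_parallel_degree to 0 disables parallel
query in the build being tested.

```sql
-- Baseline: no parallel workers.
-- (Assumption: max_parallel_degree = 0 turns parallel query off.)
SET max_parallel_degree = 0;
EXPLAIN ANALYZE SELECT * FROM test_csv WHERE id % 20 = 6;

-- Repeat the same query with 1, 2, 3, and 4 workers.
SET max_parallel_degree = 1;
EXPLAIN ANALYZE SELECT * FROM test_csv WHERE id % 20 = 6;

SET max_parallel_degree = 2;
EXPLAIN ANALYZE SELECT * FROM test_csv WHERE id % 20 = 6;

SET max_parallel_degree = 3;
EXPLAIN ANALYZE SELECT * FROM test_csv WHERE id % 20 = 6;

SET max_parallel_degree = 4;
EXPLAIN ANALYZE SELECT * FROM test_csv WHERE id % 20 = 6;
```

The "Execution time" line of each plan is the number to compare. Running each
setting more than once would help separate the effect of the worker count from
file-system caching of /tmp/testdata.csv.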