Hi,

On 2017-05-05 14:20:48 +1200, David Rowley wrote:
> Yeah, I did get some time to look over the contention in Parallel Seq
> Scan a while back and I discovered that on the machine that I was
> testing on, the lock obtained in heap_parallelscan_nextpage() was
> causing workers to have to wait for other workers to fetch their next
> task to work on.

Oh, if it's "just" that, it should be easy enough to address. Two
approaches:
1) use atomic ops for the increment, with a modulo afterwards to deal
   with wraparound in the synchronized scan
2) batching

> I ended up writing the attached (which I'd not intended to post until
> some time closer to when the doors open for PG11). At the moment it's
> basically just a test patch to see how it affects things when we give
> workers a bit more to do before they come back to look for more work.
> In this case, I've just given them 10 pages to work on, instead of the
> 1 that's allocated in 9.6 and v10.

Right.

> A quick test on a pretty large table on a large machine shows:
>
> Unpatched:
>
> postgres=# select count(*) from a;
>    count
> ------------
>  1874000000
> (1 row)
>
> Time: 5211.485 ms (00:05.211)
>
> Patched:
>
> postgres=# select count(*) from a;
>    count
> ------------
>  1874000000
> (1 row)
>
> Time: 2523.983 ms (00:02.524)

Neat!

> I'd had thoughts that the 10 pages wouldn't be constant, but the
> batching size would depend on the size of the relation to be scanned.
> I'd rough ideas to just try to make about 1 million batches. Something
> like batch_pages = Max(parallel_scan->phs_nblocks / 1000000, 1); so
> that we only take more than 1 page if there's some decent amount to
> process. We don't want to make the batches too big as we might end up
> having to wait on slow workers at the end of a scan.

I wonder how much doing the atomic ops approach alone can help. That
doesn't have the issue that the work might be unevenly distributed
between pages.

> Anyway. I don't want to hi-jack this thread with discussions on this.
> I just wanted to mark that I plan to work on this in order to avoid
> any repeat developments or analysis. I'll probably start a new thread
> for this sometime nearer PG11's dev cycle.

Cool. I think it might make sense to post about this soon, just to give
it some more visibility and reduce the potential for duplication.
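
To make 1) and 2) a bit more concrete, here is a rough sketch of how the
two could be combined. It is only an illustration, not a patch: it uses
plain C11 atomics instead of the pg_atomic_* wrappers, and ScanState,
scan_init() and scan_next_batch() are made-up names rather than anything
in the tree. The batch size follows the Max(nblocks / 1000000, 1) idea
quoted above. The counter is 64 bits wide on the assumption that workers
may keep bumping it after the scan is exhausted, so it must not wrap.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared parallel scan state. */
typedef struct ScanState
{
    uint32_t    nblocks;        /* number of blocks in the relation */
    uint32_t    startblock;     /* block where the scan starts */
    uint32_t    batch_pages;    /* blocks handed out per claim, >= 1 */
    _Atomic uint64_t nallocated;    /* blocks handed out so far */
} ScanState;

static void
scan_init(ScanState *s, uint32_t nblocks, uint32_t startblock)
{
    uint32_t    per_batch = nblocks / 1000000;

    s->nblocks = nblocks;
    s->startblock = nblocks > 0 ? startblock % nblocks : 0;
    /* Aim for roughly 1 million batches, but never fewer than 1 page. */
    s->batch_pages = per_batch > 1 ? per_batch : 1;
    atomic_init(&s->nallocated, 0);
}

/*
 * Claim the next batch without taking a lock.
 *
 * Returns false once every block has been handed out.  Otherwise *first
 * is the first block of the batch and *count the number of blocks in it;
 * block i of the batch is (*first + i) % nblocks, because a scan that
 * does not start at block 0 wraps around the end of the relation.
 */
static bool
scan_next_batch(ScanState *s, uint32_t *first, uint32_t *count)
{
    /* The only shared write: a single atomic fetch-and-add. */
    uint64_t    claimed = atomic_fetch_add(&s->nallocated, s->batch_pages);
    uint64_t    remaining;

    if (claimed >= s->nblocks)
        return false;

    /* The last batch may be shorter than batch_pages. */
    remaining = s->nblocks - claimed;
    *count = remaining < s->batch_pages ? (uint32_t) remaining
                                        : s->batch_pages;

    /* Modulo applied after the increment handles the wraparound. */
    *first = (uint32_t) ((s->startblock + claimed) % s->nblocks);
    return true;
}
```

With batch_pages forced to 1 this is approach 1) on its own, which would
be the case to benchmark against the batched variant.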

- andres