On Tue, Jan 24, 2017 at 4:51 PM, Robert Haas <robertmh...@gmail.com> wrote: > On Mon, Jan 23, 2017 at 1:03 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: >> In spite of being careful, I missed reorganizing the functions in >> genam.h which I have done in attached patch. > > Cool. Committed parallel-generic-index-scan.2.patch. Thanks.
Reviewing parallel_index_scan_v6.patch: I think it might be better to swap the return value and the out parameter for _bt_parallel_seize; that is, return a bool, and have callers ignore the value of the out parameter (e.g. *pageno). I think _bt_parallel_advance_scan should be renamed something that includes the word "keys", like _bt_parallel_advance_array_keys. The hunk in indexam.c looks like a leftover that can be omitted. +/* + * Below flags are used to indicate the state of parallel scan. They aren't really flags any more; they're members of an enum. I think you could just leave this sentence out entirely and start right in with descriptions of the individual values. But maybe all of those descriptions should end in a period (instead of one ending in a period but not the other three) since they read like complete sentences. + * btinitparallelscan - Initializing BTParallelScanDesc for parallel btree scan Initializing -> initialize + * btparallelrescan() -- reset parallel scan Previous two prototypes have one dash, this one has two. Make it consistent, please. + * Ideally, we don't need to acquire spinlock here, but being + * consistent with heap_rescan seems to be a good idea. How about: In theory, we don't need to acquire the spinlock here, because there shouldn't be any other workers running at this point, but we do so for consistency. + * _bt_parallel_seize() -- returns the next block to be scanned for forward + * scans and latest block scanned for backward scans. I think the description should be more like "Begin the process of advancing the scan to a new page. Other scans must wait until we call bt_parallel_release() or bt_parallel_done()." And likewise _bt_parallel_release() should say something like "Complete the process of advancing the scan to a new page. We now have the new value for btps_scanPage; some other backend can now begin advancing the scan." And _bt_parallel_done should say something like "Mark the parallel scan as complete." I am a bit mystified about how this manages to work with array keys. _bt_parallel_done() won't set btps_pageStatus to BTPARALLEL_DONE unless so->arrayKeyCount >= btscan->btps_arrayKeyCount, but _bt_parallel_advance_scan() won't do anything unless btps_pageStatus is already BTPARALLEL_DONE. It seems like one of those two things has to be wrong. _bt_readpage's header comment should be updated to note that, in the case of a parallel scan, _bt_parallel_seize should have been called prior to entering this function, and _bt_parallel_release will be called prior to return (or this could alternatively be phrased in terms of btps_pageStatus on entry/exit). _bt_readnextpage isn't very clear about the meaning of its blkno argument. It looks like it always has to be valid when scanning forward, but only in the parallel case when scanning backward? That is a little odd. Another, possibly related thing that is odd is that when _bt_steppage() finds no tuples and decides to advance to a new page again, there's a very short comment in the forward case and a very long comment in the backward case: /* nope, keep going */ vs. /* * For parallel scans, get the last page scanned as it is quite * possible that by the time we try to fetch previous page, other * worker has also decided to scan that previous page. We could * avoid that by doing _bt_parallel_release once we have read the * current page, but it is bad to make other workers wait till we * read the page. */ Now it seems to me that these cases are symmetric and the issues are identical. It's basically that, while we were judging whether the current page has useful contents, some other process could have advanced the scan (since _bt_readpage un-seized it). - /* check for deleted page */ page = BufferGetPage(so->currPos.buf); TestForOldSnapshot(scan->xs_snapshot, rel, page); opaque = (BTPageOpaque) PageGetSpecialPointer(page); + /* check for deleted page */ This is an independent change; committed. What kind of testing has been done to ensure that this doesn't have concurrency bugs? What's the highest degree of parallelism that's been tested? Did that test include array keys? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers