On Sun, Jul 26, 2020 at 6:43 PM Dilip Kumar <[email protected]> wrote:
>
> I would like to propose a patch for enabling the parallelism for the
> bitmap index scan path.
>
> Background:
> Currently, we support only a parallel bitmap heap scan path. Therein,
> the underlying bitmap index scan is done by a single worker called the
> leader. The leader creates a bitmap in shared memory and once the
> bitmap is ready it creates a shared iterator and after that, all the
> workers process the shared iterator and scan the heap in parallel.
> While analyzing the TPCH plan we have observed that some of the
> queries are spending significant time in preparing the bitmap. So the
> idea of this patch is to use the parallel index scan for preparing the
> underlying bitmap in parallel.
>
> Design:
> If underlying index AM supports the parallel path (currently only
> BTREE support it), then we will create a parallel bitmap heap scan
> path on top of the parallel bitmap index scan path. So the idea of
> this patch is that each worker will do the parallel index scan and
> generate their part of the bitmap. And, we will create a barrier so
> that we can not start preparing the shared iterator until all the
> worker is ready with their bitmap. The first worker, which is ready
> with the bitmap will keep a copy of its TBM and the page table in the
> shared memory. And, all the subsequent workers will merge their TBM
> with the shared TBM. Once all the TBM are merged we will get one
> common shared TBM and after that stage, the worker can continue. The
> remaining part is the same, basically, again one worker will scan the
> shared TBM and prepare the shared iterator and once it is ready all
> the workers will jointly scan the heap in parallel using shared
> iterator.
>
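
To make the design quoted above concrete, here is a minimal Python sketch of how I read it. It is only a toy model and not the patch's code: the TID list, the strided split of work between workers, the dict-of-sets bitmap, and the names build_shared_bitmap and worker are all assumptions made for illustration. A real parallel index scan would hand out index pages dynamically, and the real TBM lives in shared memory.

```python
import threading
from collections import defaultdict

# Hypothetical index matches as (heap_page, tuple_offset) pairs.
MATCHING_TIDS = [(1, 3), (4, 1), (1, 7), (2, 2), (4, 5), (2, 9), (7, 1), (1, 1)]


def build_shared_bitmap(tids, n_workers):
    """Toy model: each worker builds a partial bitmap, then all are merged."""
    shared_tbm = {}                       # stand-in for the shared TBM
    state = {"published": False}          # has the first worker published yet?
    lock = threading.Lock()
    barrier = threading.Barrier(n_workers)

    def worker(worker_id):
        # Assumption: a strided slice stands in for this worker's share
        # of the parallel index scan.
        local_tbm = defaultdict(set)
        for page, offset in tids[worker_id::n_workers]:
            local_tbm[page].add(offset)

        with lock:
            if not state["published"]:
                # The first worker to finish copies its bitmap into shared state.
                shared_tbm.update({p: set(o) for p, o in local_tbm.items()})
                state["published"] = True
            else:
                # Every later worker unions its partial bitmap into the shared one.
                for page, offsets in local_tbm.items():
                    shared_tbm.setdefault(page, set()).update(offsets)

        # No worker moves on to the heap scan until all bitmaps are merged.
        barrier.wait()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_tbm


if __name__ == "__main__":
    bitmap = build_shared_bitmap(MATCHING_TIDS, 3)
    # The shared iterator would walk heap pages in order, each visited once.
    for page in sorted(bitmap):
        print(page, sorted(bitmap[page]))
```

In this toy model, heap page 1 shows up in the partial bitmaps of all three workers, while the merged bitmap lists it once with all of its offsets.
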
Though I have not looked at the patch or the code for the existing
parallel bitmap heap scan, one point keeps bugging me. I may be utterly
wrong, or my question may be silly, but I would like to ask it here anyway.

From the above design: each parallel worker creates a partial bitmap for
the index data that it looked at. Why should the workers merge these
bitmaps into a single bitmap in shared memory? Why can't each parallel
worker do a bitmap heap scan using the partial bitmap it built during its
bitmap index scan and emit the qualifying tuples/rows so that the gather
node can collect them? There may not even be lock contention, as a bitmap
heap scan takes read locks on the heap pages/tuples.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
