Hi,

Zhang Mingli
www.hashdata.xyz

Hi, Tomas
On Dec 31, 2023 at 07:10 +0800, Tomas Vondra <tomas.von...@enterprisedb.com>, 
wrote:
> Sadly, there's no explanation why parallel scans do not allow disabling
> sync scans just like serial scans - and it's not quite obvious to me.

Feel confused too.

```
        Assert(!IsBootstrapProcessingMode());
        Assert(allow_sync);
        snapshot = scan->rs_snapshot;
```

I dig this for a while and found that there a close relationship between field 
phs_syncscan(For parallel scan) and the allow_sync flag.

In table_block_parallelscan_initialize() ParallelTableScanDescData.phs_syncscan 
is set.

        /* compare phs_syncscan initialization to similar logic in initscan */
        bpscan->base.phs_syncscan = synchronize_seqscans &&
                !RelationUsesLocalBuffers(rel) &&
                bpscan->phs_nblocks > NBuffers / 4;

And the allow_sync is always set to true in initscan(), when phs_syncscan is 
not NULL.

        if (scan->rs_base.rs_parallel != NULL)
        {
                /* For parallel scan, believe whatever ParallelTableScanDesc 
says. */
                if (scan->rs_base.rs_parallel->phs_syncscan)
                        scan->rs_base.rs_flags |= SO_ALLOW_SYNC;
                else
                        scan->rs_base.rs_flags &= ~SO_ALLOW_SYNC;
        }

And phs_syncscan is used in 
table_block_parallelscan_startblock_init(),table_block_parallelscan_nextpage() 
to do sth of syncscan.

Back to the Assertion, else branch means param scan(parallel scan desc) is not 
null and we are in parallel scan mode.
        else
        {
                /*
                 * Parallel index build.
                 *
                 * Parallel case never registers/unregisters own snapshot.  
Snapshot
                 * is taken from parallel heap scan, and is SnapshotAny or an 
MVCC
                 * snapshot, based on same criteria as serial case.
                 */
                Assert(!IsBootstrapProcessingMode());
                Assert(allow_sync);
                snapshot = scan->rs_snapshot;
        }

Agree with you that: why all parallel plans should / must be synchronized?
Parallel scan should have a choice about syncscan.
Besides that I think there is a risk Assert(allow_sync), at least should use 
phs_syncscan field  here to judge if allow_sync is true according to above.
So I guess, there should be an Assertion failure  of Assert(allow_sync) in 
theory when we use a parallel scan but phs_syncscan is false.

        /* compare phs_syncscan initialization to similar logic in initscan */
        bpscan->base.phs_syncscan = synchronize_seqscans &&
                !RelationUsesLocalBuffers(rel) &&
                bpscan->phs_nblocks > NBuffers / 4;

However, I didn’t produce it because phs_syncscan is set according to data 
size, even with some parallel cost GUCs set to 0.
And if there is not enough data, we usually do not choose a parallel plan,
like this case: build a index with parallel scan on underlying tables.



Reply via email to