Hi Thomas
I found some other problems which I want to share my change with you to make
you confirm.
<1> I changed the code in smgr_alloc_sr to avoid dead lock.
LWLockAcquire(mapping_lock, LW_EXCLUSIVE);
flags = smgr_lock_sr(sr);
Assert(flags & SR_SYNCING(forknum));
+ flags &= ~SR_SYNCING(forknum);
if (flags & SR_JUST_DIRTIED(forknum))
{
/*
* Someone else dirtied it while we were syncing, so we can't mark
* it clean. Let's give up on this SR and go around again.
*/
smgr_unlock_sr(sr, flags);
LWLockRelease(mapping_lock);
goto retry;
}
/* This fork is clean! */
- flags &= ~SR_SYNCING(forknum);
flags &= ~SR_DIRTY(forknum);
}
In smgr_drop_sr, if the sr is SR_SYNCING, it will retry until the sr is not
SR_SYNCING. But in smgr_alloc_sr, if the sr is SR_JUST_DIRTIED, it will retry
to get another sr, although it has been synced by smgrimmedsync, the flag
SR_SYNCING doesn't changed. This might cause dead lock. So I changed your
patch as above.
<2> I changed the code in smgr_drop_sr to avoid some corner cases
/* Mark it invalid and drop the mapping. */
smgr_unlock_sr(sr, ~SR_VALID);
+ for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
+ sr->nblocks[forknum] = InvalidBlockNumber;
hash_search_with_hash_value(sr_mapping_table,
&reln->smgr_rnode,
hash,
HASH_REMOVE,
NULL);
smgr_drop_sr just removes the hash entry from sr_mapping_table, but doesn't
remove the sr_pool. But in smgrnblocks_fast, it just get sr from sr_pool, so I
add some codes as above to avoid some corner cases get an unexpected result
from smgrnblocks_fast. Is it necessary, I also want some advice from you.
Thanks a lot.
Buzhen
------------------原始邮件 ------------------
发件人:Thomas Munro <[email protected]>
发送时间:Tue Dec 29 22:52:59 2020
收件人:陈佳昕(步真) <[email protected]>
抄送:Amit Kapila <[email protected]>, Konstantin Knizhnik
<[email protected]>, PostgreSQL Hackers
<[email protected]>
主题:Re: Re: Cache relation sizes?
On Wed, Dec 23, 2020 at 1:31 AM 陈佳昕(步真) <[email protected]> wrote:
> I studied your patch these days and found there might be a problem.
> When execute 'drop database', the smgr shared pool will not be removed
> because of no call 'smgr_drop_sr'. Function 'dropdb' in dbcommands.c remove
> the buffer from bufferpool and unlink the real files by 'rmtree', It doesn't
> call smgrdounlinkall, so the smgr shared cache will not be dropped although
> the table has been removed. This will cause some errors when smgr_alloc_str
> -> smgropen、smgrimmedsync. Table file has been removed, so smgropen and
> smgrimmedsync will get a unexpected result.
Hi Buzhen,
Thanks, you're right -- it needs to scan the pool of SRs and forget
everything from the database you're dropping.