Hello guys, 

 

we are running PG 8.3.13 in our system. During performance testing we found that
startup recovery took about 3 minutes, which is too long for our system.

We diagnosed the problem by adding timestamps, and found that almost all of the
3 minutes were spent in the relation-dropping and buffer-invalidation loop in
xact_redo_commit.

Before the problem occurred, we had dropped 40000 tables and then rebooted
Linux, so the outer loop runs 40000 times, and we have 13000 shared buffer
pages in PG. But DropRelFileNodeBuffers, which drops the shared buffers
associated with a specified relation, has to scan all of the shared buffers
for each relation to check whether each buffer can be dropped, no matter how
many pages that relation actually has in shared buffers.

In total, that is 40000 * 13000 = 520,000,000 LWLock acquire/release cycles. Is
this necessary? How about building a hash of all the relfilenodes to be
dropped, and then scanning the shared buffers once, checking for each buffer
whether its relfilenode is in the hash? If we can do this, the LWLock traffic
drops to 13000 acquire/release cycles, and we get much better performance.

Would this work? And is there any risk in doing so?

Thanks!

Best regards,

甘嘉栋(Gan Jiadong)

E-MAIL: ga...@huawei.com

Tel:+86-755-289720578

