On Wed, Sep 23, 2020 at 12:00 PM tsunakawa.ta...@fujitsu.com
<tsunakawa.ta...@fujitsu.com> wrote:
>
> From: Amit Kapila <amit.kapil...@gmail.com>
> > The idea is that we can't use this optimization if the value is not
> > cached because we can't rely on lseek behavior. See all the discussion
> > between Horiguchi-San and me in the thread above. So, how would you
> > ensure that if we don't use Kirk-San's proposal?
>
> Hmm, buggy Linux kernel... (Until when should we be worried about the bug?)
>
> According to the following Horiguchi-san's suggestion, it's during normal
> operation, not during recovery, when we should be careful, right?
>

No, we need to be careful during recovery as well. We need to ensure that
we use the cached value during recovery and that the cached value is always
up-to-date. We can't rely on lseek. I described a scenario upthread [1]
where such behavior can cause a problem; see also Tom Lane's response there
on why the same can be true for recovery.

The basic approach we are pursuing here is to rely on the cached value of
the 'number of blocks'. That value is always correct, and if it ever turns
out not to be, that is our bug rather than something we depend on the OS to
get right. It is also better for performance. Using the cached value is
currently only possible during recovery, so we use it in the recovery path.
Once Thomas's patch to cache the value for non-recovery cases is done, we
can use it for non-recovery cases as well.

[1] - https://www.postgresql.org/message-id/CAA4eK1LqaJvT%3DbFOpc4i5Haq4oaVQ6wPbAcg64-Kt1qzp_MZYA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
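
To make the rule above concrete, here is a minimal sketch of "use the optimization only when the number of blocks is cached, and never ask the OS". It is an illustration under stated assumptions, not code from any patch in this thread: BlockNumber and InvalidBlockNumber mirror PostgreSQL's definitions, but CachedRelSize, cached_size_init, cached_size_set, cached_size_get and can_use_optimization are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same shape as PostgreSQL's definitions; repeated here to stay self-contained. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical cache entry for one relation fork (not PostgreSQL's struct). */
typedef struct CachedRelSize
{
    BlockNumber cached_nblocks; /* InvalidBlockNumber means "not cached" */
} CachedRelSize;

/* Start with no cached size. */
void
cached_size_init(CachedRelSize *rel)
{
    rel->cached_nblocks = InvalidBlockNumber;
}

/*
 * Record the size whenever we extend or truncate the relation ourselves,
 * so that the cached value stays up-to-date.
 */
void
cached_size_set(CachedRelSize *rel, BlockNumber nblocks)
{
    assert(nblocks != InvalidBlockNumber);
    rel->cached_nblocks = nblocks;
}

/*
 * Return true and fill *nblocks only if the size is cached.  There is
 * deliberately no lseek() fallback here: an uncached size is reported as
 * "unknown" instead of being read from the OS.
 */
bool
cached_size_get(const CachedRelSize *rel, BlockNumber *nblocks)
{
    if (rel->cached_nblocks == InvalidBlockNumber)
        return false;
    *nblocks = rel->cached_nblocks;
    return true;
}

/*
 * Assumption for this sketch: the optimized path is allowed only during
 * recovery and only when the size is cached.
 */
bool
can_use_optimization(const CachedRelSize *rel, bool in_recovery)
{
    BlockNumber nblocks;

    return in_recovery && cached_size_get(rel, &nblocks);
}
```

A caller would check can_use_optimization() first and take the existing, unoptimized path whenever it returns false, so a missing cache entry costs performance but never correctness.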