On Mon, Nov 19, 2018 at 4:40 PM John Naylor <jcnay...@gmail.com> wrote:
>
> On 11/19/18, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > On Mon, Nov 19, 2018 at 7:30 AM John Naylor <jcnay...@gmail.com> wrote:
> >> Let's say we have to wait to acquire a relation extension lock,
> >> because another backend had already started extending the heap by 1
> >> block. We call GetPageWithFreeSpace() and now the local map looks like
> >>
> >> 0123
> >> TTA0
> >>
> >> By using bitwise OR to set availability, the already-tried blocks
> >> remain as they are. With only 2 states, the map would look like this
> >> instead:
> >>
> >> 0123
> >> AAAN
> >>
> >
> > In my mind for such a case it should look like below:
> > 0123
> > NNAN
>
> Okay, to retain that behavior with only 2 status codes, I have
> implemented the map as a struct with 2 members: the cached number of
> blocks, plus the same array I had before. This also allows a more
> efficient implementation at the micro level. I just need to do some
> more testing on it.
>
Okay.

> [ abortive states ]
> > I think it might come from any other place between when you set it and
> > before it got cleared (like any intermediate buffer and pin related
> > API's).
>
> Okay, I will look into that.
>
> > One other thing that slightly bothers me is the call to
> > RelationGetNumberOfBlocks via fsm_allow_writes. It seems that call
> > will happen quite frequently in this code-path and can have some
> > performance impact. As of now, I don't have any idea to avoid it or
> > reduce it more than what you already have in the patch, but I think we
> > should try some more to avoid it. Let me know if you have any ideas
> > around that?
>
> FWIW, I believe that the callers of RecordPageWithFreeSpace() will
> almost always avoid that call. Otherwise, there is at least one detail
> that could use attention: If rel->rd_rel->relpages shows fewer pages
> than the threshold, then the code doesn't trust it to be true. Might
> be worth revisiting.
>

I think it is less of a concern when called from the vacuum code path.

> Aside from that, I will have to think about it.
>
> More generally, I have a couple of ideas about performance:
>
> 1. Only mark available every other block such that visible blocks are
> interleaved as the relation extends. To explain, this diagram shows a
> relation extending, with A meaning marked available and N meaning
> marked not-available.
>
> A
> NA
> ANA
> NANA
>
> So for a 3-block table, we never check block 1. Any free space it has
> acquired will become visible when it extends to 4 blocks. For a
> 4-block threshold, we only check 2 blocks or less. This reduces the
> number of lock/pin events but still controls bloat. We could also
> check both blocks of a 2-block table.

We can try something like this if we see there is any visible
performance hit in some scenario.

> 2. During manual testing I seem to remember times that the FSM code
> was invoked even though I expected the smgr entry to have a cached
> target block.
> Perhaps VACUUM or something is clearing that away
> unnecessarily. It seems worthwhile to verify and investigate, but that
> seems like a separate project.
>

Makes sense; let's not get distracted by stuff that is not related to
this patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com