Re: WIP: Avoid creation of the free space map for small tables

John Naylor Mon, 19 Nov 2018 03:11:10 -0800

On 11/19/18, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Mon, Nov 19, 2018 at 7:30 AM John Naylor <jcnay...@gmail.com> wrote:
>> Let's say we have to wait to acquire a relation extension lock,
>> because another backend had already started extending the heap by 1
>> block. We call GetPageWithFreeSpace() and now the local map looks like
>>
>> 0123
>> TTA0
>>
>> By using bitwise OR to set availability, the already-tried blocks
>> remain as they are. With only 2 states, the map would look like this
>> instead:
>>
>> 0123
>> AAAN
>>


> In my mind for such a case it should look like below:
> 0123
> NNAN

Okay, to retain that behavior with only 2 status codes, I have
implemented the map as a struct with 2 members: the cached number of
blocks, plus the same array I had before. This also allows a more
efficient implementation at the micro level. I just need to do some
more testing on it.

[ abortive states ]
> I think it might come from any other place between when you set it and
> before it got cleared (like any intermediate buffer and pin related
> API's).

Okay, I will look into that.

> One other thing that slightly bothers me is the call to
> RelationGetNumberOfBlocks via fsm_allow_writes.  It seems that call
> will happen quite frequently in this code-path and can have some
> performance impact.  As of now, I don't have any idea to avoid it or
> reduce it more than what you already have in the patch, but I think we
> should try some more to avoid it.  Let me know if you have any ideas
> around that?

FWIW, I believe that the callers of RecordPageWithFreeSpace() will
almost always avoid that call. Otherwise, there is at least one detail
that could use attention: If rel->rd_rel->relpages shows fewer pages
than the threshold, than the code doesn't trust it to be true. Might
be worth revisiting.
Aside from that, I will have to think about it.

More generally, I have a couple ideas about performance:

1. Only mark available every other block such that visible blocks are
interleaved as the relation extends. To explain, this diagram shows a
relation extending, with 1 meaning marked available and 0 meaning
marked not-available.

A
NA
ANA
NANA

So for a 3-block table, we never check block 1. Any free space it has
acquired will become visible when it extends to 4 blocks. For a
4-block threshold, we only check 2 blocks or less. This reduces the
number of lock/pin events but still controls bloat. We could also
check both blocks of a 2-block table.

2. During manual testing I seem to remember times that the FSM code
was invoked even though I expected the smgr entry to have a cached
target block. Perhaps VACUUM or something is clearing that away
unnecessarily. It seems worthwhile to verify and investigate, but that
seems like a separate project.

-John Naylor

Re: WIP: Avoid creation of the free space map for small tables

Reply via email to