On Wed, Mar 20, 2024 at 11:19 PM John Naylor <johncnaylo...@gmail.com> wrote:
>
> On Wed, Mar 20, 2024 at 8:30 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > I forgot to report the results. Yes, I did some tests where I inserted
> > many TIDs to make the tidstore use several GB memory. I did two cases:
> >
> > 1. insert 100M blocks of TIDs with an offset of 100.
> > 2. insert 10M blocks of TIDs with an offset of 2048.
> >
> > The tidstore used about 4.8GB and 5.2GB, respectively, and all lookup
> > and iteration results were expected.
>
> Thanks for confirming!
>
> > While reviewing the codes again, the following two things caught my eyes:
> >
> > in check_set_block_offset() function, we don't take a lock on the
> > tidstore while checking all possible TIDs. I'll add
> > TidStoreLockShare() and TidStoreUnlock() as follows:
> >
> > +           TidStoreLockShare(tidstore);
> >             if (TidStoreIsMember(tidstore, &tid))
> >                 ItemPointerSet(&items.lookup_tids[num_lookup_tids++],
> > blkno, offset);
> > +           TidStoreUnlock(tidstore);
>
> In one sense, all locking in the test module is useless since there is
> only a single process. On the other hand, it seems good to at least
> run what we have written to run it trivially, and serve as an example
> of usage. We should probably be consistent, and document at the top
> that the locks are pro-forma only.

Agreed.

>
> > Regarding TidStoreMemoryUsage(), IIUC the caller doesn't need to take
> > a lock on the shared tidstore since dsa_get_total_size() (called by
> > RT_MEMORY_USAGE()) does appropriate locking. I think we can mention it
> > in the comment as follows:
> >
> > -/* Return the memory usage of TidStore */
> > +/*
> > + * Return the memory usage of TidStore.
> > + *
> > + * In shared TidStore cases, since shared_ts_memory_usage() does 
> > appropriate
> > + * locking, the caller doesn't need to take a lock.
> > + */
> >
> > What do you think?
>
> That duplicates the underlying comment on the radix tree function that
> this calls, so I'm inclined to leave it out. At this level it's
> probably best to document when a caller _does_ need to take an action.

Okay, I didn't change it.

>
> One thing I forgot to ask about earlier:
>
> +-- Add tids in out of order.
>
> Are they (the blocks to be precise) really out of order? The VALUES
> statement is ordered, but after inserting it does not output that way.
> I wondered if this is platform independent, but CI and our dev
> machines haven't failed this test, and I haven't looked into what
> determines the order. It's easy enough to hide the blocks if we ever
> need to, as we do elsewhere...

It seems not necessary as such a test is already covered by
test_radixtree. I've changed the query to hide the output blocks.

I've pushed the tidstore patch after incorporating the above changes.
In addition to that, I've added the following changes before the push:

- Added src/test/modules/test_tidstore/.gitignore file.
- Removed unnecessary #include from tidstore.c.

The buildfarm has been all-green so far.

I've attached the latest vacuum improvement patch.

I just remembered that the tidstore cannot still be used for parallel
vacuum with minimum maintenance_work_mem. Even when the shared
tidstore is empty, its memory usage reports 1056768 bytes, a bit above
1MB (1048576 bytes). We need something discussed on another thread[1]
in order to make it work.

Regards,

[1] 
https://www.postgresql.org/message-id/CAD21AoCVMw6DSmgZY9h%2BxfzKtzJeqWiwxaUD2T-FztVcV-XibQ%40mail.gmail.com

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment: v76-0001-Use-TidStore-for-dead-tuple-TIDs-storage-during-.patch
Description: Binary data

Reply via email to