On Thu, Jan 11, 2024 at 9:28 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Mon, Jan 8, 2024 at 8:35 PM John Naylor <johncnaylo...@gmail.com> wrote:
> >
> > On Wed, Jan 3, 2024 at 9:10 PM John Naylor <johncnaylo...@gmail.com> wrote:
> > >
> > > On Tue, Jan 2, 2024 at 8:01 PM Masahiko Sawada <sawada.m...@gmail.com>
> > > wrote:
> > >
> > > > I agree that we expose RT_LOCK_* functions and have tidstore use them,
> > > > but am not sure the if (TidStoreIsShared(ts) LWLockAcquire(..., ...)"
> > > > calls part. I think that even if we expose them, we will still need to
> > > > do something like "if (TidStoreIsShared(ts))
> > > > shared_rt_lock_share(ts->tree.shared)", no?
> > >
> > > I'll come back to this topic separately.
> >
> > To answer your question, sure, but that "if (TidStoreIsShared(ts))"
> > part would be pushed down into a function so that only one place has
> > to care about it.
> >
> > However, I'm starting to question whether we even need that. Meaning,
> > lock the tidstore separately. To "lock the tidstore" means to take a
> > lock, _separate_ from the radix tree's internal lock, to control
> > access to two fields in a separate "control object":
> >
> > +typedef struct TidStoreControl
> > +{
> > +    /* the number of tids in the store */
> > +    int64    num_tids;
> > +
> > +    /* the maximum bytes a TidStore can use */
> > +    size_t    max_bytes;
> >
> > I'm pretty sure max_bytes does not need to be in shared memory, and
> > certainly not under a lock: Thinking of a hypothetical
> > parallel-prune-phase scenario, one way would be for a leader process
> > to pass out ranges of blocks to workers, and when the limit is
> > exceeded, stop passing out blocks and wait for all the workers to
> > finish.
>
> True. I agreed that it doesn't need to be under a lock anyway, as it's
> read-only.
> >
> > As for num_tids, vacuum previously put the similar count in
> >
> > @@ -176,7 +179,8 @@ struct ParallelVacuumState
> >     PVIndStats *indstats;
> >
> >     /* Shared dead items space among parallel vacuum workers */
> > -    VacDeadItems *dead_items;
> > +    TidStore    *dead_items;
> >
> > VacDeadItems contained "num_items". What was the reason to have new
> > infrastructure for that count? And it doesn't seem like access to it
> > was controlled by a lock -- can you confirm? If we did get parallel
> > pruning, maybe the count would belong inside PVShared?
>
> I thought that since the tidstore is a general-purpose data structure
> the shared counter should be protected by a lock. One thing I'm
> concerned about is that we might need to update both the radix tree
> and the counter atomically in some cases. But that's true we don't
> need it for lazy vacuum at least for now. Even given the parallel scan
> phase, probably we won't need to have workers check the total number
> of stored tuples during a parallel scan.
> >
> > The number of tids is not that tightly bound to the tidstore's job. I
> > believe tidbitmap.c (a possible future client) doesn't care about the
> > global number of tids -- not only that, but AND/OR operations can
> > change the number in a non-obvious way, so it would not be convenient
> > to keep an accurate number anyway. But the lock would still be
> > mandatory with this patch.
>
> Very good point.
> >
> > If we can make vacuum work a bit closer to how it does now, it'd be a
> > big step up in readability, I think. Namely, getting rid of all the
> > locking logic inside tidstore.c and let the radix tree's locking do
> > the right thing. We'd need to make that work correctly when receiving
> > pointers to values upon lookup, and I already shared ideas for that.
> > But I want to see if there is any obstacle in the way of removing the
> > tidstore control object and its separate lock.
>
> So I agree to remove both max_bytes and num_items from the control
> object. Also, as you mentioned, we can remove the tidstore control
> object itself. TidStoreGetHandle() returns a radix tree handle, and we
> can pass it to TidStoreAttach(). I'll try it.
>
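To make that concrete, here is a rough, untested sketch of what attaching
via the radix tree handle alone could look like. The shared_rt_handle type
and the shared_rt_get_handle()/shared_rt_attach() names are just placeholders
for whatever the radix tree template generates, not what is in the patch:

/* The creating backend publishes only the radix tree handle. */
shared_rt_handle
TidStoreGetHandle(TidStore *ts)
{
    Assert(TidStoreIsShared(ts));

    return shared_rt_get_handle(ts->tree.shared);
}

/* Other backends attach using the handle alone -- no control object. */
TidStore *
TidStoreAttach(dsa_area *area, shared_rt_handle handle)
{
    TidStore   *ts = palloc0(sizeof(TidStore));

    ts->tree.shared = shared_rt_attach(area, handle);

    /* note that max_bytes is not available to attaching backends */
    return ts;
}
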
I realized that if we remove the whole tidstore control object,
including max_bytes, processes that attach to the shared tidstore
cannot actually use TidStoreIsFull(), as it would always return true
for them. They also cannot use TidStoreReset(), since it needs to pass
max_bytes to RT_CREATE(). That might not be a problem for lazy vacuum,
but it could be problematic for general use.

If we remove it, we probably need a safeguard to prevent processes
that attached to the tidstore from calling these functions, e.g. an
assertion like the rough sketch at the end of this mail. Or we can
keep the control object but remove the lock and num_tids.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
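
P.S. To illustrate the safeguard idea, I'm thinking of something like the
following. This is again a rough sketch: the "created" flag and
TidStoreMemoryUsage() are hypothetical here, and "created" would be set only
in TidStoreCreate(), not in TidStoreAttach():

bool
TidStoreIsFull(TidStore *ts)
{
    /*
     * Attaching backends don't know max_bytes, so only the backend that
     * created the store may ask whether it is full.
     */
    Assert(!TidStoreIsShared(ts) || ts->created);

    return TidStoreMemoryUsage(ts) > ts->max_bytes;
}

The same assertion would go into TidStoreReset().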