> On Apr 13, 2026, at 12:35, David Rowley <[email protected]> wrote: > > (v20 material) > > While working on some new code which required offsetting the members > of a Bitmapset, I decided to go and write a function to do this rather > than copy the various other places where we manually construct a new > set with a bms_next_member() -> bms_add_member() loop. The new use > case I have is from pulling varattnos from a scan's targetlist, which > there could be several hundred Vars in. I considered this might be > noticeably expensive. > > The current manual way we have of doing this is a bit laborious since > we're only doing 1 member per bms_next_member() loop, and also, if the > set has multiple words, we may end up doing several repallocs to > expand the set, perhaps as little as 1 word at a time. That's not to > mention the rather repetitive code that we have to do this in various > places that might be nice to consolidate. > > I've attached a patch which adds bms_offset_members(), which does > bitshifting to move the members up or down by the given offset. While > working on this I made a few choices which might be worth a revisit: > > 1) The function modifies the given set in-place rather than making a new set. > 2) The function will ERROR if any member would go above INT_MAX. These > would be inaccessible, and that seems weird/wrong. > 3) When offsetting by a negative value, members that would go below > zero fall out of the set silently. > > For #1; my original use-case that made me write this didn't need a > copy, so I wrote the function to modify the set in-place. After > hunting down and replacing the relevant existing bms_next_member() > loops with the new function, I noticed all these seem to need a copy. > Currently, I have coded the patch to do > bms_offset_members(bms_copy(set), ...), but that's a little > inefficient as it *could* result in a palloc for the copy then a > repalloc in the offset. If bms_offset_members() just created a new > set, then it could palloc() a set to the exact required size. The > counterargument to that is that I don't really want to copy the set > for my intended use case. I thought about two versions, but thought > that might be overkill. There could be a boolean parameter to define > that, but we don't do that anywhere else in bitmapset.c, or we could > avoid a copy-paste of the code with an always-inlined helper function. > I couldn't decide, so left it as-is. > > For #2, I could have equally made these fall off the top of the set, > but I thought we might want to know about it in the unlikely event > that this ever happens. > > #3 We commonly want to get rid of system columns from a > pull_varattnos() set which is offset by > FirstLowInvalidHeapAttributeNumber, so those disappearing silently is > what most use cases seem to want. I expect that's not for revisiting, > but I listed this one anyway as it flies in the face of #2. > > Patch attached. Comments welcome. > > David > <v1-0001-Introduce-bms_offset_members-function.patch>
I basically agree with design choices 1/2/3. And the implementation of v1
overall looks good to me.
The only issue I found is this overflow check:
```
+ /* Handle a positive offset (bitshift left) */
+ if (offset > 0)
+ {
+ /*
+ * An unlikely scenario, but check we're not going to create a
member
+ * greater than PG_INT32_MAX.
+ */
+ if (((uint64) new_nwords - 1) * BITS_PER_BITMAPWORD + high_bit
+ offset_bits > PG_INT32_MAX)
+ elog(ERROR, "bitmapset overflow");
```
This overflow check seems wrong. Because when high_bit + offset_bits >
BITS_PER_BITMAPWORD, new_nwords has been increased by 1, so there high_bit +
offset_bits are double counted.
To verify that, I added a new test:
```
-- 2147483583 is PG_INT32_MAX - 64, so offsetting by 1 should succeed,
-- but offsetting it by 65 should fail with overflow error
SELECT test_random_offset_operations_check_highest(2147483583, 1) AS result;
SELECT test_random_offset_operations_check_highest(2147483583, 65) AS result;
```
With v1, test_random_offset_operations_check_highest(2147483583, 1) reports an
overflow error, but it should not.
Please see the attached diff for the test I added. In that diff, I also
included a fix, and with that fix, the tests pass.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
nocfbot_chao_test.diff
Description: Binary data
