Re: [PoC] Improve dead tuple storage for lazy vacuum

Masahiko Sawada Sun, 15 Jan 2023 18:53:21 -0800

On Fri, Dec 23, 2022 at 4:33 PM John Naylor
<[email protected]> wrote:
>
>
>
> On Thu, Dec 22, 2022 at 10:00 PM Masahiko Sawada <[email protected]> 
> wrote:
>
> > If the value is a power of 2, it seems to work perfectly fine. But for
> > example if it's 700MB, the total memory exceeds the limit:
> >
> > 2*(1+2+4+8+16+32+64+128) = 510MB (72.8% of 700MB) -> keep going
> > 510 + 256 = 766MB -> stop but it exceeds the limit.
> >
> > In a bigger case, if it's 11000MB,
> >
> > 2*(1+2+...+2048) = 8190MB (74.4%)
> > 8190 + 4096 = 12286MB
> >
> > That being said, I don't think these are common cases. So the 75%
> > threshold seems to work fine in most cases.
>
> Thinking some more, I agree this doesn't have large practical risk, but 
> thinking from the point of view of the community, being loose with memory 
> limits by up to 10% is not a good precedent.


Agreed.

> Perhaps we can be clever and use 75% when the limit is a power of two and 50% 
> otherwise. I'm skeptical of trying to be clever, and I just thought of an 
> additional concern: We're assuming behavior of the growth in size of new DSA 
> segments, which could possibly change. Given how allocators are typically 
> coded, though, it seems safe to assume that they'll at most double in size.

Sounds good to me.

I've written a simple script to simulate the DSA memory usage and the
limit. The 75% limit works fine for power-of-two cases, and we can
use a 60% limit for the other cases (it seems we could go up to about
66%, but I used 60% for safety). It would be best if we could prove
this mathematically, but I could only prove the power-of-two case.
Still, the script shows that in practice the 60% threshold works for
the other cases.
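
As a quick sanity check of the arithmetic in the quoted examples, here is a
minimal sketch (separate from the attached script). It assumes, as in those
examples, that DSA segments start at 1MB, come in pairs of each size, and
double in size. The names dsa_totals and usage_at_stop are made up for
illustration only.

```python
from itertools import count


def dsa_totals():
    """Yield the cumulative DSA size in MB after each new segment.

    Assumption: two segments per size, sizes doubling from 1MB,
    i.e. segments of 1, 1, 2, 2, 4, 4, ... MB.
    """
    total = 0
    for exp in count():
        for _ in range(2):
            total += 1 << exp
            yield total


def usage_at_stop(limit_mb, ratio):
    """Return the first cumulative total that reaches ratio * limit_mb."""
    for total in dsa_totals():
        if total >= limit_mb * ratio:
            return total


# Compare the 75% and 60% thresholds for the two limits discussed above.
for limit in (700, 11000):
    for ratio in (0.75, 0.60):
        used = usage_at_stop(limit, ratio)
        status = "exceeds limit" if used > limit else "ok"
        print("limit=%dMB ratio=%d%% usage=%dMB ... %s"
              % (limit, round(ratio * 100), used, status))
```

With these assumptions, the 75% threshold stops at 766MB for a 700MB limit
and at 12286MB for an 11000MB limit, while the 60% threshold stops at 510MB
and 8190MB respectively.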

Regards

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

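# Prepare an array for the total amount of memory allocated in DSA after each
# new segment. Segment sizes double, with two segments of each size
# (1, 1, 2, 2, 4, 4, ...), so the totals are 1, 2, 4, 6, 10, 14, ...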
dsa_total_memory = [0] * 40
dsa_total_memory[0] = 1
dsa_total_memory[1] = dsa_total_memory[0] + 1
for i in range(1, 20):
    dsa_memory = 1 << (i)
    dsa_total_memory[i * 2] = dsa_total_memory[(i * 2) - 1] + dsa_memory
    dsa_total_memory[(i * 2) + 1] = dsa_total_memory[i * 2] + dsa_memory
#print(dsa_total_memory)

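# The array 'a' has values for maintenance_work_mem, i.e., the memory limit. For each
# value in the given array 'a', this function checks whether the 'ratio' threshold
# (given as a fraction, e.g. 0.6) would work: the DSA total at the point where it
# first reaches the threshold must not exceed maintenance_work_mem.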
def check(a, ratio, verbose):
    ret = True
    for i in a:
        have_ng = False
        mwm = i
        mwm_limit = mwm * ratio
        for dsa in dsa_total_memory:
            if mwm_limit <= dsa:
                # The DSA memory usage exceeded the N% threshold. Check also
                # if it exceeded the maintenance_work_mem too.
                if mwm < dsa:
                    if verbose:
                        print("total=%d %d%%=%d ... FAIL (usage=%d)" % (mwm, ratio * 100, mwm_limit, dsa))
                    ret = False
                    have_ng = True
                break
        if not have_ng and verbose:
            print("all OK (total=%d %d%%=%d)" % (mwm, ratio * 100, mwm_limit))

    return ret

def check_ratio(a, ratio_begin=60, ratio_end=76, ratio_incr=1):
    for ratio in range(ratio_begin, ratio_end, ratio_incr):
        ret = check(a, ratio / 100, False)
        if ret:
            print("ratio %d%% ok" % ratio)
        else:
            print("ratio %d%% FAILED" % ratio)

# power-of-2, such as 2048kB, 4096kB, 8192kB, ...
po2 = []
for i in range(1, 20):
    po2.append(1 << (i + 10))

# from 1024kB (1MB) up to, but not including, 2000MB, in steps of 1024kB
normal_1024 = []
for i in range(1, 2000):
    normal_1024.append(1024 * i)

# non-multiples of 1024: 1024 + 1000 * i for i = 1..1999 (2024, 3024, ...)
normal_1000 = []
for i in range(1, 2000):
    normal_1000.append(1024 + (1000 * i))

print("test power-of-2")
check_ratio(po2)
print("test normal 1024")
check_ratio(normal_1024)
print("test normal 1000")
check_ratio(normal_1000)
