Re: [PoC] Improve dead tuple storage for lazy vacuum

Masahiko Sawada Wed, 16 Nov 2022 07:24:53 -0800

On Wed, Nov 16, 2022 at 4:39 PM John Naylor
<john.nay...@enterprisedb.com> wrote:
>
>
> On Wed, Nov 16, 2022 at 12:33 PM Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
> >
> > On Wed, Nov 16, 2022 at 1:46 PM John Naylor
> > <john.nay...@enterprisedb.com> wrote:
> > >
> > >
> > > On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada <sawada.m...@gmail.com> 
> > > wrote:
> > > > Thanks! Please let me know if there is something I can help with.
> > >
> > > I didn't get very far because the tests fail on 0004 in rt_verify_node:
> > >
> > > TRAP: failed Assert("n4->chunks[i - 1] < n4->chunks[i]"), File: 
> > > "../src/backend/lib/radixtree.c", Line: 2186, PID: 18242
> >
> > Which tests do you use to get this assertion failure? I've confirmed
> > there is a bug in 0005 patch but without it, "make check-world"
> > passed.
>
> Hmm, I started over and rebuilt and it didn't reproduce. Not sure what 
> happened, sorry for the noise.


Good to know. No problem.

> I'm attaching a test I wrote to stress test branch prediction in search, and 
> while trying it out I found two possible issues.

Thank you for testing!

>
> It's based on the random int load test, but tests search speed. Run like this:
>
> select * from bench_search_random_nodes(10 * 1000 * 1000)
>
> It also takes some care to include all the different node kinds, restricting 
> the possible keys by AND-ing with a filter. Here's a simple demo:
>
> filter = ((uint64)1<<40)-1;
> LOG:  num_keys = 9999967, height = 4, n4 = 17513814, n32 = 6320, n128 = 
> 62663, n256 = 3130
>
> Just using random integers leads to >99% using the smallest node. I wanted to 
> get close to having the same number of each, but that's difficult while still 
> using random inputs. I ended up using
>
> filter = (((uint64) 0x7F<<32) | (0x07<<24) | (0xFF<<16) | 0xFF)
>
> which gives
>
> LOG:  num_keys = 9291812, height = 4, n4 = 262144, n32 = 79603, n128 = 
> 182670, n256 = 1024
>
> Which seems okay for the task. One puzzling thing I found while trying 
> various filters is that sometimes the reported tree height would change. For 
> example:
>
> filter = (((uint64) 1<<32) | (0xFF<<24));
> LOG:  num_keys = 9999944, height = 7, n4 = 47515559, n32 = 6209, n128 = 
> 62632, n256 = 3161
>
> 1) Any idea why the tree height would be reported as 7 here? I didn't expect 
> that.

In my environment, (0xFF<<24) is 0xFFFFFFFFFF000000, not 0xFF000000.
It seems the filter should be (((uint64) 1<<32) | ((uint64)
0xFF<<24)).

>
> 2) It seems that 0004 actually causes a significant slowdown in this test (as 
> in the attached, using the second filter above and with turboboost disabled):
>
> v9 0003: 2062 2051 2050
> v9 0004: 2346 2316 2321
>
> That means my idea for the pointer struct might have some problems, at least 
> as currently implemented. Maybe in the course of separating out and polishing 
> that piece, an inefficiency will fall out. Or, it might be another reason to 
> template local and shared separately. Not sure yet. I also haven't tried to 
> adjust this test for the shared memory case.

I'll also run the test on my environment and do the investigation tomorrow.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: [PoC] Improve dead tuple storage for lazy vacuum

Reply via email to