On Wed, Nov 16, 2022 at 4:39 PM John Naylor <john.nay...@enterprisedb.com> wrote: > > > On Wed, Nov 16, 2022 at 12:33 PM Masahiko Sawada <sawada.m...@gmail.com> > wrote: > > > > On Wed, Nov 16, 2022 at 1:46 PM John Naylor > > <john.nay...@enterprisedb.com> wrote: > > > > > > > > > On Tue, Nov 15, 2022 at 11:59 AM Masahiko Sawada <sawada.m...@gmail.com> > > > wrote: > > > > Thanks! Please let me know if there is something I can help with. > > > > > > I didn't get very far because the tests fail on 0004 in rt_verify_node: > > > > > > TRAP: failed Assert("n4->chunks[i - 1] < n4->chunks[i]"), File: > > > "../src/backend/lib/radixtree.c", Line: 2186, PID: 18242 > > > > Which tests do you use to get this assertion failure? I've confirmed > > there is a bug in 0005 patch but without it, "make check-world" > > passed. > > Hmm, I started over and rebuilt and it didn't reproduce. Not sure what > happened, sorry for the noise.
Good to know. No problem. > I'm attaching a test I wrote to stress test branch prediction in search, and > while trying it out I found two possible issues. Thank you for testing! > > It's based on the random int load test, but tests search speed. Run like this: > > select * from bench_search_random_nodes(10 * 1000 * 1000) > > It also takes some care to include all the different node kinds, restricting > the possible keys by AND-ing with a filter. Here's a simple demo: > > filter = ((uint64)1<<40)-1; > LOG: num_keys = 9999967, height = 4, n4 = 17513814, n32 = 6320, n128 = > 62663, n256 = 3130 > > Just using random integers leads to >99% using the smallest node. I wanted to > get close to having the same number of each, but that's difficult while still > using random inputs. I ended up using > > filter = (((uint64) 0x7F<<32) | (0x07<<24) | (0xFF<<16) | 0xFF) > > which gives > > LOG: num_keys = 9291812, height = 4, n4 = 262144, n32 = 79603, n128 = > 182670, n256 = 1024 > > Which seems okay for the task. One puzzling thing I found while trying > various filters is that sometimes the reported tree height would change. For > example: > > filter = (((uint64) 1<<32) | (0xFF<<24)); > LOG: num_keys = 9999944, height = 7, n4 = 47515559, n32 = 6209, n128 = > 62632, n256 = 3161 > > 1) Any idea why the tree height would be reported as 7 here? I didn't expect > that. In my environment, (0xFF<<24) is 0xFFFFFFFFFF000000, not 0xFF000000. It seems the filter should be (((uint64) 1<<32) | ((uint64) 0xFF<<24)). > > 2) It seems that 0004 actually causes a significant slowdown in this test (as > in the attached, using the second filter above and with turboboost disabled): > > v9 0003: 2062 2051 2050 > v9 0004: 2346 2316 2321 > > That means my idea for the pointer struct might have some problems, at least > as currently implemented. Maybe in the course of separating out and polishing > that piece, an inefficiency will fall out. Or, it might be another reason to > template local and shared separately. Not sure yet. I also haven't tried to > adjust this test for the shared memory case. I'll also run the test on my environment and do the investigation tomorrow. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com