On Wed, Sep 28, 2022 at 1:18 PM I wrote:

> Along those lines, one thing I've been thinking about is the number of
> size classes. There is a tradeoff between memory efficiency and the
> number of branches when searching/inserting. My current thinking is that
> there is too much coupling between size class and data type. Each size
> class currently uses a different data type and a different algorithm to
> search and set it, which in turn requires another branch. We've found
> that a larger number of size classes leads to poor branch prediction [1]
> and (I imagine) poor code density.
>
> I'm thinking we can use "flexible array members" for the values/pointers,
> and keep the rest of the control data in the struct the same. That way,
> we never have more than 4 actual "kinds" to code and branch on. As a
> bonus, when migrating a node to a larger size class of the same kind, we
> can simply repalloc() to the next size.
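To make that concrete, here's a minimal standalone sketch of the idea; the names and layout below are illustrative stand-ins, not the actual definitions in the attached patch:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative sketch only: one node "kind" whose control data is fixed,
 * with the values array as a flexible array member so that multiple size
 * classes share the same type and search/set code.
 */
typedef struct node_base
{
	uint16_t	count;		/* number of entries in use */
	uint8_t		fanout;		/* capacity of this size class */
	uint8_t		chunks[32];	/* control data sized for the largest class */
} node_base;

typedef struct node_leaf
{
	node_base	base;
	uint64_t	values[];	/* length depends on the size class */
} node_leaf;

/* Allocate a leaf in the size class with the given fanout. */
static node_leaf *
leaf_alloc(uint8_t fanout)
{
	node_leaf  *n = calloc(1, sizeof(node_leaf) + fanout * sizeof(uint64_t));

	n->base.fanout = fanout;
	return n;
}

/*
 * Grow to the next size class of the same kind: allocate the larger
 * size, copy the old node verbatim, and bump the fanout. With a plain
 * allocator this could be a repalloc(); in a slab context it has to be
 * the alloc/copy/free dance.
 */
static node_leaf *
leaf_grow(node_leaf *old, uint8_t new_fanout)
{
	node_leaf  *grown = leaf_alloc(new_fanout);

	memcpy(grown, old, sizeof(node_leaf) + old->base.fanout * sizeof(uint64_t));
	grown->base.fanout = new_fanout;
	free(old);
	return grown;
}
```

Since both size classes are the same C type, the search/set code never has to branch on which one it has; only the growing path cares.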

While the most important challenge right now is how best to represent and
organize the shared memory case, I wanted to get the above idea working
and out of the way, to be saved for a future time. I've attached a rough
implementation (applies on top of v9 0003) that splits node32 into two
size classes. Both share the exact same base data type, and hence the same
search/set code, so the number of "kind"s is still four; but there are now
five "size classes", which adds a new case to the "unlikely" node-growing
path. The smaller instance of node32 is a "node15", because with 15 slots
it currently comes to 160 bytes, corresponding to one of the DSA size
classes. This idea can be applied to any other node except the max size,
as we see fit. (Adding a singleton size class would bring it back in line
with the prototype, at least as far as memory consumption goes.)
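For reference, the arithmetic behind "node15 is 160 bytes" works out like this on a typical 64-bit platform. The mock structs below mirror the member layout in the patch but are otherwise invented, and the exact sizes depend on the platform's padding rules:

```c
#include <stdint.h>

/* Stand-ins mirroring the patch's member layout (not the real types);
 * sizes assume typical 64-bit alignment/padding rules. */
typedef struct rt_node_mock
{
	uint16_t	count;
	uint8_t		fanout;
	uint8_t		shift;
	uint8_t		chunk;
	uint8_t		kind;
} rt_node_mock;				/* 6 bytes of members, no padding */

typedef struct rt_node_base_32_mock
{
	rt_node_mock n;
	uint8_t		chunks[32];
} rt_node_base_32_mock;		/* 38 bytes */

typedef struct rt_node_leaf_32_mock
{
	rt_node_base_32_mock base;
	uint64_t	values[];	/* flexible array member */
} rt_node_leaf_32_mock;		/* header padded to 40 for uint64 alignment */

/* 40-byte header + 15 * 8-byte values = 160 bytes, a DSA size class */
#define NODE15_LEAF_SIZE \
	(sizeof(rt_node_leaf_32_mock) + 15 * sizeof(uint64_t))
```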

One issue with this patch: the "fanout" member is a uint8, so it can't
hold 256 for the largest node kind. That's not an issue in practice, since
we never need to grow the largest kind, and we only compare that value
with the count in an Assert(), so I just set it to zero. That does break
an invariant, so it's not great. We could use two bytes to be strictly
correct in all cases, but that limits what we can do with the smallest
node kind.

In the course of working on this, I encountered a pain point. Since it's
impossible to repalloc in a slab context, we have to do the
alloc/copy/free ourselves. That's fine, but the current coding makes too
many assumptions about the use cases: rt_alloc_node and rt_copy_node are
too entangled with each other and do too much work unrelated to what
their names imply. I seem to remember an earlier version had something
like rt_node_copy_common that did only...copying. That was much easier to
reason about. In 0002 I resorted to doing my own allocation to show what
I really want to do, because the new use case doesn't need zeroing or
setting values. It only needs to...allocate (and increase the stats
counter, if built that way).
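The separation I have in mind would look roughly like the sketch below. The names are invented (rt_node_copy_common is from memory, and the header stand-in is simplified), but the point is that each function does exactly what its name says and nothing more:

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for the common node header, not the real type. */
typedef struct rt_node_mock
{
	uint16_t	count;
	uint8_t		fanout;
	uint8_t		shift;
	uint8_t		chunk;
	uint8_t		kind;
} rt_node_mock;

/*
 * Only allocate (plus, hypothetically, bump a stats counter in debug
 * builds). No zeroing, no field initialization: callers that grow a
 * node to a larger size class of the same kind will memcpy over the
 * whole thing anyway.
 */
static inline rt_node_mock *
rt_alloc_node_only(size_t size)
{
	return malloc(size);
}

/* Only copy the common header fields shared by every kind. */
static inline void
rt_node_copy_common(rt_node_mock *dst, const rt_node_mock *src)
{
	dst->count = src->count;
	dst->shift = src->shift;
	dst->chunk = src->chunk;
	/* kind and fanout are set by the caller for the new size class */
}
```

With this split, the node-growing paths compose the two pieces they need, instead of getting zeroing and value-setting work they have to undo.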

Future optimization work, while I'm thinking of it: rt_alloc_node should
be always-inlined and the memset done separately (i.e. not *AllocZero).
That way the compiler should be able to generate more efficient zeroing
code for smaller nodes. I'll test the numbers on this sometime in the
future.
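The intuition, sketched with hypothetical names: when the zeroing happens at the call site with a compile-time-constant size, the compiler can open-code it as a few stores rather than leaving a memset of a runtime-variable size inside the allocator:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Invented stand-in for a small node type. */
typedef struct small_node
{
	uint16_t	count;
	uint8_t		fanout;
	uint64_t	values[4];
} small_node;

/* Plain allocation, meant to be always-inlined; deliberately does
 * NOT zero (i.e. the moral equivalent of *Alloc, not *AllocZero). */
static inline void *
alloc_node_raw(size_t size)
{
	return malloc(size);
}

/*
 * At this call site the size is a compile-time constant, so once
 * alloc_node_raw() is inlined the compiler is free to replace the
 * memset with a handful of direct stores sized for this node.
 */
static small_node *
alloc_small_node(void)
{
	small_node *n = alloc_node_raw(sizeof(small_node));

	memset(n, 0, sizeof(small_node));
	return n;
}
```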

--
John Naylor
EDB: http://www.enterprisedb.com
From 6fcc970ae7e31f44fa6b6aface983cadb023cc50 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Thu, 17 Nov 2022 16:10:44 +0700
Subject: [PATCH v901 2/2] Make node32 variable sized

Add a size class for 15 elements, which corresponds to 160 bytes,
an allocation size used by DSA. When a 16th element is to be
inserted, allocate a larger area and memcpy the entire old node
to it.

NB: Zeroing the new area is only necessary if it's for an
inner node128, since insert logic must check for null child
pointers.

This technique allows us to limit the node kinds to 4, which
1. limits the number of cases in switch statements
2. allows a possible future optimization to encode the node kind
in a pointer tag
---
 src/backend/lib/radixtree.c | 141 +++++++++++++++++++++++++++---------
 1 file changed, 108 insertions(+), 33 deletions(-)

diff --git a/src/backend/lib/radixtree.c b/src/backend/lib/radixtree.c
index bef1a438ab..f368e750d5 100644
--- a/src/backend/lib/radixtree.c
+++ b/src/backend/lib/radixtree.c
@@ -130,6 +130,7 @@ typedef enum
 typedef enum rt_size_class
 {
        RT_CLASS_4_FULL = 0,
+       RT_CLASS_32_PARTIAL,
        RT_CLASS_32_FULL,
        RT_CLASS_128_FULL,
        RT_CLASS_256
@@ -147,6 +148,8 @@ typedef struct rt_node
        uint16          count;
 
        /* Max number of children. We can use uint8 because we never need to 
store 256 */
+       /* WIP: if we don't have a variable sized node4, this should instead be 
in the base
+       types as needed, since saving every byte is crucial for the smallest 
node kind */
        uint8           fanout;
 
        /*
@@ -166,6 +169,8 @@ typedef struct rt_node
        ((node)->base.n.count < (node)->base.n.fanout)
 
 /* Base type of each node kinds for leaf and inner nodes */
+/* The base types must be able to accommodate the largest size
+class for variable-sized node kinds */
 typedef struct rt_node_base_4
 {
        rt_node         n;
@@ -217,40 +222,40 @@ typedef struct rt_node_inner_4
 {
        rt_node_base_4 base;
 
-       /* 4 children, for key chunks */
-       rt_node    *children[4];
+       /* number of children depends on size class */
+       rt_node    *children[FLEXIBLE_ARRAY_MEMBER];
 } rt_node_inner_4;
 
 typedef struct rt_node_leaf_4
 {
        rt_node_base_4 base;
 
-       /* 4 values, for key chunks */
-       uint64          values[4];
+       /* number of values depends on size class */
+       uint64          values[FLEXIBLE_ARRAY_MEMBER];
 } rt_node_leaf_4;
 
 typedef struct rt_node_inner_32
 {
        rt_node_base_32 base;
 
-       /* 32 children, for key chunks */
-       rt_node    *children[32];
+       /* number of children depends on size class */
+       rt_node    *children[FLEXIBLE_ARRAY_MEMBER];
 } rt_node_inner_32;
 
 typedef struct rt_node_leaf_32
 {
        rt_node_base_32 base;
 
-       /* 32 values, for key chunks */
-       uint64          values[32];
+       /* number of values depends on size class */
+       uint64          values[FLEXIBLE_ARRAY_MEMBER];
 } rt_node_leaf_32;
 
 typedef struct rt_node_inner_128
 {
        rt_node_base_128 base;
 
-       /* Slots for 128 children */
-       rt_node    *children[128];
+       /* number of children depends on size class */
+       rt_node    *children[FLEXIBLE_ARRAY_MEMBER];
 } rt_node_inner_128;
 
 typedef struct rt_node_leaf_128
@@ -260,8 +265,8 @@ typedef struct rt_node_leaf_128
        /* isset is a bitmap to track which slot is in use */
        uint8           isset[RT_NODE_NSLOTS_BITS(128)];
 
-       /* Slots for 128 values */
-       uint64          values[128];
+       /* number of values depends on size class */
+       uint64          values[FLEXIBLE_ARRAY_MEMBER];
 } rt_node_leaf_128;
 
 /*
@@ -307,32 +312,40 @@ typedef struct rt_size_class_elem
  * from the block.
  */
 #define NODE_SLAB_BLOCK_SIZE(size)     \
-       Max((SLAB_DEFAULT_BLOCK_SIZE / (size)) * size, (size) * 32)
+       Max((SLAB_DEFAULT_BLOCK_SIZE / (size)) * (size), (size) * 32)
 static rt_size_class_elem rt_size_class_info[RT_SIZE_CLASS_COUNT] = {
 
        [RT_CLASS_4_FULL] = {
                .name = "radix tree node 4",
                .fanout = 4,
-               .inner_size = sizeof(rt_node_inner_4),
-               .leaf_size = sizeof(rt_node_leaf_4),
-               .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_4)),
-               .leaf_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_4)),
+               .inner_size = sizeof(rt_node_inner_4) + 4 * sizeof(rt_node *),
+               .leaf_size = sizeof(rt_node_leaf_4) + 4 * sizeof(uint64),
+               .inner_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_4) 
+ 4 * sizeof(rt_node *)),
+               .leaf_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_4) + 
4 * sizeof(uint64)),
+       },
+       [RT_CLASS_32_PARTIAL] = {
+               .name = "radix tree node 15",
+               .fanout = 15,
+               .inner_size = sizeof(rt_node_inner_32) + 15 * sizeof(rt_node *),
+               .leaf_size = sizeof(rt_node_leaf_32) + 15 * sizeof(uint64),
+               .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_32) + 15 * sizeof(rt_node *)),
+               .leaf_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_32) 
+ 15 * sizeof(uint64)),
        },
        [RT_CLASS_32_FULL] = {
                .name = "radix tree node 32",
                .fanout = 32,
-               .inner_size = sizeof(rt_node_inner_32),
-               .leaf_size = sizeof(rt_node_leaf_32),
-               .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_32)),
-               .leaf_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_32)),
+               .inner_size = sizeof(rt_node_inner_32) + 32 * sizeof(rt_node *),
+               .leaf_size = sizeof(rt_node_leaf_32) + 32 * sizeof(uint64),
+               .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_32) + 32 * sizeof(rt_node *)),
+               .leaf_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_32) 
+ 32 * sizeof(uint64)),
        },
        [RT_CLASS_128_FULL] = {
                .name = "radix tree node 128",
                .fanout = 128,
-               .inner_size = sizeof(rt_node_inner_128),
-               .leaf_size = sizeof(rt_node_leaf_128),
-               .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_128)),
-               .leaf_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_128)),
+               .inner_size = sizeof(rt_node_inner_128) + 128 * sizeof(rt_node 
*),
+               .leaf_size = sizeof(rt_node_leaf_128) + 128 * sizeof(uint64),
+               .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_128) + 128 * sizeof(rt_node *)),
+               .leaf_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_128) 
+ 128 * sizeof(uint64)),
        },
        [RT_CLASS_256] = {
                .name = "radix tree node 256",
@@ -922,7 +935,6 @@ rt_free_node(radix_tree *tree, rt_node *node)
 
 #ifdef RT_DEBUG
        /* update the statistics */
-       // FIXME
        for (i = 0; i < RT_SIZE_CLASS_COUNT; i++)
        {
                if (node->fanout == rt_size_class_info[i].fanout)
@@ -1240,7 +1252,7 @@ rt_node_insert_inner(radix_tree *tree, rt_node *parent, 
rt_node *node, uint64 ke
 
                                        /* grow node from 4 to 32 */
                                        new32 = (rt_node_inner_32 *) 
rt_copy_node(tree, (rt_node *) n4,
-                                                                               
                                          RT_NODE_KIND_32, RT_CLASS_32_FULL);
+                                                                               
                                          RT_NODE_KIND_32, RT_CLASS_32_PARTIAL);
                                        
chunk_children_array_copy(n4->base.chunks, n4->children,
                                                                                
          new32->base.chunks, new32->children,
                                                                                
          n4->base.n.count);
@@ -1282,6 +1294,37 @@ rt_node_insert_inner(radix_tree *tree, rt_node *parent, 
rt_node *node, uint64 ke
 
                                if (unlikely(!NODE_HAS_FREE_SLOT(n32)))
                                {
+                                       Assert(parent != NULL);
+
+                                       if (n32->base.n.fanout == 
rt_size_class_info[RT_CLASS_32_PARTIAL].fanout)
+                                       {
+                                               /* use the same node kind, but 
expand to the next size class */
+
+                                               /* no need to zero the new 
memory */
+                                               rt_node_inner_32 *new32 =
+                                                       (rt_node_inner_32 *) 
MemoryContextAlloc(tree->inner_slabs[RT_CLASS_32_FULL],
+                                                                               
                         rt_size_class_info[RT_CLASS_32_FULL].inner_size);
+
+// FIXME the APIs for rt_alloc_node and rt_node_copy are too entangled
+#ifdef RT_DEBUG
+       /* update the statistics */
+       tree->cnt[RT_CLASS_32_FULL]++;
+#endif
+                                               /* copy the entire old node -- 
the new node is only different in having
+                                               additional slots so we only 
have to change the fanout */
+                                               memcpy(new32, n32, 
rt_size_class_info[RT_CLASS_32_PARTIAL].inner_size);
+                                               new32->base.n.fanout = 
rt_size_class_info[RT_CLASS_32_FULL].fanout;
+
+                                               rt_replace_node(tree, parent, 
(rt_node *) n32, (rt_node *) new32,
+                                                                               
key);
+
+                                               /* must update both pointers 
here */
+                                               node = (rt_node *) new32;
+                                               n32 = new32;
+                                               goto retry_insert_inner_32;
+                                       }
+                                       else
+                                       {
                                        rt_node_inner_128 *new128;
 
                                        /* grow node from 32 to 128 */
@@ -1290,13 +1333,14 @@ rt_node_insert_inner(radix_tree *tree, rt_node *parent, 
rt_node *node, uint64 ke
                                        for (int i = 0; i < n32->base.n.count; 
i++)
                                                node_inner_128_insert(new128, 
n32->base.chunks[i], n32->children[i]);
 
-                                       Assert(parent != NULL);
                                        rt_replace_node(tree, parent, (rt_node 
*) n32, (rt_node *) new128,
                                                                        key);
                                        node = (rt_node *) new128;
+                                       }
                                }
                                else
                                {
+retry_insert_inner_32:
                                        int                     insertpos = 
node_32_get_insertpos((rt_node_base_32 *) n32, chunk);
                                        int16           count = 
n32->base.n.count;
 
@@ -1409,12 +1453,10 @@ rt_node_insert_leaf(radix_tree *tree, rt_node *parent, 
rt_node *node,
 
                                        /* grow node from 4 to 32 */
                                        new32 = (rt_node_leaf_32 *) 
rt_copy_node(tree, (rt_node *) n4,
-                                                                               
                                         RT_NODE_KIND_32, RT_CLASS_32_FULL);
+                                                                               
                                         RT_NODE_KIND_32, RT_CLASS_32_PARTIAL);
                                        
chunk_values_array_copy(n4->base.chunks, n4->values,
                                                                                
        new32->base.chunks, new32->values,
                                                                                
        n4->base.n.count);
-
-                                       Assert(parent != NULL);
                                        rt_replace_node(tree, parent, (rt_node 
*) n4, (rt_node *) new32,
                                                                        key);
                                        node = (rt_node *) new32;
@@ -1451,6 +1493,37 @@ rt_node_insert_leaf(radix_tree *tree, rt_node *parent, 
rt_node *node,
 
                                if (unlikely(!NODE_HAS_FREE_SLOT(n32)))
                                {
+                                       Assert(parent != NULL);
+
+                                       if (n32->base.n.fanout == 
rt_size_class_info[RT_CLASS_32_PARTIAL].fanout)
+                                       {
+                                               /* use the same node kind, but 
expand to the next size class */
+
+                                               /* no need to zero the new 
memory */
+                                               rt_node_leaf_32 *new32 =
+                                                       (rt_node_leaf_32 *) 
MemoryContextAlloc(tree->leaf_slabs[RT_CLASS_32_FULL],
+                                                                               
                         rt_size_class_info[RT_CLASS_32_FULL].leaf_size);
+
+// FIXME the APIs for rt_alloc_node and rt_node_copy are too entangled
+#ifdef RT_DEBUG
+       /* update the statistics */
+       tree->cnt[RT_CLASS_32_FULL]++;
+#endif
+                                               /* copy the entire old node -- 
the new node is only different in having
+                                               additional slots so we only 
have to change the fanout */
+                                               memcpy(new32, n32, 
rt_size_class_info[RT_CLASS_32_PARTIAL].leaf_size);
+                                               new32->base.n.fanout = 
rt_size_class_info[RT_CLASS_32_FULL].fanout;
+
+                                               rt_replace_node(tree, parent, 
(rt_node *) n32, (rt_node *) new32,
+                                                                               
key);
+
+                                               /* must update both pointers 
here */
+                                               node = (rt_node *) new32;
+                                               n32 = new32;
+                                               goto retry_insert_leaf_32;
+                                       }
+                                       else
+                                       {
                                        rt_node_leaf_128 *new128;
 
                                        /* grow node from 32 to 128 */
@@ -1459,13 +1532,14 @@ rt_node_insert_leaf(radix_tree *tree, rt_node *parent, 
rt_node *node,
                                        for (int i = 0; i < n32->base.n.count; 
i++)
                                                node_leaf_128_insert(new128, 
n32->base.chunks[i], n32->values[i]);
 
-                                       Assert(parent != NULL);
                                        rt_replace_node(tree, parent, (rt_node 
*) n32, (rt_node *) new128,
                                                                        key);
                                        node = (rt_node *) new128;
+                                       }
                                }
                                else
                                {
+retry_insert_leaf_32:
                                        int                     insertpos = 
node_32_get_insertpos((rt_node_base_32 *) n32, chunk);
                                        int                     count = 
n32->base.n.count;
 
@@ -2189,10 +2263,11 @@ rt_verify_node(rt_node *node)
 void
 rt_stats(radix_tree *tree)
 {
-       ereport(LOG, (errmsg("num_keys = %lu, height = %u, n4 = %u, n32 = %u, 
n128 = %u, n256 = %u",
+       ereport(NOTICE, (errmsg("num_keys = %lu, height = %u, n4 = %u, n15 = 
%u, n32 = %u, n128 = %u, n256 = %u",
                                                 tree->num_keys,
                                                 tree->root->shift / 
RT_NODE_SPAN,
                                                 tree->cnt[RT_CLASS_4_FULL],
+                                                tree->cnt[RT_CLASS_32_PARTIAL],
                                                 tree->cnt[RT_CLASS_32_FULL],
                                                 tree->cnt[RT_CLASS_128_FULL],
                                                 tree->cnt[RT_CLASS_256])));
-- 
2.38.1

From 15e16df13912d265c3b1eda858456de6fe595c33 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Thu, 17 Nov 2022 12:10:31 +0700
Subject: [PATCH v901 1/2] Preparatory refactoring for decoupling kind from
 size class

Rename the current kind info array to refer to size classes, but
keep all the contents the same.

Add a fanout member to all nodes which stores the max capacity of
the node. This is currently set with the same hardcoded value as
in the kind info array.

In passing, remove outdated reference to node16 in the regression
test.
---
 src/backend/lib/radixtree.c                   | 147 +++++++++++-------
 .../modules/test_radixtree/test_radixtree.c   |   1 -
 2 files changed, 87 insertions(+), 61 deletions(-)

diff --git a/src/backend/lib/radixtree.c b/src/backend/lib/radixtree.c
index bd58b2bfad..bef1a438ab 100644
--- a/src/backend/lib/radixtree.c
+++ b/src/backend/lib/radixtree.c
@@ -127,6 +127,16 @@ typedef enum
 #define RT_NODE_KIND_256               0x03
 #define RT_NODE_KIND_COUNT             4
 
+typedef enum rt_size_class
+{
+       RT_CLASS_4_FULL = 0,
+       RT_CLASS_32_FULL,
+       RT_CLASS_128_FULL,
+       RT_CLASS_256
+
+#define RT_SIZE_CLASS_COUNT (RT_CLASS_256 + 1)
+} rt_size_class;
+
 /* Common type for all nodes types */
 typedef struct rt_node
 {
@@ -136,6 +146,9 @@ typedef struct rt_node
         */
        uint16          count;
 
+       /* Max number of children. We can use uint8 because we never need to 
store 256 */
+       uint8           fanout;
+
        /*
         * Shift indicates which part of the key space is represented by this
         * node. That is, the key is shifted by 'shift' and the lowest
@@ -144,13 +157,13 @@ typedef struct rt_node
        uint8           shift;
        uint8           chunk;
 
-       /* Size kind of the node */
+       /* Node kind, one per search/set algorithm */
        uint8           kind;
 } rt_node;
 #define NODE_IS_LEAF(n)                        (((rt_node *) (n))->shift == 0)
 #define NODE_IS_EMPTY(n)               (((rt_node *) (n))->count == 0)
-#define NODE_HAS_FREE_SLOT(n) \
-       (((rt_node *) (n))->count < rt_node_kind_info[((rt_node *) 
(n))->kind].fanout)
+#define NODE_HAS_FREE_SLOT(node) \
+       ((node)->base.n.count < (node)->base.n.fanout)
 
 /* Base type of each node kinds for leaf and inner nodes */
 typedef struct rt_node_base_4
@@ -190,7 +203,7 @@ typedef struct rt_node_base256
 /*
  * Inner and leaf nodes.
  *
- * There are separate from inner node size classes for two main reasons:
+ * These are separate for two main reasons:
  *
  * 1) the value type might be different than something fitting into a pointer
  *    width type
@@ -274,8 +287,8 @@ typedef struct rt_node_leaf_256
        uint64          values[RT_NODE_MAX_SLOTS];
 } rt_node_leaf_256;
 
-/* Information of each size kinds */
-typedef struct rt_node_kind_info_elem
+/* Information for each size class */
+typedef struct rt_size_class_elem
 {
        const char *name;
        int                     fanout;
@@ -287,7 +300,7 @@ typedef struct rt_node_kind_info_elem
        /* slab block size */
        Size            inner_blocksize;
        Size            leaf_blocksize;
-} rt_node_kind_info_elem;
+} rt_size_class_elem;
 
 /*
  * Calculate the slab blocksize so that we can allocate at least 32 chunks
@@ -295,9 +308,9 @@ typedef struct rt_node_kind_info_elem
  */
 #define NODE_SLAB_BLOCK_SIZE(size)     \
        Max((SLAB_DEFAULT_BLOCK_SIZE / (size)) * size, (size) * 32)
-static rt_node_kind_info_elem rt_node_kind_info[RT_NODE_KIND_COUNT] = {
+static rt_size_class_elem rt_size_class_info[RT_SIZE_CLASS_COUNT] = {
 
-       [RT_NODE_KIND_4] = {
+       [RT_CLASS_4_FULL] = {
                .name = "radix tree node 4",
                .fanout = 4,
                .inner_size = sizeof(rt_node_inner_4),
@@ -305,7 +318,7 @@ static rt_node_kind_info_elem 
rt_node_kind_info[RT_NODE_KIND_COUNT] = {
                .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_4)),
                .leaf_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_4)),
        },
-       [RT_NODE_KIND_32] = {
+       [RT_CLASS_32_FULL] = {
                .name = "radix tree node 32",
                .fanout = 32,
                .inner_size = sizeof(rt_node_inner_32),
@@ -313,7 +326,7 @@ static rt_node_kind_info_elem 
rt_node_kind_info[RT_NODE_KIND_COUNT] = {
                .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_32)),
                .leaf_blocksize = NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_32)),
        },
-       [RT_NODE_KIND_128] = {
+       [RT_CLASS_128_FULL] = {
                .name = "radix tree node 128",
                .fanout = 128,
                .inner_size = sizeof(rt_node_inner_128),
@@ -321,9 +334,11 @@ static rt_node_kind_info_elem 
rt_node_kind_info[RT_NODE_KIND_COUNT] = {
                .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_128)),
                .leaf_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_leaf_128)),
        },
-       [RT_NODE_KIND_256] = {
+       [RT_CLASS_256] = {
                .name = "radix tree node 256",
-               .fanout = 256,
+               /* technically it's 256, but we can't store that in a uint8,
+                 and this is the max size class so it will never grow */
+               .fanout = 0,
                .inner_size = sizeof(rt_node_inner_256),
                .leaf_size = sizeof(rt_node_leaf_256),
                .inner_blocksize = 
NODE_SLAB_BLOCK_SIZE(sizeof(rt_node_inner_256)),
@@ -372,17 +387,17 @@ struct radix_tree
        uint64          max_val;
        uint64          num_keys;
 
-       MemoryContextData *inner_slabs[RT_NODE_KIND_COUNT];
-       MemoryContextData *leaf_slabs[RT_NODE_KIND_COUNT];
+       MemoryContextData *inner_slabs[RT_SIZE_CLASS_COUNT];
+       MemoryContextData *leaf_slabs[RT_SIZE_CLASS_COUNT];
 
        /* statistics */
 #ifdef RT_DEBUG
-       int32           cnt[RT_NODE_KIND_COUNT];
+       int32           cnt[RT_SIZE_CLASS_COUNT];
 #endif
 };
 
 static void rt_new_root(radix_tree *tree, uint64 key);
-static rt_node *rt_alloc_node(radix_tree *tree, int kind, uint8 shift, uint8 
chunk,
+static rt_node *rt_alloc_node(radix_tree *tree, int kind, rt_size_class 
size_class, uint8 shift, uint8 chunk,
                                                          bool inner);
 static void rt_free_node(radix_tree *tree, rt_node *node);
 static void rt_extend(radix_tree *tree, uint64 key);
@@ -584,7 +599,7 @@ chunk_children_array_copy(uint8 *src_chunks, rt_node 
**src_children,
                                                  uint8 *dst_chunks, rt_node 
**dst_children, int count)
 {
        /* For better code generation */
-       if (count > rt_node_kind_info[RT_NODE_KIND_4].fanout)
+       if (count > rt_size_class_info[RT_CLASS_4_FULL].fanout)
                pg_unreachable();
 
        memcpy(dst_chunks, src_chunks, sizeof(uint8) * count);
@@ -596,7 +611,7 @@ chunk_values_array_copy(uint8 *src_chunks, uint64 
*src_values,
                                                uint8 *dst_chunks, uint64 
*dst_values, int count)
 {
        /* For better code generation */
-       if (count > rt_node_kind_info[RT_NODE_KIND_4].fanout)
+       if (count > rt_size_class_info[RT_CLASS_4_FULL].fanout)
                pg_unreachable();
 
        memcpy(dst_chunks, src_chunks, sizeof(uint8) * count);
@@ -837,7 +852,7 @@ rt_new_root(radix_tree *tree, uint64 key)
        int                     shift = key_get_shift(key);
        rt_node    *node;
 
-       node = (rt_node *) rt_alloc_node(tree, RT_NODE_KIND_4, shift, 0,
+       node = (rt_node *) rt_alloc_node(tree, RT_NODE_KIND_4, RT_CLASS_4_FULL, 
shift, 0,
                                                                         shift 
> 0);
        tree->max_val = shift_get_max_val(shift);
        tree->root = node;
@@ -847,18 +862,19 @@ rt_new_root(radix_tree *tree, uint64 key)
  * Allocate a new node with the given node kind.
  */
 static rt_node *
-rt_alloc_node(radix_tree *tree, int kind, uint8 shift, uint8 chunk, bool inner)
+rt_alloc_node(radix_tree *tree, int kind, rt_size_class size_class, uint8 
shift, uint8 chunk, bool inner)
 {
        rt_node    *newnode;
 
        if (inner)
-               newnode = (rt_node *) 
MemoryContextAllocZero(tree->inner_slabs[kind],
-                                                                               
                         rt_node_kind_info[kind].inner_size);
+               newnode = (rt_node *) 
MemoryContextAllocZero(tree->inner_slabs[size_class],
+                                                                               
                         rt_size_class_info[size_class].inner_size);
        else
-               newnode = (rt_node *) 
MemoryContextAllocZero(tree->leaf_slabs[kind],
-                                                                               
                         rt_node_kind_info[kind].leaf_size);
+               newnode = (rt_node *) 
MemoryContextAllocZero(tree->leaf_slabs[size_class],
+                                                                               
                         rt_size_class_info[size_class].leaf_size);
 
        newnode->kind = kind;
+       newnode->fanout = rt_size_class_info[size_class].fanout;
        newnode->shift = shift;
        newnode->chunk = chunk;
 
@@ -872,7 +888,7 @@ rt_alloc_node(radix_tree *tree, int kind, uint8 shift, 
uint8 chunk, bool inner)
 
 #ifdef RT_DEBUG
        /* update the statistics */
-       tree->cnt[kind]++;
+       tree->cnt[size_class]++;
 #endif
 
        return newnode;
@@ -883,11 +899,11 @@ rt_alloc_node(radix_tree *tree, int kind, uint8 shift, 
uint8 chunk, bool inner)
  * count of 'node'.
  */
 static rt_node *
-rt_copy_node(radix_tree *tree, rt_node *node, int new_kind)
+rt_copy_node(radix_tree *tree, rt_node *node, int new_kind, rt_size_class 
new_size_class)
 {
        rt_node    *newnode;
 
-       newnode = rt_alloc_node(tree, new_kind, node->shift, node->chunk,
+       newnode = rt_alloc_node(tree, new_kind, new_size_class, node->shift, 
node->chunk,
                                                        node->shift > 0);
        newnode->count = node->count;
 
@@ -898,14 +914,22 @@ rt_copy_node(radix_tree *tree, rt_node *node, int 
new_kind)
 static void
 rt_free_node(radix_tree *tree, rt_node *node)
 {
+       int i;
+
        /* If we're deleting the root node, make the tree empty */
        if (tree->root == node)
                tree->root = NULL;
 
 #ifdef RT_DEBUG
        /* update the statistics */
-       tree->cnt[node->kind]--;
-       Assert(tree->cnt[node->kind] >= 0);
+       // FIXME
+       for (i = 0; i < RT_SIZE_CLASS_COUNT; i++)
+       {
+               if (node->fanout == rt_size_class_info[i].fanout)
+                       break;
+       }
+       tree->cnt[i]--;
+       Assert(tree->cnt[i] >= 0);
 #endif
 
        pfree(node);
@@ -954,7 +978,7 @@ rt_extend(radix_tree *tree, uint64 key)
        {
                rt_node_inner_4 *node;
 
-               node = (rt_node_inner_4 *) rt_alloc_node(tree, RT_NODE_KIND_4,
+               node = (rt_node_inner_4 *) rt_alloc_node(tree, RT_NODE_KIND_4, 
RT_CLASS_4_FULL,
                                                                                
                 shift, 0, true);
                node->base.n.count = 1;
                node->base.chunks[0] = 0;
@@ -984,7 +1008,7 @@ rt_set_extend(radix_tree *tree, uint64 key, uint64 value, 
rt_node *parent,
                rt_node    *newchild;
                int                     newshift = shift - RT_NODE_SPAN;
 
-               newchild = rt_alloc_node(tree, RT_NODE_KIND_4, newshift,
+               newchild = rt_alloc_node(tree, RT_NODE_KIND_4, RT_CLASS_4_FULL, newshift,
                                                                 RT_GET_KEY_CHUNK(key, node->shift),
                                                                 newshift > 0);
                rt_node_insert_inner(tree, parent, node, key, newchild);
@@ -1216,7 +1240,7 @@ rt_node_insert_inner(radix_tree *tree, rt_node *parent, rt_node *node, uint64 ke
 
                                        /* grow node from 4 to 32 */
                                        new32 = (rt_node_inner_32 *) rt_copy_node(tree, (rt_node *) n4,
-                                                                                                                 RT_NODE_KIND_32);
+                                                                                                                 RT_NODE_KIND_32, RT_CLASS_32_FULL);
                                        chunk_children_array_copy(n4->base.chunks, n4->children,
                                                                                          new32->base.chunks, new32->children,
                                                                                          n4->base.n.count);
@@ -1262,7 +1286,7 @@ rt_node_insert_inner(radix_tree *tree, rt_node *parent, rt_node *node, uint64 ke
 
                                        /* grow node from 32 to 128 */
                                        new128 = (rt_node_inner_128 *) rt_copy_node(tree, (rt_node *) n32,
-                                                                                                                       RT_NODE_KIND_128);
+                                                                                                                       RT_NODE_KIND_128, RT_CLASS_128_FULL);
                                        for (int i = 0; i < n32->base.n.count; i++)
                                                node_inner_128_insert(new128, n32->base.chunks[i], n32->children[i]);
 
@@ -1305,7 +1329,7 @@ rt_node_insert_inner(radix_tree *tree, rt_node *parent, rt_node *node, uint64 ke
 
                                        /* grow node from 128 to 256 */
                                        new256 = (rt_node_inner_256 *) rt_copy_node(tree, (rt_node *) n128,
-                                                                                                                       RT_NODE_KIND_256);
+                                                                                                                       RT_NODE_KIND_256, RT_CLASS_256);
                                        for (int i = 0; i < RT_NODE_MAX_SLOTS && cnt < n128->base.n.count; i++)
                                        {
                                                if (!node_128_is_chunk_used((rt_node_base_128 *) n128, i))
@@ -1332,7 +1356,8 @@ rt_node_insert_inner(radix_tree *tree, rt_node *parent, rt_node *node, uint64 ke
                                rt_node_inner_256 *n256 = (rt_node_inner_256 *) node;

                                chunk_exists = node_inner_256_is_chunk_used(n256, chunk);
-                               Assert(chunk_exists || NODE_HAS_FREE_SLOT(n256));
+                               Assert(n256->base.n.fanout == 0);
+                               Assert(chunk_exists || ((rt_node *) n256)->count < RT_NODE_MAX_SLOTS);
 
                                node_inner_256_set(n256, chunk, child);
                                break;
@@ -1384,7 +1409,7 @@ rt_node_insert_leaf(radix_tree *tree, rt_node *parent, rt_node *node,
 
                                        /* grow node from 4 to 32 */
                                        new32 = (rt_node_leaf_32 *) rt_copy_node(tree, (rt_node *) n4,
-                                                                                                                RT_NODE_KIND_32);
+                                                                                                                RT_NODE_KIND_32, RT_CLASS_32_FULL);
                                        chunk_values_array_copy(n4->base.chunks, n4->values,
                                                                                        new32->base.chunks, new32->values,
                                                                                        n4->base.n.count);
@@ -1430,7 +1455,7 @@ rt_node_insert_leaf(radix_tree *tree, rt_node *parent, rt_node *node,
 
                                        /* grow node from 32 to 128 */
                                        new128 = (rt_node_leaf_128 *) rt_copy_node(tree, (rt_node *) n32,
-                                                                                                                  RT_NODE_KIND_128);
+                                                                                                                  RT_NODE_KIND_128, RT_CLASS_128_FULL);
                                        for (int i = 0; i < n32->base.n.count; i++)
                                                node_leaf_128_insert(new128, n32->base.chunks[i], n32->values[i]);
 
@@ -1473,7 +1498,7 @@ rt_node_insert_leaf(radix_tree *tree, rt_node *parent, rt_node *node,
 
                                        /* grow node from 128 to 256 */
                                        new256 = (rt_node_leaf_256 *) rt_copy_node(tree, (rt_node *) n128,
-                                                                                                                  RT_NODE_KIND_256);
+                                                                                                                  RT_NODE_KIND_256, RT_CLASS_256);
                                        for (int i = 0; i < RT_NODE_MAX_SLOTS && cnt < n128->base.n.count; i++)
                                        {
                                                if (!node_128_is_chunk_used((rt_node_base_128 *) n128, i))
@@ -1500,7 +1525,8 @@ rt_node_insert_leaf(radix_tree *tree, rt_node *parent, rt_node *node,
                                rt_node_leaf_256 *n256 = (rt_node_leaf_256 *) node;

                                chunk_exists = node_leaf_256_is_chunk_used(n256, chunk);
-                               Assert(chunk_exists || NODE_HAS_FREE_SLOT(n256));
+                               Assert(((rt_node *) n256)->fanout == 0);
+                               Assert(chunk_exists || ((rt_node *) n256)->count < 256);
 
                                node_leaf_256_set(n256, chunk, value);
                                break;
@@ -1538,16 +1564,16 @@ rt_create(MemoryContext ctx)
        tree->num_keys = 0;
 
        /* Create the slab allocator for each size class */
-       for (int i = 0; i < RT_NODE_KIND_COUNT; i++)
+       for (int i = 0; i < RT_SIZE_CLASS_COUNT; i++)
        {
                tree->inner_slabs[i] = SlabContextCreate(ctx,
-                                                                                rt_node_kind_info[i].name,
-                                                                                rt_node_kind_info[i].inner_blocksize,
-                                                                                rt_node_kind_info[i].inner_size);
+                                                                                rt_size_class_info[i].name,
+                                                                                rt_size_class_info[i].inner_blocksize,
+                                                                                rt_size_class_info[i].inner_size);
                tree->leaf_slabs[i] = SlabContextCreate(ctx,
-                                                                               rt_node_kind_info[i].name,
-                                                                               rt_node_kind_info[i].leaf_blocksize,
-                                                                               rt_node_kind_info[i].leaf_size);
+                                                                               rt_size_class_info[i].name,
+                                                                               rt_size_class_info[i].leaf_blocksize,
+                                                                               rt_size_class_info[i].leaf_size);
 #ifdef RT_DEBUG
                tree->cnt[i] = 0;
 #endif
@@ -1564,7 +1590,7 @@ rt_create(MemoryContext ctx)
 void
 rt_free(radix_tree *tree)
 {
-       for (int i = 0; i < RT_NODE_KIND_COUNT; i++)
+       for (int i = 0; i < RT_SIZE_CLASS_COUNT; i++)
        {
                MemoryContextDelete(tree->inner_slabs[i]);
                MemoryContextDelete(tree->leaf_slabs[i]);
@@ -2076,7 +2102,7 @@ rt_memory_usage(radix_tree *tree)
 {
        Size            total = sizeof(radix_tree);
 
-       for (int i = 0; i < RT_NODE_KIND_COUNT; i++)
+       for (int i = 0; i < RT_SIZE_CLASS_COUNT; i++)
        {
                total += MemoryContextMemAllocated(tree->inner_slabs[i], true);
                total += MemoryContextMemAllocated(tree->leaf_slabs[i], true);
@@ -2166,10 +2192,10 @@ rt_stats(radix_tree *tree)
        ereport(LOG, (errmsg("num_keys = %lu, height = %u, n4 = %u, n32 = %u, n128 = %u, n256 = %u",
                                                 tree->num_keys,
                                                 tree->root->shift / RT_NODE_SPAN,
-                                                tree->cnt[0],
-                                                tree->cnt[1],
-                                                tree->cnt[2],
-                                                tree->cnt[3])));
+                                                tree->cnt[RT_CLASS_4_FULL],
+                                                tree->cnt[RT_CLASS_32_FULL],
+                                                tree->cnt[RT_CLASS_128_FULL],
+                                                tree->cnt[RT_CLASS_256])));
 }
 
 static void
@@ -2177,11 +2203,12 @@ rt_dump_node(rt_node *node, int level, bool recurse)
 {
        char            space[128] = {0};
 
-       fprintf(stderr, "[%s] kind %d, count %u, shift %u, chunk 0x%X:\n",
+       fprintf(stderr, "[%s] kind %d, fanout %d, count %u, shift %u, chunk 0x%X:\n",
                        NODE_IS_LEAF(node) ? "LEAF" : "INNR",
                        (node->kind == RT_NODE_KIND_4) ? 4 :
                        (node->kind == RT_NODE_KIND_32) ? 32 :
                        (node->kind == RT_NODE_KIND_128) ? 128 : 256,
+                       node->fanout == 0 ? 256 : node->fanout,
                        node->count, node->shift, node->chunk);
 
        if (level > 0)
@@ -2384,13 +2411,13 @@ rt_dump_search(radix_tree *tree, uint64 key)
 void
 rt_dump(radix_tree *tree)
 {
-       for (int i = 0; i < RT_NODE_KIND_COUNT; i++)
+       for (int i = 0; i < RT_SIZE_CLASS_COUNT; i++)
                fprintf(stderr, "%s\tinner_size%lu\tinner_blocksize %lu\tleaf_size %lu\tleaf_blocksize %lu\n",
-                               rt_node_kind_info[i].name,
-                               rt_node_kind_info[i].inner_size,
-                               rt_node_kind_info[i].inner_blocksize,
-                               rt_node_kind_info[i].leaf_size,
-                               rt_node_kind_info[i].leaf_blocksize);
+                               rt_size_class_info[i].name,
+                               rt_size_class_info[i].inner_size,
+                               rt_size_class_info[i].inner_blocksize,
+                               rt_size_class_info[i].leaf_size,
+                               rt_size_class_info[i].leaf_blocksize);
        fprintf(stderr, "max_val = %lu\n", tree->max_val);
 
        if (!tree->root)
diff --git a/src/test/modules/test_radixtree/test_radixtree.c b/src/test/modules/test_radixtree/test_radixtree.c
index cb3596755d..de1cd6cd70 100644
--- a/src/test/modules/test_radixtree/test_radixtree.c
+++ b/src/test/modules/test_radixtree/test_radixtree.c
@@ -40,7 +40,6 @@ static const bool rt_test_stats = false;
 /* The maximum number of entries each node type can have */
 static int     rt_node_max_entries[] = {
        4,                                                      /* RT_NODE_KIND_4 */
-       16,                                                     /* RT_NODE_KIND_16 */
        32,                                                     /* RT_NODE_KIND_32 */
        128,                                            /* RT_NODE_KIND_128 */
        256                                                     /* RT_NODE_KIND_256 */
-- 
2.38.1
