On Sat, Mar 27, 2021 at 4:52 AM Andrey Borodin <x4...@yandex-team.ru> wrote:
> Some thoughts on HashTable patch:
> 1. Can we allocate bigger hashtable to reduce probability of collisions?

Yeah, good idea, might require some study.

> 2. Can we use specialised hashtable for this case? I'm afraid hash_search()
> does comparable number of CPU cycles as simple cycle from 0 to 128. We could
> inline everything and avoid hashp->hash(keyPtr, hashp->keysize) call. I'm not
> insisting on special hash though, just an idea.

I tried really hard to not fall into this rabbit h.... [hack hack hack],
OK, here's a first attempt to use simplehash, Andres's steampunk
macro-based robinhood template that we're already using for several
other things, and murmurhash which is inlineable and branch-free. I
had to tweak it to support "in-place" creation and fixed size (in other
words, no allocators, for use in shared memory). Then I was annoyed
that I had to add a "status" member to our struct, so I tried to fix
that. Definitely needs more work to think about failure modes when
running out of memory, how much spare space you need, etc. I have not
experimented with this much beyond hacking until the tests pass, but it
*should* be more efficient...

> 3. pageno in SlruMappingTableEntry seems to be unused.

It's the key (dynahash uses the first N bytes of your struct as the
key, but in this new simplehash version it's more explicit).
From 5f5d4ed8ae2808766ac1fd48f68602ef530e3833 Mon Sep 17 00:00:00 2001 From: Andrey Borodin <amborodin@acm.org> Date: Mon, 15 Feb 2021 21:51:56 +0500 Subject: [PATCH v13 1/5] Make all SLRU buffer sizes configurable. Provide new GUCs to set the number of buffers, instead of using hard coded defaults. Author: Andrey M. Borodin <x4mmm@yandex-team.ru> Reviewed-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Gilles Darold <gilles@darold.net> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/2BEC2B3F-9B61-4C1D-9FB5-5FAB0F05EF86%40yandex-team.ru --- doc/src/sgml/config.sgml | 137 ++++++++++++++++++ src/backend/access/transam/clog.c | 6 + src/backend/access/transam/commit_ts.c | 5 +- src/backend/access/transam/multixact.c | 8 +- src/backend/access/transam/subtrans.c | 5 +- src/backend/commands/async.c | 8 +- src/backend/storage/lmgr/predicate.c | 4 +- src/backend/utils/init/globals.c | 8 + src/backend/utils/misc/guc.c | 77 ++++++++++ src/backend/utils/misc/postgresql.conf.sample | 9 ++ src/include/access/multixact.h | 4 - src/include/access/subtrans.h | 3 - src/include/commands/async.h | 5 - src/include/miscadmin.h | 7 + src/include/storage/predicate.h | 4 - 15 files changed, 261 insertions(+), 29 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index ddc6d789d8..f1112bfa9c 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1886,6 +1886,143 @@ include_dir 'conf.d' </para> </listitem> </varlistentry> + + <varlistentry id="guc-multixact-offsets-buffers" xreflabel="multixact_offsets_buffers"> + <term><varname>multixact_offsets_buffers</varname> (<type>integer</type>) + <indexterm> + <primary><varname>multixact_offsets_buffers</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the amount of shared memory to be used
to cache the contents + of <literal>pg_multixact/offsets</literal> (see + <xref linkend="pgdata-contents-table"/>). + If this value is specified without units, it is taken as blocks, + that is <symbol>BLCKSZ</symbol> bytes, typically 8kB. + The default value is <literal>8</literal>. + This parameter can only be set at server start. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-multixact-members-buffers" xreflabel="multixact_members_buffers"> + <term><varname>multixact_members_buffers</varname> (<type>integer</type>) + <indexterm> + <primary><varname>multixact_members_buffers</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the amount of shared memory to be used to cache the contents + of <literal>pg_multixact/members</literal> (see + <xref linkend="pgdata-contents-table"/>). + If this value is specified without units, it is taken as blocks, + that is <symbol>BLCKSZ</symbol> bytes, typically 8kB. + The default value is <literal>16</literal>. + This parameter can only be set at server start. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-subtrans-buffers" xreflabel="subtrans_buffers"> + <term><varname>subtrans_buffers</varname> (<type>integer</type>) + <indexterm> + <primary><varname>subtrans_buffers</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the amount of shared memory to be used to cache the contents + of <literal>pg_subtrans</literal> (see + <xref linkend="pgdata-contents-table"/>). + If this value is specified without units, it is taken as blocks, + that is <symbol>BLCKSZ</symbol> bytes, typically 8kB. + The default value is <literal>32</literal>. + This parameter can only be set at server start. 
+ </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-notify-buffers" xreflabel="notify_buffers"> + <term><varname>notify_buffers</varname> (<type>integer</type>) + <indexterm> + <primary><varname>notify_buffers</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the amount of shared memory to be used to cache the contents + of <literal>pg_notify</literal> (see + <xref linkend="pgdata-contents-table"/>). + If this value is specified without units, it is taken as blocks, + that is <symbol>BLCKSZ</symbol> bytes, typically 8kB. + The default value is <literal>8</literal>. + This parameter can only be set at server start. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-serial-buffers" xreflabel="serial_buffers"> + <term><varname>serial_buffers</varname> (<type>integer</type>) + <indexterm> + <primary><varname>serial_buffers</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the amount of shared memory to be used to cache the contents + of <literal>pg_serial</literal> (see + <xref linkend="pgdata-contents-table"/>). + If this value is specified without units, it is taken as blocks, + that is <symbol>BLCKSZ</symbol> bytes, typically 8kB. + The default value is <literal>16</literal>. + This parameter can only be set at server start. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-clog-buffers" xreflabel="clog_buffers"> + <term><varname>clog_buffers</varname> (<type>integer</type>) + <indexterm> + <primary><varname>clog_buffers</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the amount of shared memory to be used to cache the contents + of <literal>pg_xact</literal> (see + <xref linkend="pgdata-contents-table"/>). + If this value is specified without units, it is taken as blocks, + that is <symbol>BLCKSZ</symbol> bytes, typically 8kB. 
+ The default value is <literal>0</literal>, which requests + <varname>shared_buffers</varname> / 512, but not more than 128 or + fewer than 4 blocks. + This parameter can only be set at server start. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-commit-ts-buffers" xreflabel="commit_ts_buffers"> + <term><varname>commit_ts_buffers</varname> (<type>integer</type>) + <indexterm> + <primary><varname>commit_ts_buffers</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the amount of memory to be used to cache the contents of + <literal>pg_commit_ts</literal> (see + <xref linkend="pgdata-contents-table"/>). + If this value is specified without units, it is taken as blocks, + that is <symbol>BLCKSZ</symbol> bytes, typically 8kB. + The default value is <literal>0</literal>, which requests + <varname>shared_buffers</varname> / 512, but not more than 16 or + fewer than 4 blocks. + This parameter can only be set at server start. + </para> + </listitem> + </varlistentry> <varlistentry id="guc-max-stack-depth" xreflabel="max_stack_depth"> <term><varname>max_stack_depth</varname> (<type>integer</type>) diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c index 6fa4713fb4..0318e8ff59 100644 --- a/src/backend/access/transam/clog.c +++ b/src/backend/access/transam/clog.c @@ -659,6 +659,9 @@ TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn) /* * Number of shared CLOG buffers. * + * If the value is configured via GUC, just use the given value. Otherwise, + * apply the following heuristics. + * * On larger multi-processor systems, it is possible to have many CLOG page * requests in flight at one time which could lead to disk access for CLOG * page if the required page is not found in memory. 
Testing revealed that we @@ -675,6 +678,9 @@ TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn) Size CLOGShmemBuffers(void) { + /* consider 0 and 1 as unset GUC */ + if (clog_buffers > 1) + return clog_buffers; return Min(128, Max(4, NBuffers / 512)); } diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c index 268bdba339..0d2632b90e 100644 --- a/src/backend/access/transam/commit_ts.c +++ b/src/backend/access/transam/commit_ts.c @@ -530,7 +530,10 @@ pg_xact_commit_timestamp_origin(PG_FUNCTION_ARGS) Size CommitTsShmemBuffers(void) { - return Min(16, Max(4, NBuffers / 1024)); + /* consider 0 and 1 as unset GUC */ + if (commit_ts_buffers > 1) + return commit_ts_buffers; + return Min(16, Max(4, NBuffers / 512)); } /* diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c index 1f9f1a1fa1..21787765e2 100644 --- a/src/backend/access/transam/multixact.c +++ b/src/backend/access/transam/multixact.c @@ -1831,8 +1831,8 @@ MultiXactShmemSize(void) mul_size(sizeof(MultiXactId) * 2, MaxOldestSlot)) size = SHARED_MULTIXACT_STATE_SIZE; - size = add_size(size, SimpleLruShmemSize(NUM_MULTIXACTOFFSET_BUFFERS, 0)); - size = add_size(size, SimpleLruShmemSize(NUM_MULTIXACTMEMBER_BUFFERS, 0)); + size = add_size(size, SimpleLruShmemSize(multixact_offsets_buffers, 0)); + size = add_size(size, SimpleLruShmemSize(multixact_members_buffers, 0)); return size; } @@ -1848,13 +1848,13 @@ MultiXactShmemInit(void) MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes; SimpleLruInit(MultiXactOffsetCtl, - "MultiXactOffset", NUM_MULTIXACTOFFSET_BUFFERS, 0, + "MultiXactOffset", multixact_offsets_buffers, 0, MultiXactOffsetSLRULock, "pg_multixact/offsets", LWTRANCHE_MULTIXACTOFFSET_BUFFER, SYNC_HANDLER_MULTIXACT_OFFSET); SlruPagePrecedesUnitTests(MultiXactOffsetCtl, MULTIXACT_OFFSETS_PER_PAGE); SimpleLruInit(MultiXactMemberCtl, - "MultiXactMember", NUM_MULTIXACTMEMBER_BUFFERS, 0, + "MultiXactMember", 
multixact_members_buffers, 0, MultiXactMemberSLRULock, "pg_multixact/members", LWTRANCHE_MULTIXACTMEMBER_BUFFER, SYNC_HANDLER_MULTIXACT_MEMBER); diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c index 6a8e521f89..785f2520fd 100644 --- a/src/backend/access/transam/subtrans.c +++ b/src/backend/access/transam/subtrans.c @@ -31,6 +31,7 @@ #include "access/slru.h" #include "access/subtrans.h" #include "access/transam.h" +#include "miscadmin.h" #include "pg_trace.h" #include "utils/snapmgr.h" @@ -184,14 +185,14 @@ SubTransGetTopmostTransaction(TransactionId xid) Size SUBTRANSShmemSize(void) { - return SimpleLruShmemSize(NUM_SUBTRANS_BUFFERS, 0); + return SimpleLruShmemSize(subtrans_buffers, 0); } void SUBTRANSShmemInit(void) { SubTransCtl->PagePrecedes = SubTransPagePrecedes; - SimpleLruInit(SubTransCtl, "Subtrans", NUM_SUBTRANS_BUFFERS, 0, + SimpleLruInit(SubTransCtl, "Subtrans", subtrans_buffers, 0, SubtransSLRULock, "pg_subtrans", LWTRANCHE_SUBTRANS_BUFFER, SYNC_HANDLER_NONE); SlruPagePrecedesUnitTests(SubTransCtl, SUBTRANS_XACTS_PER_PAGE); diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c index 4b16fb5682..de17f52cd7 100644 --- a/src/backend/commands/async.c +++ b/src/backend/commands/async.c @@ -107,7 +107,7 @@ * frontend during startup.) The above design guarantees that notifies from * other backends will never be missed by ignoring self-notifies. * - * The amount of shared memory used for notify management (NUM_NOTIFY_BUFFERS) + * The amount of shared memory used for notify management (notify_buffers) * can be varied without affecting anything but performance. The maximum * amount of notification data that can be queued at one time is determined * by slru.c's wraparound limit; see QUEUE_MAX_PAGE below. @@ -225,7 +225,7 @@ typedef struct QueuePosition * * Resist the temptation to make this really large. While that would save * work in some places, it would add cost in others. 
In particular, this - * should likely be less than NUM_NOTIFY_BUFFERS, to ensure that backends + * should likely be less than notify_buffers, to ensure that backends * catch up before the pages they'll need to read fall out of SLRU cache. */ #define QUEUE_CLEANUP_DELAY 4 @@ -514,7 +514,7 @@ AsyncShmemSize(void) size = mul_size(MaxBackends + 1, sizeof(QueueBackendStatus)); size = add_size(size, offsetof(AsyncQueueControl, backend)); - size = add_size(size, SimpleLruShmemSize(NUM_NOTIFY_BUFFERS, 0)); + size = add_size(size, SimpleLruShmemSize(notify_buffers, 0)); return size; } @@ -562,7 +562,7 @@ AsyncShmemInit(void) * Set up SLRU management of the pg_notify data. */ NotifyCtl->PagePrecedes = asyncQueuePagePrecedes; - SimpleLruInit(NotifyCtl, "Notify", NUM_NOTIFY_BUFFERS, 0, + SimpleLruInit(NotifyCtl, "Notify", notify_buffers, 0, NotifySLRULock, "pg_notify", LWTRANCHE_NOTIFY_BUFFER, SYNC_HANDLER_NONE); diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c index d493aeef0f..b1f4f1651d 100644 --- a/src/backend/storage/lmgr/predicate.c +++ b/src/backend/storage/lmgr/predicate.c @@ -872,7 +872,7 @@ SerialInit(void) */ SerialSlruCtl->PagePrecedes = SerialPagePrecedesLogically; SimpleLruInit(SerialSlruCtl, "Serial", - NUM_SERIAL_BUFFERS, 0, SerialSLRULock, "pg_serial", + serial_buffers, 0, SerialSLRULock, "pg_serial", LWTRANCHE_SERIAL_BUFFER, SYNC_HANDLER_NONE); #ifdef USE_ASSERT_CHECKING SerialPagePrecedesLogicallyUnitTests(); @@ -1395,7 +1395,7 @@ PredicateLockShmemSize(void) /* Shared memory structures for SLRU tracking of old committed xids. 
*/ size = add_size(size, sizeof(SerialControlData)); - size = add_size(size, SimpleLruShmemSize(NUM_SERIAL_BUFFERS, 0)); + size = add_size(size, SimpleLruShmemSize(serial_buffers, 0)); return size; } diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c index 73e0a672ae..e90275be6c 100644 --- a/src/backend/utils/init/globals.c +++ b/src/backend/utils/init/globals.c @@ -148,3 +148,11 @@ int64 VacuumPageDirty = 0; int VacuumCostBalance = 0; /* working state for vacuum */ bool VacuumCostActive = false; + +int multixact_offsets_buffers = 8; +int multixact_members_buffers = 16; +int subtrans_buffers = 32; +int notify_buffers = 8; +int serial_buffers = 16; +int clog_buffers = 0; +int commit_ts_buffers = 0; diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 0c5dc4d3e8..003bc820d2 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -2305,6 +2305,83 @@ static struct config_int ConfigureNamesInt[] = NULL, NULL, NULL }, + { + {"multixact_offsets_buffers", PGC_POSTMASTER, RESOURCES_MEM, + gettext_noop("Sets the number of shared memory buffers used for MultiXact offsets SLRU."), + NULL, + GUC_UNIT_BLOCKS + }, + &multixact_offsets_buffers, + 8, 2, INT_MAX / 2, + NULL, NULL, NULL + }, + + { + {"multixact_members_buffers", PGC_POSTMASTER, RESOURCES_MEM, + gettext_noop("Sets the number of shared memory buffers used for MultiXact members SLRU."), + NULL, + GUC_UNIT_BLOCKS + }, + &multixact_members_buffers, + 16, 2, INT_MAX / 2, + NULL, NULL, NULL + }, + + { + {"subtrans_buffers", PGC_POSTMASTER, RESOURCES_MEM, + gettext_noop("Sets the number of shared memory buffers used for subtransactions SLRU."), + NULL, + GUC_UNIT_BLOCKS + }, + &subtrans_buffers, + 32, 2, INT_MAX / 2, + NULL, NULL, NULL + }, + + { + {"notify_buffers", PGC_POSTMASTER, RESOURCES_MEM, + gettext_noop("Sets the number of shared memory buffers used for asynchronous notifications SLRU."), + NULL, + GUC_UNIT_BLOCKS + }, + &notify_buffers, + 
8, 2, INT_MAX / 2, + NULL, NULL, NULL + }, + + { + {"serial_buffers", PGC_POSTMASTER, RESOURCES_MEM, + gettext_noop("Sets the number of shared memory buffers used for predicate locks SLRU."), + NULL, + GUC_UNIT_BLOCKS + }, + &serial_buffers, + 16, 2, INT_MAX / 2, + NULL, NULL, NULL + }, + + { + {"clog_buffers", PGC_POSTMASTER, RESOURCES_MEM, + gettext_noop("Sets the number of shared memory buffers used for commit log SLRU."), + NULL, + GUC_UNIT_BLOCKS + }, + &clog_buffers, + 0, 0, INT_MAX / 2, + NULL, NULL, NULL + }, + + { + {"commit_ts_buffers", PGC_POSTMASTER, RESOURCES_MEM, + gettext_noop("Sets the number of shared memory buffers used for commit timestamps SLRU."), + NULL, + GUC_UNIT_BLOCKS + }, + &commit_ts_buffers, + 0, 0, INT_MAX / 2, + NULL, NULL, NULL + }, + { {"temp_buffers", PGC_USERSET, RESOURCES_MEM, gettext_noop("Sets the maximum number of temporary buffers used by each session."), diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index b234a6bfe6..1b8515989b 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -190,6 +190,15 @@ # (change requires restart) #backend_flush_after = 0 # measured in pages, 0 disables +# - SLRU Buffers (change requires restart) - + +#clog_buffers = 0 # memory for pg_xact (0 = auto) +#subtrans_buffers = 32 # memory for pg_subtrans +#multixact_offsets_buffers = 8 # memory for pg_multixact/offsets +#multixact_members_buffers = 16 # memory for pg_multixact/members +#notify_buffers = 8 # memory for pg_notify +#serial_buffers = 16 # memory for pg_serial +#commit_ts_buffers = 0 # memory for pg_commit_ts (0 = auto) #------------------------------------------------------------------------------ # WRITE-AHEAD LOG diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h index 4bbb035eae..97c0a46376 100644 --- a/src/include/access/multixact.h +++ b/src/include/access/multixact.h @@ -29,10 +29,6 @@ 
#define MaxMultiXactOffset ((MultiXactOffset) 0xFFFFFFFF) -/* Number of SLRU buffers to use for multixact */ -#define NUM_MULTIXACTOFFSET_BUFFERS 8 -#define NUM_MULTIXACTMEMBER_BUFFERS 16 - /* * Possible multixact lock modes ("status"). The first four modes are for * tuple locks (FOR KEY SHARE, FOR SHARE, FOR NO KEY UPDATE, FOR UPDATE); the diff --git a/src/include/access/subtrans.h b/src/include/access/subtrans.h index d0ab44ae82..ca0999056e 100644 --- a/src/include/access/subtrans.h +++ b/src/include/access/subtrans.h @@ -11,9 +11,6 @@ #ifndef SUBTRANS_H #define SUBTRANS_H -/* Number of SLRU buffers to use for subtrans */ -#define NUM_SUBTRANS_BUFFERS 32 - extern void SubTransSetParent(TransactionId xid, TransactionId parent); extern TransactionId SubTransGetParent(TransactionId xid); extern TransactionId SubTransGetTopmostTransaction(TransactionId xid); diff --git a/src/include/commands/async.h b/src/include/commands/async.h index 9217f66b91..fa831e3721 100644 --- a/src/include/commands/async.h +++ b/src/include/commands/async.h @@ -15,11 +15,6 @@ #include <signal.h> -/* - * The number of SLRU page buffers we use for the notification queue. 
- */ -#define NUM_NOTIFY_BUFFERS 8 - extern bool Trace_notify; extern volatile sig_atomic_t notifyInterruptPending; diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h index 013850ac28..9c325b4312 100644 --- a/src/include/miscadmin.h +++ b/src/include/miscadmin.h @@ -162,6 +162,13 @@ extern PGDLLIMPORT int MaxBackends; extern PGDLLIMPORT int MaxConnections; extern PGDLLIMPORT int max_worker_processes; extern PGDLLIMPORT int max_parallel_workers; +extern PGDLLIMPORT int multixact_offsets_buffers; +extern PGDLLIMPORT int multixact_members_buffers; +extern PGDLLIMPORT int subtrans_buffers; +extern PGDLLIMPORT int notify_buffers; +extern PGDLLIMPORT int serial_buffers; +extern PGDLLIMPORT int clog_buffers; +extern PGDLLIMPORT int commit_ts_buffers; extern PGDLLIMPORT int MyProcPid; extern PGDLLIMPORT pg_time_t MyStartTime; diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h index 152b698611..c72779bd88 100644 --- a/src/include/storage/predicate.h +++ b/src/include/storage/predicate.h @@ -26,10 +26,6 @@ extern int max_predicate_locks_per_xact; extern int max_predicate_locks_per_relation; extern int max_predicate_locks_per_page; - -/* Number of SLRU buffers to use for Serial SLRU */ -#define NUM_SERIAL_BUFFERS 16 - /* * A handle used for sharing SERIALIZABLEXACT objects between the participants * in a parallel query. -- 2.30.1
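For anyone reading along without the tree handy, the "0 means auto" rule that patch 1/5 adds to CLOGShmemBuffers() can be sketched as a standalone function. This is only an illustration under assumptions: clog_buffers_effective, min_int and max_int are made-up names, and the GUC value and NBuffers are passed in as parameters instead of being read from the globals the patch uses.

```c
#include <assert.h>

/* Hypothetical stand-ins for the Min()/Max() macros used in the patch. */
static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/*
 * Sketch of the sizing rule in CLOGShmemBuffers().  guc_value plays the
 * role of clog_buffers and nbuffers the role of NBuffers (shared_buffers
 * expressed in blocks); both are plain parameters here.
 */
static int
clog_buffers_effective(int guc_value, int nbuffers)
{
	assert(guc_value >= 0 && nbuffers >= 0);

	/* 0 and 1 are both treated as "unset", as in the patch */
	if (guc_value > 1)
		return guc_value;

	/* fallback: shared_buffers / 512, clamped to the range [4, 128] */
	return min_int(128, max_int(4, nbuffers / 512));
}
```

With shared_buffers at 128MB (16384 blocks of 8kB) and the GUC left at 0, this gives 32 buffers; an explicit setting such as 64 is returned unchanged.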
From 3441027b9ac2f135dea7ec155503be9d6331dfc1 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.munro@gmail.com> Date: Sat, 27 Mar 2021 08:34:58 +1300 Subject: [PATCH v13 2/5] Make simplehash easy to use in shmem. Allow "in-place" creation of a simplehash hash table of fixed size, suitable for use in shared memory. No calling out to allocators, and no ability to grow. --- src/include/lib/simplehash.h | 56 +++++++++++++++++++++++++++++++++--- 1 file changed, 52 insertions(+), 4 deletions(-) diff --git a/src/include/lib/simplehash.h b/src/include/lib/simplehash.h index 395be1ca9a..32d3fa58fe 100644 --- a/src/include/lib/simplehash.h +++ b/src/include/lib/simplehash.h @@ -120,6 +120,7 @@ #define SH_ALLOCATE SH_MAKE_NAME(allocate) #define SH_FREE SH_MAKE_NAME(free) #define SH_STAT SH_MAKE_NAME(stat) +#define SH_ESTIMATE_SIZE SH_MAKE_NAME(estimate_size) /* internal helper functions (no externally visible prototypes) */ #define SH_COMPUTE_PARAMETERS SH_MAKE_NAME(compute_parameters) @@ -153,16 +154,22 @@ typedef struct SH_TYPE /* boundary after which to grow hashtable */ uint32 grow_threshold; +#ifndef SH_IN_PLACE /* hash buckets */ SH_ELEMENT_TYPE *data; +#endif -#ifndef SH_RAW_ALLOCATOR +#if !defined(SH_RAW_ALLOCATOR) && !defined(SH_IN_PLACE) /* memory context to use for allocations */ MemoryContext ctx; #endif /* user defined data, useful for callbacks */ void *private_data; + +#ifdef SH_IN_PLACE + SH_ELEMENT_TYPE data[FLEXIBLE_ARRAY_MEMBER]; +#endif } SH_TYPE; typedef enum SH_STATUS @@ -182,6 +189,11 @@ typedef struct SH_ITERATOR #ifdef SH_RAW_ALLOCATOR /* <prefix>_hash <prefix>_create(uint32 nelements, void *private_data) */ SH_SCOPE SH_TYPE *SH_CREATE(uint32 nelements, void *private_data); +#elif defined(SH_IN_PLACE) +/* size_t <prefix>_estimate_size(uint32 nelements) */ +SH_SCOPE size_t SH_ESTIMATE_SIZE(uint32 nelements); +/* void <prefix>_create(<prefix>_hash *place, uint32 nelements, void *private_data) */ +SH_SCOPE void SH_CREATE(SH_TYPE *place, uint32 
nelements, void *private_data); #else /* * <prefix>_hash <prefix>_create(MemoryContext ctx, uint32 nelements, @@ -191,14 +203,18 @@ SH_SCOPE SH_TYPE *SH_CREATE(MemoryContext ctx, uint32 nelements, void *private_data); #endif +#ifndef SH_IN_PLACE /* void <prefix>_destroy(<prefix>_hash *tb) */ SH_SCOPE void SH_DESTROY(SH_TYPE * tb); +#endif /* void <prefix>_reset(<prefix>_hash *tb) */ SH_SCOPE void SH_RESET(SH_TYPE * tb); +#ifndef SH_IN_PLACE /* void <prefix>_grow(<prefix>_hash *tb) */ SH_SCOPE void SH_GROW(SH_TYPE * tb, uint32 newsize); +#endif /* <element> *<prefix>_insert(<prefix>_hash *tb, <key> key, bool *found) */ SH_SCOPE SH_ELEMENT_TYPE *SH_INSERT(SH_TYPE * tb, SH_KEY_TYPE key, bool *found); @@ -241,7 +257,7 @@ SH_SCOPE void SH_STAT(SH_TYPE * tb); /* generate implementation of the hash table */ #ifdef SH_DEFINE -#ifndef SH_RAW_ALLOCATOR +#if !defined(SH_RAW_ALLOCATOR) && !defined(SH_IN_PLACE) #include "utils/memutils.h" #endif @@ -383,11 +399,13 @@ SH_ENTRY_HASH(SH_TYPE * tb, SH_ELEMENT_TYPE * entry) #endif } +#ifndef SH_IN_PLACE /* default memory allocator function */ static inline void *SH_ALLOCATE(SH_TYPE * type, Size size); static inline void SH_FREE(SH_TYPE * type, void *pointer); +#endif -#ifndef SH_USE_NONDEFAULT_ALLOCATOR +#if !defined(SH_USE_NONDEFAULT_ALLOCATOR) && !defined(SH_IN_PLACE) /* default memory allocator function */ static inline void * @@ -410,6 +428,22 @@ SH_FREE(SH_TYPE * type, void *pointer) #endif +#ifdef SH_IN_PLACE +/* + * Compute the amount of memory required for a fixed sized in-place hash table. + */ +SH_SCOPE size_t +SH_ESTIMATE_SIZE(uint32 nelements) +{ + size_t size; + + size = Max(nelements, 2); + size = pg_nextpower2_64(size); + + return offsetof(SH_TYPE, data) + sizeof(SH_ELEMENT_TYPE) * size; +} +#endif + /* * Create a hash table with enough space for `nelements` distinct members. * Memory for the hash table is allocated from the passed-in context. 
If @@ -422,6 +456,9 @@ SH_FREE(SH_TYPE * type, void *pointer) #ifdef SH_RAW_ALLOCATOR SH_SCOPE SH_TYPE * SH_CREATE(uint32 nelements, void *private_data) +#elif defined(SH_IN_PLACE) +SH_SCOPE void +SH_CREATE(SH_TYPE *place, uint32 nelements, void *private_data) #else SH_SCOPE SH_TYPE * SH_CREATE(MemoryContext ctx, uint32 nelements, void *private_data) @@ -432,6 +469,8 @@ SH_CREATE(MemoryContext ctx, uint32 nelements, void *private_data) #ifdef SH_RAW_ALLOCATOR tb = SH_RAW_ALLOCATOR(sizeof(SH_TYPE)); +#elif defined(SH_IN_PLACE) + tb = place; #else tb = MemoryContextAllocZero(ctx, sizeof(SH_TYPE)); tb->ctx = ctx; @@ -443,11 +482,15 @@ SH_CREATE(MemoryContext ctx, uint32 nelements, void *private_data) SH_COMPUTE_PARAMETERS(tb, size); +#if defined(SH_IN_PLACE) + memset(&tb->data, 0, sizeof(SH_ELEMENT_TYPE) * tb->size); +#else tb->data = SH_ALLOCATE(tb, sizeof(SH_ELEMENT_TYPE) * tb->size); - return tb; +#endif } +#ifndef SH_IN_PLACE /* destroy a previously created hash table */ SH_SCOPE void SH_DESTROY(SH_TYPE * tb) @@ -455,6 +498,7 @@ SH_DESTROY(SH_TYPE * tb) SH_FREE(tb, tb->data); pfree(tb); } +#endif /* reset the contents of a previously created hash table */ SH_SCOPE void @@ -464,6 +508,7 @@ SH_RESET(SH_TYPE * tb) tb->members = 0; } +#ifndef SH_IN_PLACE /* * Grow a hash table to at least `newsize` buckets. * @@ -576,6 +621,7 @@ SH_GROW(SH_TYPE * tb, uint32 newsize) SH_FREE(tb, olddata); } +#endif /* * This is a separate static inline function, so it can be reliably be inlined @@ -592,6 +638,7 @@ SH_INSERT_HASH_INTERNAL(SH_TYPE * tb, SH_KEY_TYPE key, uint32 hash, bool *found) restart: insertdist = 0; +#ifndef SH_IN_PLACE /* * We do the grow check even if the key is actually present, to avoid * doing the check inside the loop. This also lets us avoid having to @@ -614,6 +661,7 @@ restart: SH_GROW(tb, tb->size * 2); /* SH_STAT(tb); */ } +#endif /* perform insert, start bucket search at optimal location */ data = tb->data; -- 2.30.1
From 3bbaf8d558e1e405217e17d1a502ff68556b21e3 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.munro@gmail.com> Date: Sat, 27 Mar 2021 09:04:56 +1300 Subject: [PATCH v13 3/5] Support intrusive status flag in simplehash. Before, you had to include a "status" member in the element type, which simplehash.h could use to detect free space. Allow the user to specify a special key value to use instead, for more compact representation. --- src/include/lib/simplehash.h | 66 ++++++++++++++++++++++++++---------- 1 file changed, 48 insertions(+), 18 deletions(-) diff --git a/src/include/lib/simplehash.h b/src/include/lib/simplehash.h index 32d3fa58fe..05c7ca8a47 100644 --- a/src/include/lib/simplehash.h +++ b/src/include/lib/simplehash.h @@ -131,6 +131,8 @@ #define SH_ENTRY_HASH SH_MAKE_NAME(entry_hash) #define SH_INSERT_HASH_INTERNAL SH_MAKE_NAME(insert_hash_internal) #define SH_LOOKUP_HASH_INTERNAL SH_MAKE_NAME(lookup_hash_internal) +#define SH_ENTRY_IS_EMPTY SH_MAKE_NAME(entry_is_empty) +#define SH_SET_ENTRY_EMPTY SH_MAKE_NAME(set_entry_empty) /* generate forward declarations necessary to use the hash table */ #ifdef SH_DECLARE @@ -172,11 +174,13 @@ typedef struct SH_TYPE #endif } SH_TYPE; +#ifndef SH_IS_EMPTY_KEY typedef enum SH_STATUS { SH_STATUS_EMPTY = 0x00, SH_STATUS_IN_USE = 0x01 } SH_STATUS; +#endif typedef struct SH_ITERATOR { @@ -309,6 +313,26 @@ SH_SCOPE void SH_STAT(SH_TYPE * tb); #endif +static inline bool +SH_ENTRY_IS_EMPTY(SH_TYPE * tb, SH_ELEMENT_TYPE * entry) +{ +#ifdef SH_IS_EMPTY_KEY + return SH_IS_EMPTY_KEY(tb, entry->SH_KEY); +#else + return entry->status == SH_STATUS_EMPTY; +#endif +} + +static inline void +SH_SET_ENTRY_EMPTY(SH_TYPE * tb, SH_ELEMENT_TYPE *entry) +{ +#ifdef SH_EMPTY_KEY + entry->SH_KEY = SH_EMPTY_KEY(tb); +#else + entry->status = SH_STATUS_EMPTY; +#endif +} + /* * Compute sizing parameters for hashtable. Called when creating and growing * the hashtable. 
@@ -483,7 +507,7 @@ SH_CREATE(MemoryContext ctx, uint32 nelements, void *private_data) SH_COMPUTE_PARAMETERS(tb, size); #if defined(SH_IN_PLACE) - memset(&tb->data, 0, sizeof(SH_ELEMENT_TYPE) * tb->size); + SH_RESET(tb); #else tb->data = SH_ALLOCATE(tb, sizeof(SH_ELEMENT_TYPE) * tb->size); return tb; @@ -504,7 +528,12 @@ SH_DESTROY(SH_TYPE * tb) SH_SCOPE void SH_RESET(SH_TYPE * tb) { +#ifdef SH_EMPTY_KEY + for (size_t i = 0; i < tb->size; ++i) + tb->data[i].SH_KEY = SH_EMPTY_KEY(tb); +#else memset(tb->data, 0, sizeof(SH_ELEMENT_TYPE) * tb->size); +#endif tb->members = 0; } @@ -675,14 +704,17 @@ restart: SH_ELEMENT_TYPE *entry = &data[curelem]; /* any empty bucket can directly be used */ - if (entry->status == SH_STATUS_EMPTY) + if (SH_ENTRY_IS_EMPTY(tb, entry)) { tb->members++; entry->SH_KEY = key; #ifdef SH_STORE_HASH SH_GET_HASH(tb, entry) = hash; #endif +#ifndef SH_IS_EMPTY_KEY entry->status = SH_STATUS_IN_USE; +#endif + Assert(!SH_ENTRY_IS_EMPTY(tb, entry)); *found = false; return entry; } @@ -697,7 +729,7 @@ restart: if (SH_COMPARE_KEYS(tb, hash, key, entry)) { - Assert(entry->status == SH_STATUS_IN_USE); + Assert(!SH_ENTRY_IS_EMPTY(tb, entry)); *found = true; return entry; } @@ -721,7 +753,7 @@ restart: emptyelem = SH_NEXT(tb, emptyelem, startelem); emptyentry = &data[emptyelem]; - if (emptyentry->status == SH_STATUS_EMPTY) + if (SH_ENTRY_IS_EMPTY(tb, emptyentry)) { lastentry = emptyentry; break; @@ -770,7 +802,10 @@ restart: #ifdef SH_STORE_HASH SH_GET_HASH(tb, entry) = hash; #endif +#ifndef SH_IS_EMPTY_KEY entry->status = SH_STATUS_IN_USE; +#endif + Assert(!SH_ENTRY_IS_EMPTY(tb, entry)); *found = false; return entry; } @@ -833,12 +868,8 @@ SH_LOOKUP_HASH_INTERNAL(SH_TYPE * tb, SH_KEY_TYPE key, uint32 hash) { SH_ELEMENT_TYPE *entry = &tb->data[curelem]; - if (entry->status == SH_STATUS_EMPTY) - { + if (SH_ENTRY_IS_EMPTY(tb, entry)) return NULL; - } - - Assert(entry->status == SH_STATUS_IN_USE); if (SH_COMPARE_KEYS(tb, hash, key, entry)) return entry; @@ 
-891,11 +922,10 @@ SH_DELETE(SH_TYPE * tb, SH_KEY_TYPE key) { SH_ELEMENT_TYPE *entry = &tb->data[curelem]; - if (entry->status == SH_STATUS_EMPTY) + if (SH_ENTRY_IS_EMPTY(tb, entry)) return false; - if (entry->status == SH_STATUS_IN_USE && - SH_COMPARE_KEYS(tb, hash, key, entry)) + if (SH_COMPARE_KEYS(tb, hash, key, entry)) { SH_ELEMENT_TYPE *lastentry = entry; @@ -917,9 +947,9 @@ SH_DELETE(SH_TYPE * tb, SH_KEY_TYPE key) curelem = SH_NEXT(tb, curelem, startelem); curentry = &tb->data[curelem]; - if (curentry->status != SH_STATUS_IN_USE) + if (SH_ENTRY_IS_EMPTY(tb, curentry)) { - lastentry->status = SH_STATUS_EMPTY; + SH_SET_ENTRY_EMPTY(tb, lastentry); break; } @@ -929,7 +959,7 @@ SH_DELETE(SH_TYPE * tb, SH_KEY_TYPE key) /* current is at optimal position, done */ if (curoptimal == curelem) { - lastentry->status = SH_STATUS_EMPTY; + SH_SET_ENTRY_EMPTY(tb, lastentry); break; } @@ -966,7 +996,7 @@ SH_START_ITERATE(SH_TYPE * tb, SH_ITERATOR * iter) { SH_ELEMENT_TYPE *entry = &tb->data[i]; - if (entry->status != SH_STATUS_IN_USE) + if (SH_ENTRY_IS_EMPTY(tb, entry)) { startelem = i; break; @@ -1027,7 +1057,7 @@ SH_ITERATE(SH_TYPE * tb, SH_ITERATOR * iter) if ((iter->cur & tb->sizemask) == (iter->end & tb->sizemask)) iter->done = true; - if (elem->status == SH_STATUS_IN_USE) + if (!SH_ENTRY_IS_EMPTY(tb, elem)) { return elem; } @@ -1063,7 +1093,7 @@ SH_STAT(SH_TYPE * tb) elem = &tb->data[i]; - if (elem->status != SH_STATUS_IN_USE) + if (SH_ENTRY_IS_EMPTY(tb, elem)) continue; hash = SH_ENTRY_HASH(tb, elem); -- 2.30.1
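To make the idea behind patch 3/5 concrete, here is a small self-contained sketch of a fixed-size table that marks free buckets with a reserved key value instead of a separate status member. It is not simplehash: it uses plain linear probing rather than robin hood, has no delete, and every name in it (DemoTable, DEMO_EMPTY_KEY, demo_insert and so on) is invented for illustration. The hash function is the 32-bit MurmurHash3 finalizer, which I believe is what murmurhash32() computes, but treat that as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SIZE		8		/* fixed bucket count, power of two */
#define DEMO_EMPTY_KEY	(-1)	/* sentinel: never a valid page number */

typedef struct DemoEntry
{
	int			pageno;			/* key; DEMO_EMPTY_KEY marks a free bucket */
	int			slotno;
} DemoEntry;

typedef struct DemoTable
{
	uint32_t	members;
	DemoEntry	data[DEMO_SIZE];
} DemoTable;

/* 32-bit MurmurHash3 finalizer: inlineable and branch-free. */
static uint32_t
demo_hash(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return h;
}

static bool
demo_entry_is_empty(const DemoEntry *entry)
{
	return entry->pageno == DEMO_EMPTY_KEY;
}

/* memset(0) would not work here: page 0 is a legitimate key. */
static void
demo_reset(DemoTable *tb)
{
	for (int i = 0; i < DEMO_SIZE; i++)
		tb->data[i].pageno = DEMO_EMPTY_KEY;
	tb->members = 0;
}

/* Insert or update; returns false if the table is full (it cannot grow). */
static bool
demo_insert(DemoTable *tb, int pageno, int slotno)
{
	uint32_t	i = demo_hash((uint32_t) pageno) & (DEMO_SIZE - 1);

	assert(pageno != DEMO_EMPTY_KEY);
	for (uint32_t n = 0; n < DEMO_SIZE; n++)
	{
		DemoEntry  *entry = &tb->data[i];

		if (demo_entry_is_empty(entry))
		{
			entry->pageno = pageno;
			entry->slotno = slotno;
			tb->members++;
			return true;
		}
		if (entry->pageno == pageno)
		{
			entry->slotno = slotno;
			return true;
		}
		i = (i + 1) & (DEMO_SIZE - 1);
	}
	return false;
}

/* Returns the slot number for pageno, or -1 if it is not mapped. */
static int
demo_find(const DemoTable *tb, int pageno)
{
	uint32_t	i = demo_hash((uint32_t) pageno) & (DEMO_SIZE - 1);

	for (uint32_t n = 0; n < DEMO_SIZE; n++)
	{
		const DemoEntry *entry = &tb->data[i];

		if (demo_entry_is_empty(entry))
			return -1;
		if (entry->pageno == pageno)
			return entry->slotno;
		i = (i + 1) & (DEMO_SIZE - 1);
	}
	return -1;
}
```

The sketch also shows why the reset path has to loop over the buckets once a sentinel key is in use, and why "table full" becomes a failure mode the caller must plan for when the table lives in a fixed chunk of shared memory.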
From 970504d304af469e494de6ca83123caf1a683ecf Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.munro@gmail.com> Date: Thu, 25 Mar 2021 10:11:31 +1300 Subject: [PATCH v13 4/5] Add buffer mapping table for SLRUs. Instead of doing a linear search for the buffer holding a given page number, use a hash table. Discussion: https://postgr.es/m/2BEC2B3F-9B61-4C1D-9FB5-5FAB0F05EF86%40yandex-team.ru --- src/backend/access/transam/slru.c | 111 +++++++++++++++++++++++++----- src/include/access/slru.h | 2 + 2 files changed, 96 insertions(+), 17 deletions(-) diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c index 82149ad782..f823c2a0c8 100644 --- a/src/backend/access/transam/slru.c +++ b/src/backend/access/transam/slru.c @@ -58,6 +58,8 @@ #include "pgstat.h" #include "storage/fd.h" #include "storage/shmem.h" +#include "utils/dynahash.h" +#include "utils/hsearch.h" #define SlruFileName(ctl, path, seg) \ snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir, seg) @@ -79,6 +81,12 @@ typedef struct SlruWriteAllData typedef struct SlruWriteAllData *SlruWriteAll; +typedef struct SlruMappingTableEntry +{ + int pageno; + int slotno; +} SlruMappingTableEntry; + /* * Populate a file tag describing a segment file. 
We only use the segment * number, since we can derive everything else we need by having separate @@ -146,6 +154,9 @@ static int SlruSelectLRUPage(SlruCtl ctl, int pageno); static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int segpage, void *data); static void SlruInternalDeleteSegment(SlruCtl ctl, int segno); +static void SlruMappingAdd(SlruCtl ctl, int pageno, int slotno); +static void SlruMappingRemove(SlruCtl ctl, int pageno); +static int SlruMappingFind(SlruCtl ctl, int pageno); /* * Initialization of shared memory @@ -168,7 +179,8 @@ SimpleLruShmemSize(int nslots, int nlsns) if (nlsns > 0) sz += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr)); /* group_lsn[] */ - return BUFFERALIGN(sz) + BLCKSZ * nslots; + return BUFFERALIGN(sz) + BLCKSZ * nslots + + hash_estimate_size(nslots, sizeof(SlruMappingTableEntry)); } /* @@ -187,6 +199,9 @@ SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns, LWLock *ctllock, const char *subdir, int tranche_id, SyncRequestHandler sync_handler) { + char mapping_table_name[SHMEM_INDEX_KEYSIZE]; + HASHCTL mapping_table_info; + HTAB *mapping_table; SlruShared shared; bool found; @@ -258,11 +273,21 @@ SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns, else Assert(found); + /* Create or find the buffer mapping table. */ + memset(&mapping_table_info, 0, sizeof(mapping_table_info)); + mapping_table_info.keysize = sizeof(int); + mapping_table_info.entrysize = sizeof(SlruMappingTableEntry); + snprintf(mapping_table_name, sizeof(mapping_table_name), + "%s Mapping Table", name); + mapping_table = ShmemInitHash(mapping_table_name, nslots, nslots, + &mapping_table_info, HASH_ELEM | HASH_BLOBS); + /* * Initialize the unshared control struct, including directory path. We * assume caller set PagePrecedes. 
*/ ctl->shared = shared; + ctl->mapping_table = mapping_table; ctl->sync_handler = sync_handler; strlcpy(ctl->Dir, subdir, sizeof(ctl->Dir)); } @@ -289,6 +314,9 @@ SimpleLruZeroPage(SlruCtl ctl, int pageno) shared->page_number[slotno] == pageno); /* Mark the slot as containing this page */ + if (shared->page_status[slotno] != SLRU_PAGE_EMPTY) + SlruMappingRemove(ctl, shared->page_number[slotno]); + SlruMappingAdd(ctl, pageno, slotno); shared->page_number[slotno] = pageno; shared->page_status[slotno] = SLRU_PAGE_VALID; shared->page_dirty[slotno] = true; @@ -362,7 +390,10 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno) { /* indeed, the I/O must have failed */ if (shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS) + { + SlruMappingRemove(ctl, shared->page_number[slotno]); shared->page_status[slotno] = SLRU_PAGE_EMPTY; + } else /* write_in_progress */ { shared->page_status[slotno] = SLRU_PAGE_VALID; @@ -436,6 +467,9 @@ SimpleLruReadPage(SlruCtl ctl, int pageno, bool write_ok, !shared->page_dirty[slotno])); /* Mark the slot read-busy */ + if (shared->page_status[slotno] != SLRU_PAGE_EMPTY) + SlruMappingRemove(ctl, shared->page_number[slotno]); + SlruMappingAdd(ctl, pageno, slotno); shared->page_number[slotno] = pageno; shared->page_status[slotno] = SLRU_PAGE_READ_IN_PROGRESS; shared->page_dirty[slotno] = false; @@ -459,7 +493,13 @@ SimpleLruReadPage(SlruCtl ctl, int pageno, bool write_ok, shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS && !shared->page_dirty[slotno]); - shared->page_status[slotno] = ok ? 
SLRU_PAGE_VALID : SLRU_PAGE_EMPTY; + if (ok) + shared->page_status[slotno] = SLRU_PAGE_VALID; + else + { + SlruMappingRemove(ctl, pageno); + shared->page_status[slotno] = SLRU_PAGE_EMPTY; + } LWLockRelease(&shared->buffer_locks[slotno].lock); @@ -500,20 +540,20 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid) LWLockAcquire(shared->ControlLock, LW_SHARED); /* See if page is already in a buffer */ - for (slotno = 0; slotno < shared->num_slots; slotno++) + slotno = SlruMappingFind(ctl, pageno); + if (slotno >= 0 && + shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS) { - if (shared->page_number[slotno] == pageno && - shared->page_status[slotno] != SLRU_PAGE_EMPTY && - shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS) - { - /* See comments for SlruRecentlyUsed macro */ - SlruRecentlyUsed(shared, slotno); + Assert(shared->page_status[slotno] != SLRU_PAGE_EMPTY); + Assert(shared->page_number[slotno] == pageno); - /* update the stats counter of pages found in the SLRU */ - pgstat_count_slru_page_hit(shared->slru_stats_idx); + /* See comments for SlruRecentlyUsed macro */ + SlruRecentlyUsed(shared, slotno); - return slotno; - } + /* update the stats counter of pages found in the SLRU */ + pgstat_count_slru_page_hit(shared->slru_stats_idx); + + return slotno; } /* No luck, so switch to normal exclusive lock and do regular read */ @@ -1029,11 +1069,12 @@ SlruSelectLRUPage(SlruCtl ctl, int pageno) int best_invalid_page_number = 0; /* keep compiler quiet */ /* See if page already has a buffer assigned */ - for (slotno = 0; slotno < shared->num_slots; slotno++) + slotno = SlruMappingFind(ctl, pageno); + if (slotno >= 0) { - if (shared->page_number[slotno] == pageno && - shared->page_status[slotno] != SLRU_PAGE_EMPTY) - return slotno; + Assert(shared->page_number[slotno] == pageno); + Assert(shared->page_status[slotno] != SLRU_PAGE_EMPTY); + return slotno; } /* @@ -1266,6 +1307,7 @@ restart:; if (shared->page_status[slotno] == 
SLRU_PAGE_VALID && !shared->page_dirty[slotno]) { + SlruMappingRemove(ctl, shared->page_number[slotno]); shared->page_status[slotno] = SLRU_PAGE_EMPTY; continue; } @@ -1348,6 +1390,7 @@ restart: if (shared->page_status[slotno] == SLRU_PAGE_VALID && !shared->page_dirty[slotno]) { + SlruMappingRemove(ctl, shared->page_number[slotno]); shared->page_status[slotno] = SLRU_PAGE_EMPTY; continue; } @@ -1609,3 +1652,37 @@ SlruSyncFileTag(SlruCtl ctl, const FileTag *ftag, char *path) errno = save_errno; return result; } + +static int +SlruMappingFind(SlruCtl ctl, int pageno) +{ + SlruMappingTableEntry *mapping; + + mapping = hash_search(ctl->mapping_table, &pageno, HASH_FIND, NULL); + if (mapping) + return mapping->slotno; + + return -1; +} + +static void +SlruMappingAdd(SlruCtl ctl, int pageno, int slotno) +{ + SlruMappingTableEntry *mapping; + bool found PG_USED_FOR_ASSERTS_ONLY; + + mapping = hash_search(ctl->mapping_table, &pageno, HASH_ENTER, &found); + mapping->slotno = slotno; + + Assert(!found); +} + +static void +SlruMappingRemove(SlruCtl ctl, int pageno) +{ + bool found PG_USED_FOR_ASSERTS_ONLY; + + hash_search(ctl->mapping_table, &pageno, HASH_REMOVE, &found); + + Assert(found); +} diff --git a/src/include/access/slru.h b/src/include/access/slru.h index dd52e8cec7..8aa3efc0ee 100644 --- a/src/include/access/slru.h +++ b/src/include/access/slru.h @@ -16,6 +16,7 @@ #include "access/xlogdefs.h" #include "storage/lwlock.h" #include "storage/sync.h" +#include "utils/hsearch.h" /* @@ -110,6 +111,7 @@ typedef SlruSharedData *SlruShared; typedef struct SlruCtlData { SlruShared shared; + HTAB *mapping_table; /* * Which sync handler function to use when handing sync requests over to -- 2.30.1
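The invariant that patch 4/5 has to maintain is that the mapping table and the per-slot arrays never disagree: whenever a slot is reused or emptied, the old page's mapping goes away before the new one is added. A rough standalone sketch of that bookkeeping follows. It is only an illustration under simplifying assumptions: the "mapping table" is a small array indexed by page number rather than a hash table, there is no locking or I/O state, and all names (ToyCache, toy_assign, toy_evict, ...) are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NSLOTS 4
#define TOY_NPAGES 16		/* toy limit so an array can stand in for the hash */

typedef struct ToyCache
{
	int			page_number[TOY_NSLOTS];
	bool		slot_in_use[TOY_NSLOTS];
	int			page_to_slot[TOY_NPAGES];	/* -1 means "not mapped" */
} ToyCache;

static void
toy_init(ToyCache *c)
{
	for (int i = 0; i < TOY_NSLOTS; ++i)
		c->slot_in_use[i] = false;
	for (int i = 0; i < TOY_NPAGES; ++i)
		c->page_to_slot[i] = -1;
}

/* Counterpart of a mapping lookup: slot number, or -1 if not cached. */
static int
toy_find(const ToyCache *c, int pageno)
{
	return c->page_to_slot[pageno];
}

/* Empty a slot, dropping the mapping for whatever page it held. */
static void
toy_evict(ToyCache *c, int slotno)
{
	if (c->slot_in_use[slotno])
	{
		c->page_to_slot[c->page_number[slotno]] = -1;
		c->slot_in_use[slotno] = false;
	}
}

/* Put pageno into slotno, removing the slot's previous mapping first. */
static void
toy_assign(ToyCache *c, int slotno, int pageno)
{
	toy_evict(c, slotno);
	assert(toy_find(c, pageno) == -1);	/* a page lives in at most one slot */
	c->page_to_slot[pageno] = slotno;
	c->page_number[slotno] = pageno;
	c->slot_in_use[slotno] = true;
}
```

With this in place a lookup is a single probe of the mapping instead of a scan over all slots, which is the point of the patch.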
From ef3c1e45a1f869544280f65d1eeba8544b028735 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.munro@gmail.com> Date: Sat, 27 Mar 2021 09:07:22 +1300 Subject: [PATCH v13 5/5] fixup: use simplehash instead of dynahash --- src/backend/access/transam/slru.c | 41 ++++++++++++++++++++----------- src/include/access/slru.h | 4 ++- 2 files changed, 30 insertions(+), 15 deletions(-) diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c index f823c2a0c8..ba9c0efc81 100644 --- a/src/backend/access/transam/slru.c +++ b/src/backend/access/transam/slru.c @@ -54,12 +54,11 @@ #include "access/slru.h" #include "access/transam.h" #include "access/xlog.h" +#include "common/hashfn.h" #include "miscadmin.h" #include "pgstat.h" #include "storage/fd.h" #include "storage/shmem.h" -#include "utils/dynahash.h" -#include "utils/hsearch.h" #define SlruFileName(ctl, path, seg) \ snprintf(path, MAXPGPATH, "%s/%04X", (ctl)->Dir, seg) @@ -85,8 +84,24 @@ typedef struct SlruMappingTableEntry { int pageno; int slotno; + char status; } SlruMappingTableEntry; +/* Instantiate specialized hash table routines. */ +#define SH_PREFIX smte +#define SH_ELEMENT_TYPE SlruMappingTableEntry +#define SH_KEY_TYPE int +#define SH_KEY pageno +#define SH_HASH_KEY(table, key) murmurhash32(key) +#define SH_EQUAL(table, a, b) ((a) == (b)) +#define SH_IS_EMPTY_KEY(table, pageno) ((pageno) == -1) +#define SH_EMPTY_KEY(table) (-1) +#define SH_DECLARE +#define SH_DEFINE +#define SH_SCOPE static inline +#define SH_IN_PLACE +#include "lib/simplehash.h" + /* * Populate a file tag describing a segment file. 
We only use the segment * number, since we can derive everything else we need by having separate @@ -179,8 +194,7 @@ SimpleLruShmemSize(int nslots, int nlsns) if (nlsns > 0) sz += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr)); /* group_lsn[] */ - return BUFFERALIGN(sz) + BLCKSZ * nslots + - hash_estimate_size(nslots, sizeof(SlruMappingTableEntry)); + return BUFFERALIGN(sz) + BLCKSZ * nslots + smte_estimate_size(nslots); } /* @@ -200,8 +214,7 @@ SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns, SyncRequestHandler sync_handler) { char mapping_table_name[SHMEM_INDEX_KEYSIZE]; - HASHCTL mapping_table_info; - HTAB *mapping_table; + smte_hash *mapping_table; SlruShared shared; bool found; @@ -274,13 +287,13 @@ SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns, Assert(found); /* Create or find the buffer mapping table. */ - memset(&mapping_table_info, 0, sizeof(mapping_table_info)); - mapping_table_info.keysize = sizeof(int); - mapping_table_info.entrysize = sizeof(SlruMappingTableEntry); snprintf(mapping_table_name, sizeof(mapping_table_name), "%s Mapping Table", name); - mapping_table = ShmemInitHash(mapping_table_name, nslots, nslots, - &mapping_table_info, HASH_ELEM | HASH_BLOBS); + mapping_table = ShmemInitStruct(mapping_table_name, + smte_estimate_size(nslots), + &found); + if (!found) + smte_create(mapping_table, nslots, NULL); /* * Initialize the unshared control struct, including directory path. 
We @@ -1658,7 +1671,7 @@ SlruMappingFind(SlruCtl ctl, int pageno) { SlruMappingTableEntry *mapping; - mapping = hash_search(ctl->mapping_table, &pageno, HASH_FIND, NULL); + mapping = smte_lookup(ctl->mapping_table, pageno); if (mapping) return mapping->slotno; @@ -1671,7 +1684,7 @@ SlruMappingAdd(SlruCtl ctl, int pageno, int slotno) SlruMappingTableEntry *mapping; bool found PG_USED_FOR_ASSERTS_ONLY; - mapping = hash_search(ctl->mapping_table, &pageno, HASH_ENTER, &found); + mapping = smte_insert(ctl->mapping_table, pageno, &found); mapping->slotno = slotno; Assert(!found); @@ -1682,7 +1695,7 @@ SlruMappingRemove(SlruCtl ctl, int pageno) { bool found PG_USED_FOR_ASSERTS_ONLY; - hash_search(ctl->mapping_table, &pageno, HASH_REMOVE, &found); + found = smte_delete(ctl->mapping_table, pageno); Assert(found); } diff --git a/src/include/access/slru.h b/src/include/access/slru.h index 8aa3efc0ee..b3d28ff135 100644 --- a/src/include/access/slru.h +++ b/src/include/access/slru.h @@ -104,6 +104,8 @@ typedef struct SlruSharedData typedef SlruSharedData *SlruShared; +struct smte_hash; + /* * SlruCtlData is an unshared structure that points to the active information * in shared memory. @@ -111,7 +113,7 @@ typedef SlruSharedData *SlruShared; typedef struct SlruCtlData { SlruShared shared; - HTAB *mapping_table; + struct smte_hash *mapping_table; /* * Which sync handler function to use when handing sync requests over to -- 2.30.1
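The "in-place" creation used above (estimate the size, get that much memory from somewhere, then initialize the table inside it) can be sketched in isolation as below. This is a hedged illustration, not the simplehash code: the sizing rule (round up to the next power of two) is an assumption made for the example and says nothing about how much spare space the real table needs, malloc() stands in for a shared memory allocation, and the names ToyTable, toy_estimate_size and toy_create_in_place are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct ToyEntry
{
	int			pageno;		/* key; -1 marks an empty bucket */
	int			slotno;
} ToyEntry;

typedef struct ToyTable
{
	uint32_t	size;		/* number of buckets, a power of two */
	uint32_t	members;
	ToyEntry	data[];		/* buckets live directly after the header */
} ToyTable;

/* Assumed sizing rule: smallest power of two >= nelements (minimum 2). */
static uint32_t
toy_compute_size(uint32_t nelements)
{
	uint32_t	size = 2;

	while (size < nelements)
		size *= 2;
	return size;
}

/* Bytes the caller must provide for a table holding nelements entries. */
static size_t
toy_estimate_size(uint32_t nelements)
{
	return offsetof(ToyTable, data) +
		sizeof(ToyEntry) * toy_compute_size(nelements);
}

/* Initialize a table inside caller-provided memory; allocates nothing. */
static void
toy_create_in_place(ToyTable *tb, uint32_t nelements)
{
	tb->size = toy_compute_size(nelements);
	tb->members = 0;
	for (uint32_t i = 0; i < tb->size; ++i)
		tb->data[i].pageno = -1;
}

/* Usage sketch: malloc() plays the role of the shared memory allocator. */
static ToyTable *
toy_create_on_heap(uint32_t nelements)
{
	ToyTable   *tb = malloc(toy_estimate_size(nelements));

	if (tb != NULL)
		toy_create_in_place(tb, nelements);
	return tb;
}
```

Because the table can never grow, the open questions from the mail (what happens when it fills up, how much headroom to reserve) have to be answered by the sizing function up front.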