Re: type cache cleanup improvements
On 29/8/2024 11:01, Alexander Korotkov wrote: On Mon, Aug 26, 2024 at 11:26 AM Alexander Korotkov Secondly, I'm not terribly happy with current state of type cache. The caller of lookup_type_cache() might get already invalidated data. This probably OK, because caller probably hold locks on dependent objects to guarantee that relevant properties of type actually persists. At very least this should be documented, but it doesn't seem so. Setting of tupdesc is sensitive to its order of execution. That feels quite fragile to me, and not documented either. I think this area needs improvements before we push additional functionality there. I see fdd965d074 added a proper handling for concurrent invalidation for relation cache. If a concurrent invalidation occurs, we retry building a relation descriptor. Thus, we end up with returning of a valid relation descriptor to caller. I wonder if we can take the same approach to type cache. That would make the whole type cache more consistent and less fragile. Also, this patch will be simpler. I think I understand the solution from the commit fdd965d074. Just for the record, you mentioned invalidation inside the lookup_type_cache above. Passing through the code, I found the only place for such a case - the call of the GetDefaultOpClass, which triggers the opening of the relation pg_opclass, which can cause an AcceptInvalidationMessages call. Did you mean this case, or does a wider field of cases exist here? -- regards, Andrei Lepikhov
Re: type cache cleanup improvements
On Mon, Aug 26, 2024 at 11:26 AM Alexander Korotkov wrote: > > On Mon, Aug 26, 2024 at 9:37 AM Andrei Lepikhov wrote: > > On 25/8/2024 23:22, Alexander Korotkov wrote: > > > On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov > > >>> (This Assert is introduced by c14d4acb8.) > > >> > > >> Thank you for noticing. I'm checking this. > > > > > > I didn't take into account that TypeCacheEntry could be invalidated > > > while lookup_type_cache() does syscache lookups. When I realized that > > > I was curious on how does it currently work. It appears that type > > > cache invalidation mostly only clears the flags while values are > > > remaining in place and still available for lookup_type_cache() caller. > > > TypeCacheEntry.tupDesc is invalidated directly, and it has guarantee > > > to survive only because we don't do any syscache lookups for composite > > > data types later in lookup_type_cache(). I'm becoming less fan of how > > > this works... I think these aspects needs to be at least documented > > > in details. > > > > > > Regarding c14d4acb8, it appears to require redesign. I'm going to revert > > > it. > > Sorry, but I don't understand your point. > > Let's refocus on the problem at hand. The issue arose when the > > TypeCacheTypCallback and the TypeCacheRelCallback were executed in > > sequence within InvalidateSystemCachesExtended. > > The first callback cleaned the flags TCFLAGS_HAVE_PG_TYPE_DATA and > > TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS. But the call of the second callback > > checks the typentry->tupDesc and, because it wasn't NULL, attempted to > > remove this record a second time. > > I think there is no case for redesign, but we have a mess in > > insertion/deletion conditions. > > Yes, it's possible to repair the current approach. But we need to do > this correct, not just "not failing with current usages". Then we > need to call insert_rel_type_cache_if_needed() not just when we set > TCFLAGS_HAVE_PG_TYPE_DATA flag, but every time we set any of > TCFLAGS_OPERATOR_FLAGS or tupDesc. That's a lot of places, not as > simple and elegant as it was planned. This is why I wonder if there > is a better approach. > > Secondly, I'm not terribly happy with current state of type cache. > The caller of lookup_type_cache() might get already invalidated data. > This probably OK, because caller probably hold locks on dependent > objects to guarantee that relevant properties of type actually > persists. At very least this should be documented, but it doesn't > seem so. Setting of tupdesc is sensitive to its order of execution. > That feels quite fragile to me, and not documented either. I think > this area needs improvements before we push additional functionality > there. I see fdd965d074 added a proper handling for concurrent invalidation for relation cache. If a concurrent invalidation occurs, we retry building a relation descriptor. Thus, we end up with returning of a valid relation descriptor to caller. I wonder if we can take the same approach to type cache. That would make the whole type cache more consistent and less fragile. Also, this patch will be simpler. -- Regards, Alexander Korotkov Supabase
Re: type cache cleanup improvements
On Mon, Aug 26, 2024 at 9:37 AM Andrei Lepikhov wrote: > On 25/8/2024 23:22, Alexander Korotkov wrote: > > On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov > >>> (This Assert is introduced by c14d4acb8.) > >> > >> Thank you for noticing. I'm checking this. > > > > I didn't take into account that TypeCacheEntry could be invalidated > > while lookup_type_cache() does syscache lookups. When I realized that > > I was curious on how does it currently work. It appears that type > > cache invalidation mostly only clears the flags while values are > > remaining in place and still available for lookup_type_cache() caller. > > TypeCacheEntry.tupDesc is invalidated directly, and it has guarantee > > to survive only because we don't do any syscache lookups for composite > > data types later in lookup_type_cache(). I'm becoming less fan of how > > this works... I think these aspects needs to be at least documented > > in details. > > > > Regarding c14d4acb8, it appears to require redesign. I'm going to revert > > it. > Sorry, but I don't understand your point. > Let's refocus on the problem at hand. The issue arose when the > TypeCacheTypCallback and the TypeCacheRelCallback were executed in > sequence within InvalidateSystemCachesExtended. > The first callback cleaned the flags TCFLAGS_HAVE_PG_TYPE_DATA and > TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS. But the call of the second callback > checks the typentry->tupDesc and, because it wasn't NULL, attempted to > remove this record a second time. > I think there is no case for redesign, but we have a mess in > insertion/deletion conditions. Yes, it's possible to repair the current approach. But we need to do this correct, not just "not failing with current usages". Then we need to call insert_rel_type_cache_if_needed() not just when we set TCFLAGS_HAVE_PG_TYPE_DATA flag, but every time we set any of TCFLAGS_OPERATOR_FLAGS or tupDesc. That's a lot of places, not as simple and elegant as it was planned. This is why I wonder if there is a better approach. Secondly, I'm not terribly happy with current state of type cache. The caller of lookup_type_cache() might get already invalidated data. This probably OK, because caller probably hold locks on dependent objects to guarantee that relevant properties of type actually persists. At very least this should be documented, but it doesn't seem so. Setting of tupdesc is sensitive to its order of execution. That feels quite fragile to me, and not documented either. I think this area needs improvements before we push additional functionality there. -- Regards, Alexander Korotkov Supabase
Re: type cache cleanup improvements
On 25/8/2024 23:22, Alexander Korotkov wrote: On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov (This Assert is introduced by c14d4acb8.) Thank you for noticing. I'm checking this. I didn't take into account that TypeCacheEntry could be invalidated while lookup_type_cache() does syscache lookups. When I realized that I was curious on how does it currently work. It appears that type cache invalidation mostly only clears the flags while values are remaining in place and still available for lookup_type_cache() caller. TypeCacheEntry.tupDesc is invalidated directly, and it has guarantee to survive only because we don't do any syscache lookups for composite data types later in lookup_type_cache(). I'm becoming less fan of how this works... I think these aspects needs to be at least documented in details. Regarding c14d4acb8, it appears to require redesign. I'm going to revert it. Sorry, but I don't understand your point. Let's refocus on the problem at hand. The issue arose when the TypeCacheTypCallback and the TypeCacheRelCallback were executed in sequence within InvalidateSystemCachesExtended. The first callback cleaned the flags TCFLAGS_HAVE_PG_TYPE_DATA and TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS. But the call of the second callback checks the typentry->tupDesc and, because it wasn't NULL, attempted to remove this record a second time. I think there is no case for redesign, but we have a mess in insertion/deletion conditions. -- regards, Andrei Lepikhov
Re: type cache cleanup improvements
On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov wrote: > On Sun, Aug 25, 2024 at 10:00 PM Alexander Lakhin wrote: > > 22.08.2024 19:52, Alexander Korotkov wrotd: > > > If no objections, I'm planning to push this after reverting PARTITION > > > SPLIT/MERGE. > > > > > > > Please try to perform `make check` against a CLOBBER_CACHE_ALWAYS build. > > trilobite failed it: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-08-25%2005%3A22%3A07 > > > > and I'm observing the same locally: > > ... > > #5 0x5636d37555f8 in ExceptionalCondition > > (conditionName=0x5636d39b1940 "found", > > fileName=0x5636d39b1308 "typcache.c", lineNumber=3077) at assert.c:66 > > #6 0x5636d37554a4 in delete_rel_type_cache_if_needed > > (typentry=0x5636d41d5d10) at typcache.c:3077 > > #7 0x5636d3754063 in InvalidateCompositeTypeCacheEntry > > (typentry=0x5636d41d5d10) at typcache.c:2355 > > #8 0x5636d37541d3 in TypeCacheRelCallback (arg=0, relid=0) at > > typcache.c:2441 > > ... > > > > (gdb) f 6 > > #6 0x5636d37554a4 in delete_rel_type_cache_if_needed > > (typentry=0x5636d41d5d10) at typcache.c:3077 > > 3077Assert(found); > > (gdb) p found > > $1 = false > > > > (This Assert is introduced by c14d4acb8.) > > Thank you for noticing. I'm checking this. I didn't take into account that TypeCacheEntry could be invalidated while lookup_type_cache() does syscache lookups. When I realized that I was curious on how does it currently work. It appears that type cache invalidation mostly only clears the flags while values are remaining in place and still available for lookup_type_cache() caller. TypeCacheEntry.tupDesc is invalidated directly, and it has guarantee to survive only because we don't do any syscache lookups for composite data types later in lookup_type_cache(). I'm becoming less fan of how this works... I think these aspects needs to be at least documented in details. Regarding c14d4acb8, it appears to require redesign. I'm going to revert it. -- Regards, Alexander Korotkov Supabase
Re: type cache cleanup improvements
On Sun, Aug 25, 2024 at 10:00 PM Alexander Lakhin wrote: > 22.08.2024 19:52, Alexander Korotkov wrotd: > > If no objections, I'm planning to push this after reverting PARTITION > > SPLIT/MERGE. > > > > Please try to perform `make check` against a CLOBBER_CACHE_ALWAYS build. > trilobite failed it: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-08-25%2005%3A22%3A07 > > and I'm observing the same locally: > ... > #5 0x5636d37555f8 in ExceptionalCondition (conditionName=0x5636d39b1940 > "found", > fileName=0x5636d39b1308 "typcache.c", lineNumber=3077) at assert.c:66 > #6 0x5636d37554a4 in delete_rel_type_cache_if_needed > (typentry=0x5636d41d5d10) at typcache.c:3077 > #7 0x5636d3754063 in InvalidateCompositeTypeCacheEntry > (typentry=0x5636d41d5d10) at typcache.c:2355 > #8 0x5636d37541d3 in TypeCacheRelCallback (arg=0, relid=0) at > typcache.c:2441 > ... > > (gdb) f 6 > #6 0x5636d37554a4 in delete_rel_type_cache_if_needed > (typentry=0x5636d41d5d10) at typcache.c:3077 > 3077Assert(found); > (gdb) p found > $1 = false > > (This Assert is introduced by c14d4acb8.) Thank you for noticing. I'm checking this. -- Regards, Alexander Korotkov Supabase
Re: type cache cleanup improvements
Hello Alexander, 22.08.2024 19:52, Alexander Korotkov wrotd: If no objections, I'm planning to push this after reverting PARTITION SPLIT/MERGE. Please try to perform `make check` against a CLOBBER_CACHE_ALWAYS build. trilobite failed it: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-08-25%2005%3A22%3A07 and I'm observing the same locally: ... #5 0x5636d37555f8 in ExceptionalCondition (conditionName=0x5636d39b1940 "found", fileName=0x5636d39b1308 "typcache.c", lineNumber=3077) at assert.c:66 #6 0x5636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077 #7 0x5636d3754063 in InvalidateCompositeTypeCacheEntry (typentry=0x5636d41d5d10) at typcache.c:2355 #8 0x5636d37541d3 in TypeCacheRelCallback (arg=0, relid=0) at typcache.c:2441 ... (gdb) f 6 #6 0x5636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077 3077 Assert(found); (gdb) p found $1 = false (This Assert is introduced by c14d4acb8.) Best regards, Alexander
Re: type cache cleanup improvements
Hi! On Thu, Aug 22, 2024 at 1:02 PM Pavel Borisov wrote: Looked at v9: > Patch looks good to me. I'd only suggest comments changes: > > "The map from relation's OID to the corresponding composite type OID" -> "The > mapping of relation's OID to the corresponding composite type OID" > "We're keeping the map entry when corresponding typentry have either > TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc. That is > we're keeping map entry if the entry has something to clear." -> "We're > keeping the map entry when the corresponding typentry has something to clear > i.e it has either TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or > tupdesc." > "Invalidate particular TypeCacheEntry on Relcache inval callback" - remove > extra tabs before. Maybe also add empty line above. > "Typically shouldn't be a problem" -> "Typically this shouldn't affect > performance" > "Relid = 0, so we need" -> "Relid is invalid. By convention we need" > "if cleaned TCFLAGS_HAVE_PG_TYPE_DATA flag" -> "if we cleaned > TCFLAGS_HAVE_PG_TYPE_DATA flag previously" > "+/* > + * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the > + * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags, > + * or tupDesc if needed." - remove one "if needed" Thank you for your feedback. I've integrated all your edits except the formatting change of InvalidateCompositeTypeCacheEntry() header comment. I think the functions below have the same formatting of header comments, and it's not necessary to change format. If no objections, I'm planning to push this after reverting PARTITION SPLIT/MERGE. -- Regards, Alexander Korotkov Supabase v10-0001-Avoid-looping-over-all-type-cache-entries-in-Typ.patch Description: Binary data
Re: type cache cleanup improvements
On 21/8/2024 17:28, Alexander Korotkov wrote: I've changed oid -> OID in the comments and in the commit message. I passed through the patch again: no objections and +1 to the changes of comments proposed by Pavel. -- regards, Andrei Lepikhov
Re: type cache cleanup improvements
Hi, Alexander! On Wed, 21 Aug 2024 at 19:29, Alexander Korotkov wrote: > Hi, Pavel! > > > On Wed, Aug 21, 2024 at 4:28 PM Pavel Borisov > wrote: > > I've looked at patch v8. > > > > 1. > > In function check_insert_rel_type_cache() the block: > > > > +#ifdef USE_ASSERT_CHECKING > > + > > + /* > > +* In assert-enabled builds otherwise check for > RelIdToTypeIdCacheHash > > +* entry if it should exist. > > +*/ > > + if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) && > > + typentry->tupDesc == NULL) > > + { > > + boolfound; > > + > > + (void) hash_search(RelIdToTypeIdCacheHash, > > + &typentry->typrelid, > > + HASH_FIND, &found); > > + Assert(found); > > + } > > +#endif > > > > As I understand it does HASH_FIND after the same value just inserted by > HASH_ENT > > ER above under the same if condition: > > > > if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) && > > + typentry->tupDesc == NULL) > > > > Why do we need to do this re-check HASH_ENTER? Also I see "otherwise" in > comment in a quoted block, but if condition is the same. > > Yep, these are remains from one of my previous attempt. No sense to > check for HASH_FIND right after HASH_ENTER. Removed. > > > 2. > > For function check_delete_rel_type_cache(): > > I'd modify the block: > > +#ifdef USE_ASSERT_CHECKING > > + > > + /* > > +* In assert-enabled builds otherwise check for > RelIdToTypeIdCacheHash > > +* entry if it should exist. > > +*/ > > + if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) || > > + (typentry->flags & TCFLAGS_OPERATOR_FLAGS) || > > + typentry->tupDesc != NULL) > > + { > > + boolfound; > > + > > + (void) hash_search(RelIdToTypeIdCacheHash, > > + &typentry->typrelid, > > + HASH_FIND, &found); > > + Assert(found); > > + } > > +#endif > > > > as: > > + > > + /* > > +* In assert-enabled builds otherwise check for > RelIdToTypeIdCacheHash > > +* entry if it should exist. > > +*/ > > + else > > +{ > > + #ifdef USE_ASSERT_CHECKING > > + boolfound; > > + > > + (void) hash_search(RelIdToTypeIdCacheHash, > > + &typentry->typrelid, > > + HASH_FIND, &found); > > + Assert(found); > > +#endif > > +} > > Changed in the way you proposed, except I put the comment inside the > #ifdef. I this it's easier to understand this way. > > > 3. I think check_delete_rel_type_cache and check_insert_rel_type_cache > are better to be renamed to be more clear, though I don't have exact > proposals yet, > > Renamed to delete_rel_type_cache_if_needed and > insert_rel_type_cache_if_needed. I've checked that > > > 4. I haven't looked into comments, though I'd recommend oid -> OID > replacement in the comments. > > I've changed oid -> OID in the comments and in the commit message. > > > Thank you for working on this patchset! > > Thank you for review! > Looked at v9: Patch looks good to me. I'd only suggest comments changes: "The map from relation's OID to the corresponding composite type OID" -> "The mapping of relation's OID to the corresponding composite type OID" "We're keeping the map entry when corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc. That is we're keeping map entry if the entry has something to clear." -> "We're keeping the map entry when the corresponding typentry has something to clear i.e it has either TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc." "Invalidate particular TypeCacheEntry on Relcache inval callback" - remove extra tabs before. Maybe also add empty line above. "Typically shouldn't be a problem" -> "Typically this shouldn't affect performance" "Relid = 0, so we need" -> "Relid is invalid. By convention we need" "if cleaned TCFLAGS_HAVE_PG_TYPE_DATA flag" -> "if we cleaned TCFLAGS_HAVE_PG_TYPE_DATA flag previously" "+/* + * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the + * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags, + * or tupDesc if needed." - remove one "if needed" Regards, Pavel Borisov Supabase
Re: type cache cleanup improvements
Hi, Pavel! On Wed, Aug 21, 2024 at 4:28 PM Pavel Borisov wrote: > I've looked at patch v8. > > 1. > In function check_insert_rel_type_cache() the block: > > +#ifdef USE_ASSERT_CHECKING > + > + /* > +* In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash > +* entry if it should exist. > +*/ > + if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) && > + typentry->tupDesc == NULL) > + { > + boolfound; > + > + (void) hash_search(RelIdToTypeIdCacheHash, > + &typentry->typrelid, > + HASH_FIND, &found); > + Assert(found); > + } > +#endif > > As I understand it does HASH_FIND after the same value just inserted by > HASH_ENT > ER above under the same if condition: > > if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) && > + typentry->tupDesc == NULL) > > Why do we need to do this re-check HASH_ENTER? Also I see "otherwise" in > comment in a quoted block, but if condition is the same. Yep, these are remains from one of my previous attempt. No sense to check for HASH_FIND right after HASH_ENTER. Removed. > 2. > For function check_delete_rel_type_cache(): > I'd modify the block: > +#ifdef USE_ASSERT_CHECKING > + > + /* > +* In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash > +* entry if it should exist. > +*/ > + if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) || > + (typentry->flags & TCFLAGS_OPERATOR_FLAGS) || > + typentry->tupDesc != NULL) > + { > + boolfound; > + > + (void) hash_search(RelIdToTypeIdCacheHash, > + &typentry->typrelid, > + HASH_FIND, &found); > + Assert(found); > + } > +#endif > > as: > + > + /* > +* In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash > +* entry if it should exist. > +*/ > + else > +{ > + #ifdef USE_ASSERT_CHECKING > + boolfound; > + > + (void) hash_search(RelIdToTypeIdCacheHash, > + &typentry->typrelid, > + HASH_FIND, &found); > + Assert(found); > +#endif > +} Changed in the way you proposed, except I put the comment inside the #ifdef. I this it's easier to understand this way. > 3. I think check_delete_rel_type_cache and check_insert_rel_type_cache are > better to be renamed to be more clear, though I don't have exact proposals > yet, Renamed to delete_rel_type_cache_if_needed and insert_rel_type_cache_if_needed. I've checked that > 4. I haven't looked into comments, though I'd recommend oid -> OID > replacement in the comments. I've changed oid -> OID in the comments and in the commit message. > Thank you for working on this patchset! Thank you for review! -- Regards, Alexander Korotkov Supabase v9-0001-Avoid-looping-over-all-type-cache-entries-in-Type.patch Description: Binary data
Re: type cache cleanup improvements
Hi, Alexander! On Tue, 20 Aug 2024 at 23:01, Alexander Korotkov wrote: > On Mon, Aug 5, 2024 at 4:16 AM Alexander Korotkov > wrote: > > I've revised the patchset. First of all, I've re-ordered the patches. > > > > 0001-0002 (former 0002-0003) > > Comprises hash_search_with_hash_value() function and its application > > to avoid full hash iteration in InvalidateAttoptCacheCallback() and > > TypeCacheTypCallback(). I think this is quite straightforward > > optimization without negative side effects. I've revised comments, > > commit message and did some code beautification. I'm going to push > > this if no objections. > > > > 0003 (former 0001) > > I've revised this patch. I think main concerns expressed in the > > thread about this path is that we don't have invalidation mechanism > > for relid => typid map. Finally due to oid wraparound same relids > > could get reused. That could lead to invalid entries in the map about > > existing relids and typeids. This is rather messy, but I don't think > > this could cause a material bug. The maps items are used only for > > cache invalidation. Extra invalidation doesn't cause a bug. If type > > with same relid will be cached, then correspoding map item will be > > overridden, so no missing invalidation. However, I see the following > > reasons for keeping consistent state of relid => typid map. > > > > 1) As the main use-case for this optimization is flood of temporary > > tables, it would be nice not let relid => typid map bloat in this > > case. I see that TypeCacheHash would get bloated, because its entries > > are never deleted. However, I would prefer to not get this situation > > even worse. > > 2) In future we may find some more use-cases for relid => typid map > > besides cache invalidation. Keeping that in consistent state could be > > advantage then. > > > > In the attached patch, I'm keeping relid => typid map when > > corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or > > TCFLAGS_OPERATOR_FLAGS, or tupdesc. Thus, when temporary table gets > > deleted, we would invalidate the map item. > > > > It will be also nice to get rid of iteration over all the cached > > domain types in TypeCacheRelCallback(). However, this typically > > shouldn't be a problem since domain types are less tended to bloat. > > Domain types are created manually, unlike composite types which are > > automatically created for every temporary table. We will probably > > need to optimize this in future, but I don't feel this to be necessary > > in present patch. > > > > I think the revised 0003 requires review. > > The rebased remaining patch is attached. > I've looked at patch v8. 1. In function check_insert_rel_type_cache() the block: +#ifdef USE_ASSERT_CHECKING + + /* +* In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash +* entry if it should exist. +*/ + if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) && + typentry->tupDesc == NULL) + { + boolfound; + + (void) hash_search(RelIdToTypeIdCacheHash, + &typentry->typrelid, + HASH_FIND, &found); + Assert(found); + } +#endif As I understand it does HASH_FIND after the same value just inserted by HASH_ENT ER above under the same if condition: if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) && + typentry->tupDesc == NULL) Why do we need to do this re-check HASH_ENTER? Also I see "otherwise" in comment in a quoted block, but if condition is the same. 2. For function check_delete_rel_type_cache(): I'd modify the block: +#ifdef USE_ASSERT_CHECKING + + /* +* In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash +* entry if it should exist. +*/ + if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) || + (typentry->flags & TCFLAGS_OPERATOR_FLAGS) || + typentry->tupDesc != NULL) + { + boolfound; + + (void) hash_search(RelIdToTypeIdCacheHash, + &typentry->typrelid, + HASH_FIND, &found); + Assert(found); + } +#endif as: + + /* +* In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash +* entry if it should exist. +*/ + else +{ + #ifdef USE_ASSERT_CHECKING + boolfound; + + (void) hash_search(RelIdToTypeIdCacheHash, + &typentry->typrelid, + HASH_FIND, &found); + Assert(found); +#endif +} 3. I think check_delete_rel_type_cache and check_insert_rel_type_cache are better to be renamed to be more clear, though I don't have exact proposals yet, 4. I h
Re: type cache cleanup improvements
On Mon, Aug 5, 2024 at 4:16 AM Alexander Korotkov wrote: > I've revised the patchset. First of all, I've re-ordered the patches. > > 0001-0002 (former 0002-0003) > Comprises hash_search_with_hash_value() function and its application > to avoid full hash iteration in InvalidateAttoptCacheCallback() and > TypeCacheTypCallback(). I think this is quite straightforward > optimization without negative side effects. I've revised comments, > commit message and did some code beautification. I'm going to push > this if no objections. > > 0003 (former 0001) > I've revised this patch. I think main concerns expressed in the > thread about this path is that we don't have invalidation mechanism > for relid => typid map. Finally due to oid wraparound same relids > could get reused. That could lead to invalid entries in the map about > existing relids and typeids. This is rather messy, but I don't think > this could cause a material bug. The maps items are used only for > cache invalidation. Extra invalidation doesn't cause a bug. If type > with same relid will be cached, then correspoding map item will be > overridden, so no missing invalidation. However, I see the following > reasons for keeping consistent state of relid => typid map. > > 1) As the main use-case for this optimization is flood of temporary > tables, it would be nice not let relid => typid map bloat in this > case. I see that TypeCacheHash would get bloated, because its entries > are never deleted. However, I would prefer to not get this situation > even worse. > 2) In future we may find some more use-cases for relid => typid map > besides cache invalidation. Keeping that in consistent state could be > advantage then. > > In the attached patch, I'm keeping relid => typid map when > corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or > TCFLAGS_OPERATOR_FLAGS, or tupdesc. Thus, when temporary table gets > deleted, we would invalidate the map item. > > It will be also nice to get rid of iteration over all the cached > domain types in TypeCacheRelCallback(). However, this typically > shouldn't be a problem since domain types are less tended to bloat. > Domain types are created manually, unlike composite types which are > automatically created for every temporary table. We will probably > need to optimize this in future, but I don't feel this to be necessary > in present patch. > > I think the revised 0003 requires review. The rebased remaining patch is attached. -- Regards, Alexander Korotkov Supabase v8-0001-Avoid-looping-over-all-type-cache-entries-in-Type.patch Description: Binary data
Re: type cache cleanup improvements
Hi! On Wed, Apr 3, 2024 at 9:07 AM Andrei Lepikhov wrote: > On 3/15/24 17:57, Teodor Sigaev wrote: > >> Okay, I've applied this piece for now. Not sure I'll have much room > >> to look at the rest. > > > > Thank you very much! > I have spent some time reviewing this feature. I think we can discuss > and apply it step-by-step. So, the 0001-* patch is at this moment. > The feature addresses the issue of TypCache being bloated by intensive > usage of non-standard types and domains. It adds significant overhead > during relcache invalidation by thoroughly scanning this hash table. > IMO, this feature will be handy soon, as we already see some patches > where TypCache is intensively used for storing composite types—for > example, look into solutions proposed in [1]. > One of my main concerns with this feature is the possibility of lost > entries, which could be mistakenly used by relations with the same oid > in the future. This seems particularly possible in cases with multiple > temporary tables. The author has attempted to address this by replacing > the typrelid and type_id fields in the mapRelType on each call of > lookup_type_cache. However, I believe we could further improve this by > removing the entry from mapRelType on invalidation, thus avoiding this > potential issue. > While reviewing the patch, I made some minor changes (see attachment) > that you're free to adopt or reject. However, it's crucial that the > patch includes a detailed explanation, not just a single sentence, to > ensure everyone understands the changes. > Upon closer inspection, I noticed that the current implementation only > invalidates the cache entry. While this is acceptable for standard > types, it may not be sufficient to maintain numerous custom types (as in > the example in the initial letter) or in cases where whole-row vars are > heavily used. In such scenarios, removing the entry and reducing the > hash table's size might be more efficient. > In toto, the 0001-* patch looks good, and I would be glad to see it in > the core. I've revised the patchset. First of all, I've re-ordered the patches. 0001-0002 (former 0002-0003) Comprises hash_search_with_hash_value() function and its application to avoid full hash iteration in InvalidateAttoptCacheCallback() and TypeCacheTypCallback(). I think this is quite straightforward optimization without negative side effects. I've revised comments, commit message and did some code beautification. I'm going to push this if no objections. 0003 (former 0001) I've revised this patch. I think main concerns expressed in the thread about this path is that we don't have invalidation mechanism for relid => typid map. Finally due to oid wraparound same relids could get reused. That could lead to invalid entries in the map about existing relids and typeids. This is rather messy, but I don't think this could cause a material bug. The maps items are used only for cache invalidation. Extra invalidation doesn't cause a bug. If type with same relid will be cached, then correspoding map item will be overridden, so no missing invalidation. However, I see the following reasons for keeping consistent state of relid => typid map. 1) As the main use-case for this optimization is flood of temporary tables, it would be nice not let relid => typid map bloat in this case. I see that TypeCacheHash would get bloated, because its entries are never deleted. However, I would prefer to not get this situation even worse. 2) In future we may find some more use-cases for relid => typid map besides cache invalidation. Keeping that in consistent state could be advantage then. In the attached patch, I'm keeping relid => typid map when corresponding typentry have either TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc. Thus, when temporary table gets deleted, we would invalidate the map item. It will be also nice to get rid of iteration over all the cached domain types in TypeCacheRelCallback(). However, this typically shouldn't be a problem since domain types are less tended to bloat. Domain types are created manually, unlike composite types which are automatically created for every temporary table. We will probably need to optimize this in future, but I don't feel this to be necessary in present patch. I think the revised 0003 requires review. -- Regards, Alexander Korotkov Supabase v7-0001-Introduce-hash_search_with_hash_value-function.patch Description: Binary data v7-0002-Optimize-InvalidateAttoptCacheCallback-and-TypeCa.patch Description: Binary data v7-0003-Avoid-looping-over-all-type-cache-entries-in-Type.patch Description: Binary data
Re: type cache cleanup improvements
On 3/15/24 17:57, Teodor Sigaev wrote: Okay, I've applied this piece for now. Not sure I'll have much room to look at the rest. Thank you very much! I have spent some time reviewing this feature. I think we can discuss and apply it step-by-step. So, the 0001-* patch is at this moment. The feature addresses the issue of TypCache being bloated by intensive usage of non-standard types and domains. It adds significant overhead during relcache invalidation by thoroughly scanning this hash table. IMO, this feature will be handy soon, as we already see some patches where TypCache is intensively used for storing composite types—for example, look into solutions proposed in [1]. One of my main concerns with this feature is the possibility of lost entries, which could be mistakenly used by relations with the same oid in the future. This seems particularly possible in cases with multiple temporary tables. The author has attempted to address this by replacing the typrelid and type_id fields in the mapRelType on each call of lookup_type_cache. However, I believe we could further improve this by removing the entry from mapRelType on invalidation, thus avoiding this potential issue. While reviewing the patch, I made some minor changes (see attachment) that you're free to adopt or reject. However, it's crucial that the patch includes a detailed explanation, not just a single sentence, to ensure everyone understands the changes. Upon closer inspection, I noticed that the current implementation only invalidates the cache entry. While this is acceptable for standard types, it may not be sufficient to maintain numerous custom types (as in the example in the initial letter) or in cases where whole-row vars are heavily used. In such scenarios, removing the entry and reducing the hash table's size might be more efficient. In toto, the 0001-* patch looks good, and I would be glad to see it in the core. [1] https://www.postgresql.org/message-id/flat/CAKcux6ktu-8tefLWtQuuZBYFaZA83vUzuRd7c1YHC-yEWyYFpg%40mail.gmail.com -- regards, Andrei Lepikhov Postgres Professional diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c index e3c32c7848..ed321603d5 100644 --- a/src/backend/utils/cache/typcache.c +++ b/src/backend/utils/cache/typcache.c @@ -74,16 +74,17 @@ #include "utils/typcache.h" -/* The main type cache hashtable searched by lookup_type_cache */ -static HTAB *TypeCacheHash = NULL; - /* The map from relation's oid to its type oid */ -typedef struct mapRelTypeEntry +typedef struct RelTypeMapEntry { Oid typrelid; Oid type_id; -} mapRelTypeEntry; +} RelTypeMapEntry; + +/* The main type cache hashtable searched by lookup_type_cache */ +static HTAB *TypeCacheHash = NULL; +/* Utility hash table to speed up processing of invalidation relcache events. */ static HTAB *mapRelType = NULL; /* List of type cache entries for domain types */ @@ -368,7 +369,7 @@ lookup_type_cache(Oid type_id, int flags) &ctl, HASH_ELEM | HASH_BLOBS); ctl.keysize = sizeof(Oid); - ctl.entrysize = sizeof(mapRelTypeEntry); + ctl.entrysize = sizeof(RelTypeMapEntry); mapRelType = hash_create("Map reloid to typeoid", 64, &ctl, HASH_ELEM | HASH_BLOBS); @@ -492,11 +493,11 @@ lookup_type_cache(Oid type_id, int flags) */ if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE) { - mapRelTypeEntry *relentry; + RelTypeMapEntry *relentry; - relentry = (mapRelTypeEntry*) hash_search(mapRelType, - &typentry->typrelid, - HASH_ENTER, NULL); + relentry = (RelTypeMapEntry *) hash_search(mapRelType, + &typentry->typrelid, + HASH_ENTER, NULL); relentry->typrelid = typentry->typrelid; relentry->type_id = typentry->type_id; @@ -2297,7 +2298,7 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry) } static void -invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry) +invalidateTypeCacheEntry(TypeCacheEntry *typentry) { /* Delete tupdesc if we have it */ if (typentry->tupDesc != NULL) @@ -2348,11 +2349,11 @@ TypeCacheRelCallback(Datum arg, Oid relid) if (OidIsValid(relid)) { - mapRelTypeEntry *relentry; + RelTypeMapEntry *relentry; - relentry = (mapRelTypeEntry *) hash_search(mapRelType, - &relid, - HASH_FIND, NULL); + relentry = (RelTypeMapEntry *) hash_search(mapRelType, + &relid, + HASH_FIND, NULL); if (relentry != NULL) { @@ -2365,7 +2366,7 @@ TypeCacheRelCallback(Datum arg, Oid relid) Assert(typentry->typtype == TYPTYPE_COMPOSITE); Assert(relid == typentry->typrelid); -invalidateCompositeTypeCacheEntry(typentry); +invalidateTypeCacheEntry(typentry); } } @@ -2397,7 +2398,7 @@ TypeCacheRelCallback(Datum arg, Oid relid) { if (typentry->typtype == TYPTYPE_COMPOSITE) { -invalidateCompositeTypeCacheEntry(typentry); +invalidateTypeCach
Re: type cache cleanup improvements
> Rest of patches, rebased. Hello, I read and tested only the first patch so far. Creation of temp tables and rollback in your script work 3-4 times faster with 0001-type-cache.patch on my windows laptop. In the patch I found a copy of the comment "If it's domain over composite, reset flags...". Can we move the reset flags operation and its comment into the invalidateCompositeTypeCacheEntry() function? This simplify the TypeCacheRelCallback() func, but adds two more IF statements when we need to clean up a cache entry for a specific relation. (diff attached). -- Roman Zharkov mapRelType-v2.patch Description: Binary data
Re: type cache cleanup improvements
Okay, I've applied this piece for now. Not sure I'll have much room to look at the rest. Thank you very much! Rest of patches, rebased. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/From b30af080d7768c2fdb6198e2e40ef93419f60732 Mon Sep 17 00:00:00 2001 From: Teodor Sigaev Date: Fri, 15 Mar 2024 13:55:10 +0300 Subject: [PATCH 3/3] usage of hash_search_with_hash_value --- src/backend/utils/cache/attoptcache.c | 39 src/backend/utils/cache/typcache.c| 52 +++ 2 files changed, 70 insertions(+), 21 deletions(-) diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c index af978ccd4b1..3a18b2e9a77 100644 --- a/src/backend/utils/cache/attoptcache.c +++ b/src/backend/utils/cache/attoptcache.c @@ -44,12 +44,10 @@ typedef struct /* * InvalidateAttoptCacheCallback - * Flush all cache entries when pg_attribute is updated. + * Flush cache entry (or entries) when pg_attribute is updated. * * When pg_attribute is updated, we must flush the cache entry at least - * for that attribute. Currently, we just flush them all. Since attribute - * options are not currently used in performance-critical paths (such as - * query execution), this seems OK. + * for that attribute. */ static void InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) @@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) HASH_SEQ_STATUS status; AttoptCacheEntry *attopt; - hash_seq_init(&status, AttoptCacheHash); + /* + * By convection, zero hash value is passed to the callback as a sign + * that it's time to invalidate the cache. See sinval.c, inval.c and + * InvalidateSystemCachesExtended(). + */ + if (hashvalue == 0) + hash_seq_init(&status, AttoptCacheHash); + else + hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue); + while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL) { if (attopt->opts) @@ -70,6 +77,18 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) } } +/* + * Hash function compatible with two-arg system cache hash function. + */ +static uint32 +relatt_cache_syshash(const void *key, Size keysize) +{ + const AttoptCacheKey* ckey = key; + + Assert(keysize == sizeof(*ckey)); + return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum); +} + /* * InitializeAttoptCache * Initialize the attribute options cache. @@ -82,9 +101,17 @@ InitializeAttoptCache(void) /* Initialize the hash table. */ ctl.keysize = sizeof(AttoptCacheKey); ctl.entrysize = sizeof(AttoptCacheEntry); + + /* + * AttoptCacheEntry takes hash value from the system cache. For + * AttoptCacheHash we use the same hash in order to speedup search by hash + * value. This is used by hash_seq_init_with_hash_value(). + */ + ctl.hash = relatt_cache_syshash; + AttoptCacheHash = hash_create("Attopt cache", 256, &ctl, - HASH_ELEM | HASH_BLOBS); + HASH_ELEM | HASH_FUNCTION); /* Make sure we've initialized CacheMemoryContext. */ if (!CacheMemoryContext) diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c index 7936c3b46d0..9145088f44d 100644 --- a/src/backend/utils/cache/typcache.c +++ b/src/backend/utils/cache/typcache.c @@ -339,6 +339,15 @@ static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc); static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc, uint32 typmod); +/* + * Hash function compatible with one-arg system cache hash function. + */ +static uint32 +type_cache_syshash(const void *key, Size keysize) +{ + Assert(keysize == sizeof(Oid)); + return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid*)key)); +} /* * lookup_type_cache @@ -364,8 +373,15 @@ lookup_type_cache(Oid type_id, int flags) ctl.keysize = sizeof(Oid); ctl.entrysize = sizeof(TypeCacheEntry); + /* + * TypeEntry takes hash value from the system cache. For TypeCacheHash + * we use the same hash in order to speedup search by hash value. This + * is used by hash_seq_init_with_hash_value(). + */ + ctl.hash = type_cache_syshash; + TypeCacheHash = hash_create("Type information cache", 64, - &ctl, HASH_ELEM | HASH_BLOBS); + &ctl, HASH_ELEM | HASH_FUNCTION); ctl.keysize = sizeof(Oid); ctl.entrysize = sizeof(mapRelTypeEntry); @@ -421,8 +437,7 @@ lookup_type_cache(Oid type_id, int flags) /* These fields can never change, by definition */ typentry->type_id = type_id; - typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID, - ObjectIdGetDatum(type_id)); + typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id); /* Keep this part in sync with the code below */ typentry->typlen = typtup->typlen; @@ -2429,20 +2444,27 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint
Re: type cache cleanup improvements
On Thu, Mar 14, 2024 at 04:27:43PM +0300, Teodor Sigaev wrote: >> So I would like to suggest the attached patch for this first piece. >> What do you think? > > I have not any objections Okay, I've applied this piece for now. Not sure I'll have much room to look at the rest. -- Michael signature.asc Description: PGP signature
Re: type cache cleanup improvements
One thing that first pops out to me is that we can do the refactor of hash_initial_lookup() as an independent piece, without the extra paths introduced. But rather than returning the bucket hash and have the bucket number as an in/out argument of hash_initial_lookup(), there is an argument for reversing them: hash_search_with_hash_value() does not care about the bucket number. Ok, no problem 02-hash_seq_init_with_hash_value.v5.patch - introduces a hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as inline, but I suppose, modern compilers are smart enough to inline it automatically. Likely so, though that does not hurt to show the intention to the reader. Agree So I would like to suggest the attached patch for this first piece. What do you think? I have not any objections It may also be an idea to use `git format-patch` when generating a series of patches. That makes for easier reviews. Thanks, will try -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: type cache cleanup improvements
On Wed, Mar 13, 2024 at 04:40:38PM +0300, Teodor Sigaev wrote: > Done, all patches should be applied consequentially. One thing that first pops out to me is that we can do the refactor of hash_initial_lookup() as an independent piece, without the extra paths introduced. But rather than returning the bucket hash and have the bucket number as an in/out argument of hash_initial_lookup(), there is an argument for reversing them: hash_search_with_hash_value() does not care about the bucket number. > 02-hash_seq_init_with_hash_value.v5.patch - introduces a > hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as > inline, but I suppose, modern compilers are smart enough to inline it > automatically. Likely so, though that does not hurt to show the intention to the reader. So I would like to suggest the attached patch for this first piece. What do you think? It may also be an idea to use `git format-patch` when generating a series of patches. That makes for easier reviews. -- Michael From 6b1fe126b9f72ff27aca08128948f4e617ba70dd Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Thu, 14 Mar 2024 16:35:40 +0900 Subject: [PATCH] Refactor initial hash lookup in dynahash.c --- src/backend/utils/hash/dynahash.c | 75 ++- 1 file changed, 33 insertions(+), 42 deletions(-) diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c index a4152080b5..e1bd92a01c 100644 --- a/src/backend/utils/hash/dynahash.c +++ b/src/backend/utils/hash/dynahash.c @@ -273,6 +273,8 @@ static void hdefault(HTAB *hashp); static int choose_nelem_alloc(Size entrysize); static bool init_htab(HTAB *hashp, long nelem); static void hash_corrupted(HTAB *hashp); +static uint32 hash_initial_lookup(HTAB *hashp, uint32 hashvalue, + HASHBUCKET **bucketptr); static long next_pow2_long(long num); static int next_pow2_int(long num); static void register_seq_scan(HTAB *hashp); @@ -972,10 +974,6 @@ hash_search_with_hash_value(HTAB *hashp, HASHHDR*hctl = hashp->hctl; int freelist_idx = FREELIST_IDX(hctl, hashvalue); Size keysize; - uint32 bucket; - long segment_num; - long segment_ndx; - HASHSEGMENT segp; HASHBUCKET currBucket; HASHBUCKET *prevBucketPtr; HashCompareFunc match; @@ -1008,17 +1006,7 @@ hash_search_with_hash_value(HTAB *hashp, /* * Do the initial lookup */ - bucket = calc_bucket(hctl, hashvalue); - - segment_num = bucket >> hashp->sshift; - segment_ndx = MOD(bucket, hashp->ssize); - - segp = hashp->dir[segment_num]; - - if (segp == NULL) - hash_corrupted(hashp); - - prevBucketPtr = &segp[segment_ndx]; + (void) hash_initial_lookup(hashp, hashvalue, &prevBucketPtr); currBucket = *prevBucketPtr; /* @@ -1159,14 +1147,10 @@ hash_update_hash_key(HTAB *hashp, const void *newKeyPtr) { HASHELEMENT *existingElement = ELEMENT_FROM_KEY(existingEntry); - HASHHDR*hctl = hashp->hctl; uint32 newhashvalue; Size keysize; uint32 bucket; uint32 newbucket; - long segment_num; - long segment_ndx; - HASHSEGMENT segp; HASHBUCKET currBucket; HASHBUCKET *prevBucketPtr; HASHBUCKET *oldPrevPtr; @@ -1187,17 +1171,8 @@ hash_update_hash_key(HTAB *hashp, * this to be able to unlink it from its hash chain, but as a side benefit * we can verify the validity of the passed existingEntry pointer. */ - bucket = calc_bucket(hctl, existingElement->hashvalue); - - segment_num = bucket >> hashp->sshift; - segment_ndx = MOD(bucket, hashp->ssize); - - segp = hashp->dir[segment_num]; - - if (segp == NULL) - hash_corrupted(hashp); - - prevBucketPtr = &segp[segment_ndx]; + bucket = hash_initial_lookup(hashp, existingElement->hashvalue, + &prevBucketPtr); currBucket = *prevBucketPtr; while (currBucket != NULL) @@ -1219,18 +1194,7 @@ hash_update_hash_key(HTAB *hashp, * chain we want to put the entry into. */ newhashvalue = hashp->hash(newKeyPtr, hashp->keysize); - - newbucket = calc_bucket(hctl, newhashvalue); - - segment_num = newbucket >> hashp->sshift; - segment_ndx = MOD(newbucket, hashp->ssize); - - segp = hashp->dir[segment_num]; - - if (segp == NULL) - hash_corrupted(hashp); - - prevBucketPtr = &segp[segment_ndx]; + newbucket = hash_initial_lookup(hashp, newhashvalue, &prevBucketPtr); currBucket = *prevBucketPtr; /* @@ -1741,6 +1705,33 @@ element_alloc(HTAB *hashp, int nelem, int freelist_idx) return true; } +/* + * Do initial lookup of a bucket for the given hash value, retrieving its + * bucket number and its hash bucket. + */ +static inline uint32 +hash_initial_lookup(HTAB *hashp, uint32 hashvalue, HASHBUCKET **bucketptr) +{ + HASHHDR*hctl = hashp->hctl; + HASHSEGMENT segp; + long segment_num; + long segment_ndx; + uint32 bucket; + + bucket = calc_bucket(hctl, hashvalue); + + segment_num = bucket >> hashp->sshift; + segment_ndx = MOD(bucket, hashp->ssize); + + segp = hashp->dir[segment_num]; + + if (segp == NULL) + hash_corrupted(hashp); + + *bucketptr = &segp[segmen
Re: type cache cleanup improvements
I think that this patch should be split for clarity, as there are a few things that are independently useful. I guess something like that: Done, all patches should be applied consequentially. > - The typcache changes. 01-map_rel_to_type.v5.patch adds map relation to its type - Introduction of hash_initial_lookup(), that simplifies 3 places of dynahash.c where the same code is used. The routine should be inlined. - The split in hash_seq_search to force a different type of search is weird, complicating the dynahash interface by hiding what seems like a search mode. Rather than hasHashvalue that's hidden in the middle of HASH_SEQ_STATUS, could it be better to have an entirely different API for the search? That should be a patch on its own, as well. 02-hash_seq_init_with_hash_value.v5.patch - introduces a hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as inline, but I suppose, modern compilers are smart enough to inline it automatically. Using separate interface for scanning hash with hash value will make scan code more ugly in case when we need to use special value of hash value as it is done in cache's scans. Look, instead of this simplified code: if (hashvalue == 0) hash_seq_init(&status, TypeCacheHash); else hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue); while ((typentry = hash_seq_search(&status)) != NULL) { ... } we will need to code something like that: if (hashvalue == 0) { hash_seq_init(&status, TypeCacheHash); while ((typentry = hash_seq_search(&status)) != NULL) { ... } } else { hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue); while ((typentry = hash_seq_search_with_hash_value(&status)) != NULL) { ... } } Or I didn't understand you. I thought about integrate check inside existing loop in hash_seq_search() : + rerun: if ((curElem = status->curEntry) != NULL) { /* Continuing scan of curBucket... */ status->curEntry = curElem->link; if (status->curEntry == NULL) /* end of this bucket */ + { + if (status->hasHashvalue) + hash_seq_term(status); ++status->curBucket; + } + else if (status->hasHashvalue && status->hashvalue != +curElem->hashvalue) + goto rerun; return (void *) ELEMENTKEY(curElem); } But for me it looks weird and adds some checks which will takes some CPU time. 03-att_with_hash_value.v5.patch - adds usage of previous patch. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c index af978ccd4b1..3a18b2e9a77 100644 --- a/src/backend/utils/cache/attoptcache.c +++ b/src/backend/utils/cache/attoptcache.c @@ -44,12 +44,10 @@ typedef struct /* * InvalidateAttoptCacheCallback - * Flush all cache entries when pg_attribute is updated. + * Flush cache entry (or entries) when pg_attribute is updated. * * When pg_attribute is updated, we must flush the cache entry at least - * for that attribute. Currently, we just flush them all. Since attribute - * options are not currently used in performance-critical paths (such as - * query execution), this seems OK. + * for that attribute. */ static void InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) @@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) HASH_SEQ_STATUS status; AttoptCacheEntry *attopt; - hash_seq_init(&status, AttoptCacheHash); + /* + * By convection, zero hash value is passed to the callback as a sign + * that it's time to invalidate the cache. See sinval.c, inval.c and + * InvalidateSystemCachesExtended(). + */ + if (hashvalue == 0) + hash_seq_init(&status, AttoptCacheHash); + else + hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue); + while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL) { if (attopt->opts) @@ -70,6 +77,18 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) } } +/* + * Hash function compatible with two-arg system cache hash function. + */ +static uint32 +relatt_cache_syshash(const void *key, Size keysize) +{ + const AttoptCacheKey* ckey = key; + + Assert(keysize == sizeof(*ckey)); + return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum); +} + /* * InitializeAttoptCache * Initialize the attribute options cache. @@ -82,9 +101,17 @@ InitializeAttoptCache(void) /* Initialize the hash table. */ ctl.keysize = sizeof(AttoptCacheKey); ctl.entrysize = sizeof(AttoptCacheEntry); + + /* + * AttoptCacheEntry takes hash value from the system cache. For + * AttoptCacheHash we use the same hash in order to speedup
Re: type cache cleanup improvements
On Tue, Mar 12, 2024 at 06:55:41PM +0300, Teodor Sigaev wrote: > Playing around I found one more place which could easily modified with > hash_seq_init_with_hash_value() call. I think that this patch should be split for clarity, as there are a few things that are independently useful. I guess something like that: - Introduction of hash_initial_lookup(), that simplifies 3 places of dynahash.c where the same code is used. The routine should be inlined. - The split in hash_seq_search to force a different type of search is weird, complicating the dynahash interface by hiding what seems like a search mode. Rather than hasHashvalue that's hidden in the middle of HASH_SEQ_STATUS, could it be better to have an entirely different API for the search? That should be a patch on its own, as well. - The typcache changes. -- Michael signature.asc Description: PGP signature
Re: type cache cleanup improvements
Got it. Here is an updated patch where I added a corresponding comment. Thank you! Playing around I found one more place which could easily modified with hash_seq_init_with_hash_value() call. -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c index af978ccd4b1..28980620662 100644 --- a/src/backend/utils/cache/attoptcache.c +++ b/src/backend/utils/cache/attoptcache.c @@ -44,12 +44,10 @@ typedef struct /* * InvalidateAttoptCacheCallback - * Flush all cache entries when pg_attribute is updated. + * Flush cache entry (or entries) when pg_attribute is updated. * * When pg_attribute is updated, we must flush the cache entry at least - * for that attribute. Currently, we just flush them all. Since attribute - * options are not currently used in performance-critical paths (such as - * query execution), this seems OK. + * for that attribute. */ static void InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) @@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) HASH_SEQ_STATUS status; AttoptCacheEntry *attopt; - hash_seq_init(&status, AttoptCacheHash); + /* + * By convection, zero hash value is passed to the callback as a sign + * that it's time to invalidate the cache. See sinval.c, inval.c and + * InvalidateSystemCachesExtended(). + */ + if (hashvalue == 0) + hash_seq_init(&status, AttoptCacheHash); + else + hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue); + while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL) { if (attopt->opts) @@ -70,6 +77,17 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue) } } +/* + * Hash function compatible with two-arg system cache hash function. + */ +static uint32 +relatt_cache_syshash(const void *key, Size keysize) +{ + const AttoptCacheKey* ckey = key; + + return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum); +} + /* * InitializeAttoptCache * Initialize the attribute options cache. @@ -82,9 +100,17 @@ InitializeAttoptCache(void) /* Initialize the hash table. */ ctl.keysize = sizeof(AttoptCacheKey); ctl.entrysize = sizeof(AttoptCacheEntry); + + /* + * AttoptCacheEntry takes hash value from the system cache. For + * AttoptCacheHash we use the same hash in order to speedup search by hash + * value. This is used by hash_seq_init_with_hash_value(). + */ + ctl.hash = relatt_cache_syshash; + AttoptCacheHash = hash_create("Attopt cache", 256, &ctl, - HASH_ELEM | HASH_BLOBS); + HASH_ELEM | HASH_FUNCTION); /* Make sure we've initialized CacheMemoryContext. */ if (!CacheMemoryContext)
Re: type cache cleanup improvements
Hi, > Yep, exacly. One time from 2^32 we reset whole cache instead of one (or > several) > entry with hash value = 0. Got it. Here is an updated patch where I added a corresponding comment. Now the patch LGTM. I'm going to change its status to RfC unless anyone wants to review it too. -- Best regards, Aleksander Alekseev v4-0001-Improve-performance-of-type-cache-cleanup.patch Description: Binary data
Re: type cache cleanup improvements
Yep, exacly. One time from 2^32 we reset whole cache instead of one (or several) entry with hash value = 0. On 08.03.2024 18:35, Tom Lane wrote: Aleksander Alekseev writes: One thing that I couldn't immediately figure out is why 0 hash value is treated as a magic invalid value in TypeCacheTypCallback(): I've not read this patch, but IIRC in some places we have a convention that hash value zero is passed for an sinval reset event (that is, "flush all cache entries"). regards, tom lane -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: type cache cleanup improvements
Aleksander Alekseev writes: > One thing that I couldn't immediately figure out is why 0 hash value > is treated as a magic invalid value in TypeCacheTypCallback(): I've not read this patch, but IIRC in some places we have a convention that hash value zero is passed for an sinval reset event (that is, "flush all cache entries"). regards, tom lane
Re: type cache cleanup improvements
Hi, > > I would like to tweak the patch a little bit - change some comments, > > add some Asserts, etc. Don't you mind? > You are welcome! Thanks. PFA the updated patch with some tweaks by me. I added the commit message as well. One thing that I couldn't immediately figure out is why 0 hash value is treated as a magic invalid value in TypeCacheTypCallback(): ``` - hash_seq_init(&status, TypeCacheHash); + if (hashvalue == 0) + hash_seq_init(&status, TypeCacheHash); + else + hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue); ``` Is there anything that prevents the actual hash value from being zero? I don't think so, but maybe I missed something. If zero is indeed an invalid hash value I would like to reference the corresponding code. If zero is a correct hash value we should either change this by adding something like `if(!hash) hash++` or use an additional boolean argument here. -- Best regards, Aleksander Alekseev v3-0001-Improve-performance-of-type-cache-cleanup.patch Description: Binary data
Re: type cache cleanup improvements
I would like to tweak the patch a little bit - change some comments, add some Asserts, etc. Don't you mind? You are welcome! -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: type cache cleanup improvements
Hi, > Thank you for interesting in it! > > > These changes look very promising. Unfortunately the proposed patches > > conflict with each other regardless the order of applying: > > > > ``` > > error: patch failed: src/backend/utils/cache/typcache.c:356 > > error: src/backend/utils/cache/typcache.c: patch does not apply > > ``` > Try increase -F option of patch. > > Anyway, union of both patches in attachment Thanks for the quick update. I tested the patch on an Intel MacBook. A release build was used with my typical configuration, TWIMC see single-install-meson.sh [1]. The speedup I got on the provided benchmark is about 150 times. cfbot seems to be happy with the patch. I would like to tweak the patch a little bit - change some comments, add some Asserts, etc. Don't you mind? [1]: https://github.com/afiskon/pgscripts/ -- Best regards, Aleksander Alekseev
Re: type cache cleanup improvements
Hi! Thank you for interesting in it! These changes look very promising. Unfortunately the proposed patches conflict with each other regardless the order of applying: ``` error: patch failed: src/backend/utils/cache/typcache.c:356 error: src/backend/utils/cache/typcache.c: patch does not apply ``` Try increase -F option of patch. Anyway, union of both patches in attachment -- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c index 0d4d0b0a154..ccf798c440c 100644 --- a/src/backend/utils/cache/typcache.c +++ b/src/backend/utils/cache/typcache.c @@ -77,6 +77,15 @@ /* The main type cache hashtable searched by lookup_type_cache */ static HTAB *TypeCacheHash = NULL; +/* The map from relation's oid to its type oid */ +typedef struct mapRelTypeEntry +{ + Oid typrelid; + Oid type_id; +} mapRelTypeEntry; + +static HTAB *mapRelType = NULL; + /* List of type cache entries for domain types */ static TypeCacheEntry *firstDomainTypeEntry = NULL; @@ -330,6 +339,14 @@ static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc); static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc, uint32 typmod); +/* + * Hash function compatible with one-arg system cache hash function. + */ +static uint32 +type_cache_syshash(const void *key, Size keysize) +{ + return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid*)key)); +} /* * lookup_type_cache @@ -355,8 +372,21 @@ lookup_type_cache(Oid type_id, int flags) ctl.keysize = sizeof(Oid); ctl.entrysize = sizeof(TypeCacheEntry); + /* + * Hash function must be compatible to make possible work + * hash_seq_init_with_hash_value(). Hash value in TypeEntry is taken + * from system cache and we use the same (compatible) to use it's hash + * value to speedup search by hash value instead of scanning whole hash + */ + ctl.hash = type_cache_syshash; + TypeCacheHash = hash_create("Type information cache", 64, - &ctl, HASH_ELEM | HASH_BLOBS); + &ctl, HASH_ELEM | HASH_FUNCTION); + + ctl.keysize = sizeof(Oid); + ctl.entrysize = sizeof(mapRelTypeEntry); + mapRelType = hash_create("Map reloid to typeoid", 64, + &ctl, HASH_ELEM | HASH_BLOBS); /* Also set up callbacks for SI invalidations */ CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0); @@ -407,8 +437,7 @@ lookup_type_cache(Oid type_id, int flags) /* These fields can never change, by definition */ typentry->type_id = type_id; - typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID, - ObjectIdGetDatum(type_id)); + typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id); /* Keep this part in sync with the code below */ typentry->typlen = typtup->typlen; @@ -470,6 +499,24 @@ lookup_type_cache(Oid type_id, int flags) ReleaseSysCache(tp); } + /* + * Add record to map relation->type. We don't bother even of type became + * disconnected from relation, it seems to be impossible, but anyway, + * storing old data is safe, in a worstest case we will just do an extra + * cleanup cache entry. + */ + if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE) + { + mapRelTypeEntry *relentry; + + relentry = (mapRelTypeEntry*) hash_search(mapRelType, + &typentry->typrelid, + HASH_ENTER, NULL); + + relentry->typrelid = typentry->typrelid; + relentry->type_id = typentry->type_id; + } + /* * Look up opclasses if we haven't already and any dependent info is * requested. @@ -2264,6 +2311,37 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry) CurrentSession->shared_typmod_table = typmod_table; } +static void +invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry) +{ + /* Delete tupdesc if we have it */ + if (typentry->tupDesc != NULL) + { + /* + * Release our refcount, and free the tupdesc if none remain. + * (Can't use DecrTupleDescRefCount because this reference is + * not logged in current resource owner.) + */ + Assert(typentry->tupDesc->tdrefcount > 0); + if (--typentry->tupDesc->tdrefcount == 0) + FreeTupleDesc(typentry->tupDesc); + typentry->tupDesc = NULL; + + /* + * Also clear tupDesc_identifier, so that anything watching + * that will realize that the tupdesc has possibly changed. + * (Alternatively, we could specify that to detect possible + * tupdesc change, one must check for tupDesc != NULL as well + * as tupDesc_identifier being the same as what was previously + * seen. That seems error-prone.) + */ + typentry->tupDesc_identifier = 0; + } + + /* Reset equality/comparison/hashing validity information */ + typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS; +} + /* * TypeCacheRelCallback * Relcache inval callback function @@ -2273,63 +2351,42 @@ SharedRecordTypmodRegi
Re: type cache cleanup improvements
Hi Teodor, > I'd like to suggest two independent patches to improve performance of type > cache > cleanup. I found a case where type cache cleanup was a reason for low > performance. In short, customer makes 100 thousand temporary tables in one > transaction. > > 1 mapRelType.patch >It just adds a local map between relation and its type as it was suggested > in > comment above TypeCacheRelCallback(). Unfortunately, using syscache here was > impossible because this call back could be called outside transaction and it > makes impossible catalog lookups. > > 2 hash_seq_init_with_hash_value.patch > TypeCacheTypCallback() loop over type hash to find entry with given hash > value. Here there are two problems: 1) there isn't interface to dynahash to > search entry with given hash value and 2) hash value calculation algorithm is > differ from system cache. But coming hashvalue is came from system cache. > Patch > is addressed to both issues. It suggests hash_seq_init_with_hash_value() call > which inits hash sequential scan over the single bucket which could contain > entry with given hash value, and hash_seq_search() will iterate only over such > entries. Anf patch changes hash algorithm to match syscache. Actually, patch > makes small refactoring of dynahash, it makes common function hash_do_lookup() > which does initial lookup in hash. > > Some artificial performance test is in attachment, command to test is 'time > psql > < custom_types_and_array.sql', here I show only last rollback time and total > execution time: > 1) master 92d2ab7554f92b841ea71bcc72eaa8ab11aae662 > Time: 33353,288 ms (00:33,353) > psql < custom_types_and_array.sql 0,82s user 0,71s system 1% cpu 1:28,36 > total > > 2) mapRelType.patch > Time: 7455,581 ms (00:07,456) > psql < custom_types_and_array.sql 1,39s user 1,19s system 6% cpu 41,220 total > > 3) hash_seq_init_with_hash_value.patch > Time: 24975,886 ms (00:24,976) > psql < custom_types_and_array.sql 1,33s user 1,25s system 3% cpu 1:19,77 > total > > 4) both > Time: 89,446 ms > psql < custom_types_and_array.sql 0,72s user 0,52s system 10% cpu 12,137 > total These changes look very promising. Unfortunately the proposed patches conflict with each other regardless the order of applying: ``` error: patch failed: src/backend/utils/cache/typcache.c:356 error: src/backend/utils/cache/typcache.c: patch does not apply ``` So it's difficult to confirm case 4, not to mention the fact that we are unable to test the patches on cfbot. Could you please rebase the patches against the recent master branch (in any order) and submit the result of `git format-patch` ? -- Best regards, Aleksander Alekseev