Addshore added a comment.
I dug into some of this code a little bit more and put some findings below. I believe that rather than use and abuse this entity info thing, we should create our own new things specifically for prefetching referenced entities if needed. EntityInfo as a concept is not currently intentionally used and should be removed. **What is EntityInfo, why does it exist and is it still used?** - EntityInfo was introduced back in the day to "prefetch" entity info needed for rendering a page, this includes terms needed for rendering a page - The EntityInfo builder currently uses the terms storage tables, and during the rollout of wb_terms towards the end another layer of caching needed to be added in front of it https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/546659/ - When the "formattercache" was introduced in https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/441010/30/lib/includes/Formatters/WikibaseValueFormatterBuilders.php the use of `labelDescriptionLookupFactory->getLabelDescriptionLookup` was removed, which would have use EntityInfo based term lookups for formatting entity ID values. The new behaviour instead introduced a caching lookup that always used a EntityRetrievingTermLookup - Despite not using the entityinfo, it hasn't since been cleaned up / removed (I seem to remember there being a ticket relating to this? but I can't find it) As a result of the above, one of the first things that happens when rendering an entity is getting the entity info, which is then not used at all during rendering, as this is all now done via the formatted cache things. **Some other interesting observations (hinting toward removing entity info):** - `FormatterLabelDescriptionLookupFactory::OPT_LABEL_DESCRIPTION_LOOKUP` is likely fairly pointless now, as this option is not looked at (see above) when creating the formatted to be used for most things. However:::: - `WikibaseValueFormatterBuilders::newItemPropertyIdHtmlLinkFormatter` uses the formatted cache, and creates its own lookup - Other methods in this class still call `labelDescriptionLookupFactory->getLabelDescriptionLookup` which includes a check for OPT_LABEL_DESCRIPTION_LOOKUP These include: - newPlainEntityIdFormatter (seems it might need terms? it is a plain formatter that is used in newEntityIdFormatter (see next line) - newEntityIdFormatter - getVocabularyUriFormatter - However adding breakpoints locally to the EntityInfoTermLookup class methods didn't result in being hit - Adding a breakpoint to the OPT_LABEL_DESCRIPTION_LOOKUP case in `getLabelDescriptionLookup` resulted in hits for: - getVocabularyUriFormatter (so this code path still gets the terminfo lookup specified) - newLabelsProviderEntityIdHtmlLinkFormatter (we just touched / are touching this in fed prop) (it is called by newEntityIdFormatter) - All calls to this should be able to use the formatter cache and we remove EntityInfo **Some bookmarks of the code areas I was looking at throughout this** F31841798: image.png <https://phabricator.wikimedia.org/F31841798> **Impact of removal** The current rate of requests to this EntityInfo builder stuff: https://grafana.wikimedia.org/d/u5wAugyik/wikibase-statsdrecordingsimplecache?panelId=2&fullscreen&orgId=1&from=now-1h&to=now The rate of requests for EntityInfo is essentially the same as the # of parser output generations for wikibase data. Looking at the cache metrics In a 12 hour period this is ~120million calls, so 10 million an hour, or 166k per min or ~2700 calls per second The cache hits come from local server cache, but the misses result in lookups from the terms storage db tables. We also track the db calls for this code path that may not be needed on https://grafana.wikimedia.org/d/000000548/wikibase-sql-term-storage?orgId=1&refresh=30s You can see these in the top panel for the "select.EntityInfoBuilder_collectTermsForEntities" metric. Averaging 4.7k ops a second/minute (TBA confirm this??) Needless to say that not doing all of this stuff that is not needed should speed things up and reduce memory consumption. It will also remove an old outdated concept that should have been killed some while ago. TASK DETAIL https://phabricator.wikimedia.org/T253125 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Addshore Cc: RhinosF1, Aklapper, Addshore, Jimfhahn, darthmon_wmde, Nandana, lucamauri, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331
_______________________________________________ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs