ori created this task.
ori added subscribers: ori, aude, hoo, daniel, JeroenDeDauw.
ori added projects: Performance-Team, Wikidata.
Herald added a subscriber: Aklapper.

TASK DESCRIPTION
  Looking at performance data for `load.php` requests, I notice that we spend 
quite a lot of time in `SitesModuleWorker::getSitesHash()`. So I would like to 
know what that method does.
  
  I grep for it, and find that it calls `SitesModuleWorker::getSites()`. 
`SitesModuleWorker::getSites()`, in turn, calls `SiteStore::getSites()`.
  
  But SiteStore is an interface, and I don't know which concrete implementation 
SitesModuleWorker is using, because the implementation is specified via a 
parameter to `SitesModuleWorker::__construct()`. So I need to look and see 
where SitesModuleWorker is instantiated.
  
  That takes me to `lib/includes/modules/SitesModule.php:29`, where I see that 
it is `SiteSQLStore::newInstance()`. OK.
  
  But where is SiteSQLStore? Oh, I see that it's in Core. OK, let's pull it up 
and see what its `SiteSQLStore::getSites()` method does.
  
  Looks like it doesn't have one! SiteSQLStore extends CachingSiteStore, so I 
guess the method is implemented there. OK, let's go look at 
`CachingSiteStore::getSites()`.
  
  OK, so CachingSiteStore tries to get the sites from its cache object, which 
is a BagOStuff. I don't know which, because, again, that is up to the caller of 
`CachingSiteStore::__construct()`. Which is, it turns out, 
`SiteSQLStore::newInstance()`. But `SiteSQLStore::newInstance()` itself takes a 
`$cache` parameter. Fortunately, I remember that the caller is back in 
`SitesModule.php:29`, and I see that it does not specify a value for `$cache`, 
which means that it must be defaulting to `CACHE_ACCEL` for HHVM.
  
  Now I know what happens on a cache hit. Progress. But what about a cache 
miss? In that case, `CachingSiteStore::getSites()` calls 
`$this->siteStore->getSites();`. But what is its `$siteStore`? That is up to 
the caller of `CachingSiteStore::__construct()`. So I go back to 
`SiteSQLStore::newInstance()`, and find out that it is passing in a 
`DBSiteStore` instance.
  
  OK, let's go to DBSiteStore. `DBSiteStore::getSites()` calls 
`$this->loadSites();`. That leads me to `DBSiteStore::loadSites()`. Here, 
finally, I see an actual database query.
  
  All in all, to figure out what `SitesModuleWorker::getSitesHash()` is doing, 
I had to examine eight methods on no fewer than **six classes**.
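  
  As a rough sketch of the shape traced above: a worker that only sees an interface, a caching layer that wraps another store, and a database-backed store at the bottom. The classes below are simplified stand-ins written for illustration, not the actual MediaWiki or Wikibase code. The names `SiteLookup`, `FakeDbStore`, `FakeCachingStore` and `FakeWorker` are hypothetical, the site list is made up, and hashing with `md5()` is an assumption about what a "sites hash" might be.

```php
<?php
// Simplified stand-ins for the roles traced above; NOT the real classes.

interface SiteLookup {
    public function getSites(): array;
}

// Plays the role of DBSiteStore: the layer that actually runs a query.
class FakeDbStore implements SiteLookup {
    public int $queries = 0;

    public function getSites(): array {
        $this->queries++;              // stands in for the database query
        return [ 'enwiki', 'dewiki' ]; // illustrative data only
    }
}

// Plays the role of CachingSiteStore: checks a cache, then falls back
// to whatever inner store its caller handed it.
class FakeCachingStore implements SiteLookup {
    private SiteLookup $inner;
    private ?array $cached = null;     // stands in for the BagOStuff cache

    public function __construct( SiteLookup $inner ) {
        $this->inner = $inner;
    }

    public function getSites(): array {
        if ( $this->cached === null ) { // cache miss: ask the inner store
            $this->cached = $this->inner->getSites();
        }
        return $this->cached;           // cache hit on later calls
    }
}

// Plays the role of SitesModuleWorker: it only knows the interface.
class FakeWorker {
    private SiteLookup $store;

    public function __construct( SiteLookup $store ) {
        $this->store = $store;
    }

    public function getSitesHash(): string {
        // Assumption: the hash is derived from the full site list.
        return md5( implode( '|', $this->store->getSites() ) );
    }
}

$db = new FakeDbStore();
$worker = new FakeWorker( new FakeCachingStore( $db ) );
$first = $worker->getSitesHash();
$second = $worker->getSitesHash();
```

  In this sketch the second call is served from the in-memory cache, so the stand-in query runs only once. Nothing in `FakeWorker` reveals that, which is the difficulty described above: the behavior is decided wherever the objects are wired together.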
  
  I still don't know why it is slow. Cache hits and misses are not 
instrumented, and I haven't analyzed the query to see how the database executes 
it. After half an hour of opening and closing files, I am too exasperated to do 
that work.
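  
  For illustration, instrumenting hits and misses could look roughly like the following. This is a minimal sketch under stated assumptions: `fetchWithStats()` is a hypothetical helper, a plain PHP array stands in for the real cache backend, and the closure stands in for the database query. It does not use the BagOStuff API.

```php
<?php
// Hypothetical helper: wraps a cache lookup, counts hits and misses,
// and records how long the fallback loader takes on a miss.
function fetchWithStats( array &$cache, string $key, callable $loader, array &$stats ) {
    if ( array_key_exists( $key, $cache ) ) {
        $stats['hit']++;
        return $cache[$key];
    }

    $stats['miss']++;
    $start = microtime( true );
    $cache[$key] = $loader();          // the expensive path
    $stats['loadSeconds'] += microtime( true ) - $start;

    return $cache[$key];
}

$cache = [];
$stats = [ 'hit' => 0, 'miss' => 0, 'loadSeconds' => 0.0 ];

// Stand-in for the database query; the data is illustrative only.
$loader = function () {
    return [ 'enwiki', 'dewiki' ];
};

fetchWithStats( $cache, 'sites', $loader, $stats ); // miss
fetchWithStats( $cache, 'sites', $loader, $stats ); // hit
```

  With counters like these, the ratio of hits to misses and the time spent on the miss path would show which side is responsible for the slowness.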
  
  I have repeatedly suggested using a static array in `wmf-config` for this 
data, and at one point I believe this may have actually been implemented. Why 
are we not doing that?
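  
  A sketch of what the static-array approach might look like, purely for illustration. The variable name `$wgExampleStaticSites`, the function `exampleSitesHash()` and the two entries are hypothetical. They are not existing `wmf-config` settings or real site records.

```php
<?php
// Hypothetical configuration variable and entries, for illustration only.
$wgExampleStaticSites = [
    'enwiki' => [ 'group' => 'wikipedia', 'language' => 'en' ],
    'dewiki' => [ 'group' => 'wikipedia', 'language' => 'de' ],
];

// With static data the hash is a pure function of the array:
// no cache lookup and no database query are involved.
function exampleSitesHash( array $sites ): string {
    ksort( $sites ); // make the hash independent of declaration order
    return sha1( json_encode( $sites ) );
}

$hash = exampleSitesHash( $wgExampleStaticSites );
```

  Under these assumptions the whole lookup is one array read, and the path from caller to data fits in a single file.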

TASK DETAIL
  https://phabricator.wikimedia.org/T113665

To: ori
Cc: JeroenDeDauw, daniel, hoo, aude, ori, Aklapper, Wikidata-bugs


