Addshore added a comment.

  **Reproduction case:**
  
  - Come up with some easily greppable string, such as "IMGREPABLE87654"
  - Create a new item on testwikidatawiki with this string as the label
  - Create a new page on testcommonswiki with this string as the title
  - Close the page on testcommonswiki and do not re-open it or query the API about this page at all
  - Add a sitelink to the item, pointing to the page you created on testcommonswiki
  - Wait and do not touch anything for some time.
  - You have reproduced the issue
  
  **Reproduction verification:**
  
  - You can see that a `wb_changes` entry was made for the item using something like `select * from wb_changes where change_object_id = "Q215220";`
  - You can assume that the dispatch process did look at this `wb_changes` entry (no code snippet provided, but I tested this separately)
  - You can see that no `ChangeNotificationJob` was scheduled for the entry in 
`wb_changes` using something like `kafkacat -C -b kafka-main1001.eqiad.wmnet -p 
0 -t 'eqiad.mediawiki.job.ChangeNotification' -o -10000 | grep test | grep 
547009` where the last grep is the ID of the `wb_changes` row.
  - As this job was not scheduled, you would also not expect to see the HTML cache update job or the links update job.
  - You can confirm that an HTML cache update job has not run, as the page still has a parser cache entry: `(MediaWiki\MediaWikiServices::getInstance()->getParserCache())->get(Title::newFromText($t)->toPageRecord(),ParserOptions::newCanonical())->getCacheTime();` where $t is the page title.
  - You can also confirm that nothing is in the page_props table using `PageProps::getInstance()->getProperties([Title::newFromText($t)],'wikibase_item');`
  - You can also confirm that the page on testcommonswiki is NOT subscribed to 
the Item `select * from wb_changes_subscription where cs_subscriber_id = 
"testcommonswiki" AND cs_entity_id = "Q215220";`
  
  **The issue**
  The dispatch process relies on clients being subscribed to entities in order 
for the `ChangeNotificationJobs` to be sent.
  This can be seen in 
https://github.com/wikimedia/Wikibase/blob/master/repo/includes/ChangeDispatcher.php#L400
 
  Which ultimately ends up querying this `wb_changes_subscription` table:
  
https://github.com/wikimedia/Wikibase/blob/33c8fb8a6fd09d15cc7a3c47bdc6c60e398ce0e2/client/includes/Usage/Sql/SqlSubscriptionManager.php#L100
  This table is ONLY written to by a job on the client, on parser output save:
  
https://github.com/wikimedia/Wikibase/blob/33c8fb8a6fd09d15cc7a3c47bdc6c60e398ce0e2/client/includes/Hooks/DataUpdateHookHandler.php#L195
  Thus the dispatch process will not happen for the client page if both of the following are true:
  
  - The page is not already subscribed for some reason
  - The page does not get its parser output generated for some reason between the wikidata edit and the dispatch process processing the change
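  To make the failure mode concrete, here is a toy Python model of the subscription-gated dispatch described above. This is not the actual Wikibase code; the names and data shapes are invented for illustration:

```python
# Simulated wb_changes_subscription table: entity ID -> set of subscribed
# client wikis. Our new item never got a subscription row for the client.
subscriptions = {
    "Q215220": set(),
    "Q42": {"testcommonswiki"},
}

def dispatch(change):
    """Return the client wikis that would get a ChangeNotificationJob."""
    return sorted(subscriptions.get(change["entity_id"], set()))

# No subscribers means no job is ever scheduled, so the HTML cache update
# and links update jobs downstream of it never run either.
assert dispatch({"entity_id": "Q215220"}) == []
assert dispatch({"entity_id": "Q42"}) == ["testcommonswiki"]
```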
  
  **Making the process work with one more edit:**
  
  - It's now time to load the commons page again and purge it with `action=purge`
  - This will trigger the job adding the subscription, which can be seen with 
something like `select * from wb_changes_subscription where cs_subscriber_id = 
"testcommonswiki" AND cs_entity_id = "Q215220";`
  - We can now make another edit to the item (adding a description for example) 
https://test.wikidata.org/w/index.php?title=Q215220&diff=540854&oldid=540853
  - This should make a new change whose job we can then grep for, with something like `kafkacat -C -b kafka-main1001.eqiad.wmnet -p 0 -t 'eqiad.mediawiki.job.ChangeNotification' -o -100000 | grep 547010`
  - This will in turn schedule the two jobs we expect (the HTML cache update and the links update)
  - We can see that these have happened, as the parser cache timestamp will be bumped again and the page_props table will be updated shortly after (a few minutes at most)
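  The unbreak-it flow above can be sketched as a toy Python model (again with invented names, not Wikibase's actual API): the purge writes the subscription row, and only the edit made after that produces a job:

```python
subscriptions = {}   # entity ID -> set of subscribed client wikis
scheduled_jobs = []  # (wiki, job type, change ID)

def purge_client_page(wiki, entity_id):
    # Regenerating parser output triggers the client-side job that
    # writes the wb_changes_subscription row.
    subscriptions.setdefault(entity_id, set()).add(wiki)

def edit_entity(entity_id, change_id):
    # Dispatch only reaches wikis already subscribed to the entity.
    for wiki in sorted(subscriptions.get(entity_id, set())):
        scheduled_jobs.append((wiki, "ChangeNotification", change_id))

edit_entity("Q215220", 547009)                   # before purge: dropped
purge_client_page("testcommonswiki", "Q215220")  # action=purge
edit_entity("Q215220", 547010)                   # after purge: job scheduled

assert scheduled_jobs == [("testcommonswiki", "ChangeNotification", 547010)]
```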
  
  **Suspicion**
  This code has existed in this way more or less forever.
  At some point Wikibase used to fully purge the pages on the client, maybe even in a slightly different place in the code? At some point this changed, and this flow and its ordering assumptions were broken somehow?
  
  Will wait until next week for @Jakob_WMDE and/or @Michael to confirm this reproduction; then I think we can close the investigation and come up with a plan of attack.

TASK DETAIL
  https://phabricator.wikimedia.org/T280627
