Michael added a subscriber: hoo.
Michael added a comment.

  Please take everything I write here with a grain of salt; I'm really not well 
versed in databases. This is mainly me thinking aloud.
  
  Reading the section in the wiki about transaction scope was helpful for me to 
understand the MediaWiki-specific context: 
https://www.mediawiki.org/wiki/Database_transactions#Transaction_scope
  
  From reading the comments on the related Gerrit changes and linked Phabricator 
tasks, it seems to me that the current code expects not to run in a transaction 
at all, but rather to write directly to the database and then wait for 
replication after each batch.
  
  And `AddUsagesForPageJob` indeed does not seem to set the 
`JOB_NO_EXPLICIT_TRX_ROUND` flag.
  
  So, the things @aaron writes seem to be good avenues to explore.
  
  I might have another one:
  
  name=\Wikibase\Client\Usage\Sql\EntityUsageTable::addUsages
                $writeConnection = $this->getWriteConnection();
                foreach ( $batches as $rows ) {
                        $writeConnection->newInsertQueryBuilder()
                                ->insertInto( $this->tableName )
                                ->ignore()
                                ->rows( $rows )
                                ->caller( __METHOD__ )->execute();
                        $c += $writeConnection->affectedRows();
    
                        // Wait for all database replicas to be updated,
                        // but only for the affected client wiki.
                        $this->db->replication()->wait();
                }
  
  The waiting seems to happen //between// batches, but is it also important to 
happen after the //last// batch? Or, if there is only one batch, then must it 
also happen after that one batch?
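  If the wait really is only needed //between// batches, the loop could in 
principle skip the wait after the final (or only) batch. A minimal sketch of that 
idea, in Python rather than PHP, with `insert_batch` and `wait_for_replication` 
as purely hypothetical stand-ins for the real MediaWiki calls:

```python
# Hypothetical sketch: wait for replication only *between* batches,
# never after the last (or only) one. insert_batch() and
# wait_for_replication() are illustrative stand-ins, not real APIs.
def add_usages(batches, insert_batch, wait_for_replication):
    affected = 0
    for i, rows in enumerate(batches):
        if i > 0:
            # Let replicas catch up before starting the next batch.
            wait_for_replication()
        affected += insert_batch(rows)
    return affected
```

  With a single batch this would never wait at all, which is exactly the case I 
am wondering about above.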
  
  If there are many pages that have only a few usages, not enough to fill a 
whole batch, then that might lead to a lot of needless waiting in transactions. 
That might amount to more waiting than is caused by the actual multi-batch 
transactions?
  
  Also, I noticed @hoo in many of the changes touching this code.
  
  I still have to understand the implications of named locks better.
  The suggestion of checking which rows already exist also needs more thinking. 
I could see that making a difference if we have many pages with a lot of usages 
that rarely change, so pruning them might reduce the number of batches.
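
  To make that pruning idea a bit more concrete, a rough sketch (Python, with 
purely illustrative names; not the actual Wikibase implementation): filter out 
usages that already exist before splitting into batches, so fewer batches, and 
therefore fewer replication waits, are needed.

```python
# Hypothetical sketch of the "check which rows already exist" idea:
# drop usages that are already stored, then split the remainder
# into batches. Fewer batches means fewer replication waits.
def prune_and_batch(new_usages, existing_usages, batch_size):
    existing = set(existing_usages)
    remaining = [u for u in new_usages if u not in existing]
    return [remaining[i:i + batch_size]
            for i in range(0, len(remaining), batch_size)]
```

  In the extreme case where all usages already exist, this would produce zero 
batches and therefore no writes and no waiting at all.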

TASK DETAIL
  https://phabricator.wikimedia.org/T255706
