[Wikidata-bugs] [Maniphest] T255706: [SW] [WB-Client] [TECH] Wikibase\Client\Usage\Sql\EntityUsageTable::addUsages Deadlock

2023-10-06 Thread Michael
Michael added a subscriber: hoo.
Michael added a comment.


  Please take everything I write here with a grain of salt, I'm really not firm 
with databases. This is mainly me thinking aloud.
  
  Reading the section in the wiki about transaction scope was helpful for me to 
understand the mediawiki specific context: 
https://www.mediawiki.org/wiki/Database_transactions#Transaction_scope
  
  From reading the comments on the related changes on Gerrit and linked 
Phabricator tasks, it seems to me that the current code expects to not run in a 
transaction at all and to directly write to the database and then wait for 
replication after each batch.
  
  And `AddUsagesForPageJob` seems to indeed not set the 
`JOB_NO_EXPLICIT_TRX_ROUND` flag.
  
  So, what @aaron writes seems to be good avenues to explore.
  
  I might have another one:
  
  name=\Wikibase\Client\Usage\Sql\EntityUsageTable::addUsages
$writeConnection = $this->getWriteConnection();
foreach ( $batches as $rows ) {
$writeConnection->newInsertQueryBuilder()
->insertInto( $this->tableName )
->ignore()
->rows( $rows )
->caller( __METHOD__ )->execute();
$c += $writeConnection->affectedRows();

// Wait for all database replicas to be updated, but 
only for the affected client wiki.
$this->db->replication()->wait();
}
  
  The waiting seems to happen //between// batches, but is it also important to 
happen after the //last// batch? Or, if there is only one batch, then must it 
also happen after that one batch?
  
  If there are many pages that have only a few usages that do not fill a whole 
badge, then that might lead to a lot of needless waiting in transactions. That 
might amount to more than the actual transactions with multiple batches?
  
  Also, I noticed @hoo in many of the changes touching this code.
  
  I still have to understand the implications of named locks better.
  Also for the suggestion of checking what rows already exist needs more 
thinking. I could see that making a difference if we have many pages with a lot 
of usages that rarely change, and so pruning them might reduce the number of 
batches.

TASK DETAIL
  https://phabricator.wikimedia.org/T255706

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Michael
Cc: hoo, Lucas_Werkmeister_WMDE, ItamarWMDE, Ladsgroup, Krinkle, eprodromou, 
aaron, Michael, Aklapper, thcipriani, Danny_Benjafield_WMDE, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Akuckartz, darthmon_wmde, Rosalie_WMDE, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Verdy_p, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331, 
Jay8g
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T255706: [SW] [WB-Client] [TECH] Wikibase\Client\Usage\Sql\EntityUsageTable::addUsages Deadlock

2023-10-06 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.


  In T255706#9200496 , 
@aaron wrote:
  
  > I noticed that addUsages() uses `$this->db->replication()->wait();` which 
uses `LBFactory::waitForReplication()`. Doesn't that mean it's waiting for 
replicas without committing each batch? It seems like it would just hold more 
and more locks while waiting for unrelated events to replicate.
  
  Hm, that sounds like a good point, but I’m skeptical – could we really have 
missed this when the waiting was introduced and in the five years since then? 
(Though admittedly this task is also three years old already.) Shouldn’t 
waiting for replication in an uncommitted transaction cause visible warnings or 
something?
  
  The waiting was added, with explicit `startAtomic()`+`endAtomic()`, in Make 
batches in EntityUsageTable::addUsages atomic 
, to 
address T192349: deadlocks on INSERT IGNORE INTO wbc_entity_usage 
 – and even though AFAICT jobs 
already ran with an outer transaction at the time, and usages were also already 
added via jobs at the time, this apparently still helped with the deadlocks, 
somehow. (Not long after, EntityUsageTable: Remove useless atomic section 
 
removed the `startAtomic()`+`endAtomic()` calls.)
  
  There’s also this:
  
  name=SqlUsageTracker::addUsedEntities()
// NOTE: while logically we'd like the below to be atomic, we 
don't wrap it in a
// transaction to prevent long lock retention during big 
updates.
$db = $this->connectionManager->getWriteConnection();
$usageTable = $this->newUsageTable( $db );
  
  This sounds a lot like the code is expecting to run without a surrounding 
transaction…
  
  AFAICT, “`Job::run()` has outer transaction scope” currently means “it is 
called after a transaction has been started (unless the 
`JOB_NO_EXPLICIT_TRX_ROUND` flag is set), but the transaction is ‘registered’ 
to the job’s `__METHOD__`, so the job is free to commit this transaction or 
start other ones as much as it wants” – and I wonder if this was at some point 
mistaken for “the job has no open transaction at all unless it starts one”? 
(That’s what I first assumed “outer transaction scope” meant, at least.)

TASK DETAIL
  https://phabricator.wikimedia.org/T255706

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Lucas_Werkmeister_WMDE
Cc: Lucas_Werkmeister_WMDE, ItamarWMDE, Ladsgroup, Krinkle, eprodromou, aaron, 
Michael, Aklapper, thcipriani, Danny_Benjafield_WMDE, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Akuckartz, darthmon_wmde, Rosalie_WMDE, 
Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Verdy_p, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331, 
Jay8g
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T255706: [SW] [WB-Client] [TECH] Wikibase\Client\Usage\Sql\EntityUsageTable::addUsages Deadlock

2023-09-26 Thread aaron
aaron added a comment.


  I noticed that addUsages() uses 
$this->db->replication()->wait(); which uses 
LBFactory::waitForReplication(). Doesn't that mean it's waiting 
for replicas without committing each batch? It seems like it would just hold 
more and more locks while waiting for unrelated events to replicate. Maybe 
AddUsagesForPageJob could use self::JOB_NO_EXPLICIT_TRX_ROUND (like 
RefreshLinksJob) and LBFactory::commitAndWaitForReplication() 
could be used instead. Perhaps addUsagesForPage() could wrap the update in a 
named lock based on the page ID, similar to 
LinksUpdate::acquirePageLock. Maybe addUsedEntities() could also 
do a SELECT first to see which rows are already there.
  
  I see the job spec for AddUsagesForPageJob has "removeDuplicates" but the 
parameters includes "usages". Maybe successive edits result in slight 
differences in params that bypass de-duplication?

TASK DETAIL
  https://phabricator.wikimedia.org/T255706

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: aaron
Cc: ItamarWMDE, Ladsgroup, Krinkle, eprodromou, aaron, Michael, Aklapper, 
thcipriani, Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, Akuckartz, darthmon_wmde, Rosalie_WMDE, Nandana, Lahi, Gq86, 
GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Verdy_p, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331, Jay8g
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T255706: [SW] [WB-Client] [TECH] Wikibase\Client\Usage\Sql\EntityUsageTable::addUsages Deadlock

2023-09-26 Thread ItamarWMDE
ItamarWMDE renamed this task from "[WB-Client] [TECH] 
Wikibase\Client\Usage\Sql\EntityUsageTable::addUsages Deadlock" to "[SW] 
[WB-Client] [TECH] Wikibase\Client\Usage\Sql\EntityUsageTable::addUsages 
Deadlock".

TASK DETAIL
  https://phabricator.wikimedia.org/T255706

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ItamarWMDE
Cc: ItamarWMDE, brennen, Ladsgroup, Krinkle, eprodromou, aaron, Michael, 
Aklapper, thcipriani, Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, 
Invadibot, maantietaja, Akuckartz, darthmon_wmde, Rosalie_WMDE, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Verdy_p, Wikidata-bugs, aude, Jdforrester-WMF, Mbch331, Jay8g
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org