Gehel added a comment.

Per T192963 (Store Kafka poller position data in the WDQS database), does this mean that right now Kafka doesn't actually store the latest timestamp or any position in the query service? How does this affect the dateModified triple used for lag detection?

Kafka does store the offset for each partition, and we do use it during an updater run, but we ignore it across updater restarts. This is done so that we can easily restart or replay updates from a time in the past without touching Kafka (I'm not entirely convinced, but this does have some merit). So the offsets stored in the WDQS database are only used at the start of the updater.
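
As a rough illustration of the start-up behaviour described above, here is a minimal Python sketch. It is not the updater's actual code: `resolve_start_offsets`, its parameters and the fallback to offset 0 are all hypothetical, and the explicit `replay_from` override is only an assumption about how a replay from a past position might be requested.

```python
from typing import Dict, Iterable, Optional


def resolve_start_offsets(
    partitions: Iterable[int],
    db_offsets: Dict[int, int],
    kafka_committed: Dict[int, int],
    replay_from: Optional[Dict[int, int]] = None,
) -> Dict[int, int]:
    """Pick the offset each partition should start from at updater start-up.

    db_offsets:      offsets previously stored in the WDQS database (hypothetical shape)
    kafka_committed: offsets Kafka itself remembers; deliberately ignored here,
                     mirroring "we ignore it across updater restarts"
    replay_from:     optional operator override for replaying from the past (assumption)
    """
    start: Dict[int, int] = {}
    for partition in partitions:
        if replay_from is not None and partition in replay_from:
            # An explicit replay position wins over anything stored.
            start[partition] = replay_from[partition]
        elif partition in db_offsets:
            # Normal restart: resume from the position stored in the database.
            start[partition] = db_offsets[partition]
        else:
            # Assumption: with no stored position, begin at the earliest offset.
            start[partition] = 0
    return start
```

In this sketch, `kafka_committed` never influences the result, so a restart can be pointed at an older position without changing anything on the Kafka side. Once the run has started, the consumer would simply keep advancing from these positions.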

The dateModified triple is a global check, so yes, if we have multiple updaters running in parallel, a single check might not catch the failure of only one of them. There are other things that could be checked in that case (like the batch progress), which the updater can publish.
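
To make the difference between the two kinds of check concrete, here is a small Python sketch. The function names and the per-updater progress map are hypothetical, and modelling the global check as "the newest timestamp written by any updater" is an assumption made for illustration, not a description of how the dateModified triple is actually maintained.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def global_lag(last_update_by_updater: Dict[str, datetime], now: datetime) -> timedelta:
    """Lag as a single global check would see it: only the newest update counts."""
    return now - max(last_update_by_updater.values())


def stalled_updaters(
    progress_by_updater: Dict[str, datetime],
    now: datetime,
    threshold: timedelta,
) -> List[str]:
    """Names of updaters whose own published progress is older than the threshold."""
    return sorted(
        name
        for name, last_progress in progress_by_updater.items()
        if now - last_progress > threshold
    )
```

With one updater that progressed a minute ago and another that stopped three hours ago, `global_lag` reports one minute, while `stalled_updaters` with a ten-minute threshold still names the stopped one.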


TASK DETAIL
https://phabricator.wikimedia.org/T192871

To: Gehel
Cc: Gehel, Addshore, Daniel_Mietchen, Smalyshev, Aklapper, RazShuty, LJ, Lahi, Gq86, Darkminds3113, Andrawaag, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, Avner, Jonas, FloNight, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Lydia_Pintscher, Mbch331