[Wikidata-bugs] [Maniphest] T286938: Create a plan for a final streaming updater rollout as source of truth for blazegraph instances

2021-09-13 Thread Gehel
Gehel closed this task as "Resolved".

TASK DETAIL
  https://phabricator.wikimedia.org/T286938

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: dcausse, Gehel
Cc: EBernhardson, RKemper, Aklapper, dcausse, Gehel, MPhamWMF, Zbyszko, 
Invadibot, maantietaja, CBogen, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org



2021-08-24 Thread dcausse
dcausse added a comment.


  In T286938#7302853, @EBernhardson wrote:
  
  > A couple thoughts, perhaps one will even be useful:
  >
  >> start import on wdqs1009 and wdqs2008 with --skolemize: best case 10 days
  >> (import from 2 machines to maximize the chances of success)
  >
  > I have some memory that we thought this could be sped up with skolemizing
  > in hadoop, which currently runs weekly and takes a few hours. How far are we
  > from being able to feed those outputs into blazegraph, and would we expect
  > much improvement? Or maybe the process is fragile enough that it's not worth
  > adding risks here.
  
  Indeed, munging on a single core will take around 20 hours IIRC (around 8% of
the import time) compared to 3 hours in hadoop. Unfortunately we don't have a
process to serialize the resulting hive table back to plain TTL files and ship
them to the target machine. I don't think anything there is complicated, but
these data-sharing/transfer tasks tend to be complex to put in place and
stabilize (this one does not have to be automated though).
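
  As a rough cross-check of the figures above (a sketch, not a measurement): if
20 hours of single-core munging is about 8% of the import, the whole import
comes to roughly 250 hours, which lines up with the "best case 10 days"
estimate. The names below are illustrative only, and the calculation ignores
the time needed to serialize the hive table to TTL and ship the files.

```python
# Figures quoted in this thread; everything else is derived from them.
MUNGE_HOURS_SINGLE_CORE = 20  # "around 20 hours IIRC"
MUNGE_SHARE_PERCENT = 8       # "around 8% of the import time"
MUNGE_HOURS_HADOOP = 3        # "3 hours in hadoop"


def import_estimate():
    """Derive total import time and the potential saving from the quoted figures."""
    total_hours = MUNGE_HOURS_SINGLE_CORE * 100 / MUNGE_SHARE_PERCENT
    saved_hours = MUNGE_HOURS_SINGLE_CORE - MUNGE_HOURS_HADOOP
    return {
        "total_hours": total_hours,
        "total_days": total_hours / 24,
        "saved_hours": saved_hours,
        "saved_percent": 100 * saved_hours / total_hours,
    }


estimate = import_estimate()
for name, value in estimate.items():
    print(f"{name}: {value:.1f}")
```

  Under these numbers, munging in hadoop would save about 17 hours, a bit under
7% of the import, before counting any transfer overhead.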
  
  >> start data-transfer + updater-consumer activation, wdqs2008 -> all codfw
  >> machines (EST: 2 to 3 days: 3h/machine*7)
  >>
  >> - Figure out if there is a way to optimize and parallelize this process
  >
  > With 7 machines, I guess we could cut it to 3 steps by also copying from
  > the machines we copied to in a previous step. That plausibly brings the
  > runtime down to a single day; the next step is live deployment, so mostly
  > it frees us up for testing the service Thurs/Fri before we go live. It
  > should mostly amount to starting the transfer from more machines each round.
  >
  > 1. a->b
  > 2. a->c, b->d
  > 3. a->e, b->f, c->g
  
  Makes sense, thanks!
  Given that some of these tasks will be launched manually, I guess it would
make sense to make these actions more concrete and write them down as you did,
but with real hostnames.
  
  >> except wdqs1010 that we could use as source for emergency rollback
  >
  > I worry about having only a single source for emergency rollback. If we
  > think we still need that option then keeping at least two copies would be
  > typical, but do we have enough machines to keep two back reasonably? Also it
  > might be worth figuring out how we can decide when the emergency rollback can
  > be decom'd, but then again we could wait until it's obvious that we can't go
  > back anymore.
  
  True, I think we can keep one additional machine in codfw from the internal 
cluster.
  I think blockers are likely to be detected while the spare DC is being 
migrated but it might be good to keep these two machines for a couple months.


2021-08-23 Thread EBernhardson
EBernhardson added a comment.


  A couple thoughts, perhaps one will even be useful:
  
  > start import on wdqs1009 and wdqs2008 with --skolemize: best case 10 days
  > (import from 2 machines to maximize the chances of success)
  
  I have some memory that we thought this could be sped up with skolemizing in
hadoop, which currently runs weekly and takes a few hours. How far are we from
being able to feed those outputs into blazegraph, and would we expect much
improvement? Or maybe the process is fragile enough that it's not worth adding
risks here.
  
  > start data-transfer + updater-consumer activation, wdqs2008 -> all codfw
  > machines (EST: 2 to 3 days: 3h/machine*7)
  >
  > - Figure out if there is a way to optimize and parallelize this process
  
  With 7 machines, I guess we could cut it to 3 steps by also copying from the
machines we copied to in a previous step. That plausibly brings the runtime
down to a single day; the next step is live deployment, so mostly it frees us
up for testing the service Thurs/Fri before we go live. It should mostly amount
to starting the transfer from more machines each round.
  
  1. a->b
  2. a->c, b->d
  3. a->e, b->f, c->g
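
  The three rounds above follow a doubling pattern: every host that already has
the data sends it to one host that does not. A minimal sketch that generates
such a schedule, using the placeholder names a..g from the list. The function
name is hypothetical, and the 3 hours per transfer is the plan's estimate,
assumed here to hold regardless of how many transfers run in parallel.

```python
HOURS_PER_TRANSFER = 3  # plan estimate ("3h/machine"); assumed constant per copy


def fanout_schedule(source, targets):
    """Return a list of rounds; each round is a list of (src, dst) pairs."""
    have = [source]          # hosts that already hold the data
    pending = list(targets)  # hosts still waiting for a copy
    rounds = []
    while pending:
        pairs = []
        # Each host that has the data feeds at most one new host per round.
        for src in list(have):
            if not pending:
                break
            pairs.append((src, pending.pop(0)))
        have.extend(dst for _, dst in pairs)
        rounds.append(pairs)
    return rounds


hosts = ["a", "b", "c", "d", "e", "f", "g"]  # placeholders, not real hostnames
rounds = fanout_schedule(hosts[0], hosts[1:])
for number, pairs in enumerate(rounds, start=1):
    print(f"{number}. " + ", ".join(f"{src}->{dst}" for src, dst in pairs))

parallel_hours = len(rounds) * HOURS_PER_TRANSFER
sequential_hours = len(hosts[1:]) * HOURS_PER_TRANSFER
print(f"parallel: {parallel_hours}h, sequential: {sequential_hours}h")
```

  With real hostnames substituted for the placeholders, the same function
would list which transfers to launch in each round. Under the constant-time
assumption, 3 rounds take about 9 hours of wall-clock time versus 18 hours for
six sequential copies.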
  
  > except wdqs1010 that we could use as source for emergency rollback
  
  I worry about having only a single source for emergency rollback. If we think 
we still need that option then keeping at least two copies would be typical, 
but do we have enough machines to keep two back reasonably?  Also it might be 
worth figuring out how we can decide when the emergency rollback can be 
decom'd, but then again we could wait until it's obvious that we can't go back 
anymore.


2021-08-04 Thread dcausse
dcausse moved this task from Ready for Development to Needs review on the 
Discovery-Search (Current work) board.
dcausse claimed this task.
dcausse added a comment.


  Suggested plan: 
https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/Streaming_Updater_Rollout_Plan

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/


2021-07-26 Thread MPhamWMF
MPhamWMF set the point value for this task to "3".


2021-07-26 Thread MPhamWMF
MPhamWMF triaged this task as "High" priority.


2021-07-19 Thread Zbyszko
Zbyszko created this task.
Zbyszko added projects: Wikidata-Query-Service, Wikidata, Discovery-Search 
(Current work).
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  As a streaming updater deployer, I want to have a plan for the production
rollout so that the rollout and the switch of the source of truth for
Blazegraph instances go smoothly.
  
  AC:
  
  - Plan for rollout (including Streaming Updater Producer, initial state
configuration and Streaming Updater Consumer) is written down.
  - Tickets for each stage are created.
