[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-07-07 Thread Manuel
Manuel closed this task as "Resolved".
Manuel moved this task from Incoming to Needs product sign-off on the Wikidata 
Analytics (Kanban) board.

TASK DETAIL
  https://phabricator.wikimedia.org/T334951

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6546/

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: AndrewTavis_WMDE, Manuel
Cc: odimitrijevic, ItamarWMDE, BTullis, GoranSMilovanovic, AndrewTavis_WMDE, 
Aklapper, Manuel, JAllemandou, lbowmaker, xcollazo, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, Akuckartz, Nandana, Lahi, Gq86, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-07-07 Thread Manuel
Manuel edited projects, added Wikidata Analytics (Kanban); removed Wikidata 
Analytics.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-07-04 Thread Manuel
Manuel merged a task: T332898: Wikidata Concepts Monitor ETL migration to 
Spark3.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-07-04 Thread Manuel
Manuel added a parent task: T332899: [EPIC] Migrate our selected R-based 
Wikidata products.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-07-04 Thread Manuel
Manuel added a comment.


  > With those changes there is no more blocker in migrating to the 
spark3-shuffler from this task :)
  
  \o/
  
  Thank you again for your super helpful support on this, @JAllemandou!



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-07-03 Thread JAllemandou
JAllemandou added a comment.


  We met this morning with @AndrewTavis_WMDE and @Manuel - Thank you folks for 
the great meeting.
  The detailed Meeting notes are here: 
https://docs.google.com/document/d/1REsolXnZf2KqApL0p-DE8X4eWXI_zxHgrCe3k1hcZnw
  
  From the job list in previous comment:
  
  - 4 don't run spark and are kept as-is: `WMDE_BannerImpressions`, 
`Wiktionary_CognateDashboard`, `2021_WMDE_Mitmachen_Bereich_2021_Campaign`, 
`WDCM_Sqoop_Clients`
  - 3 are stopped (crontab entry commented out): `Qurator_CuriousFacts`, 
`WDCM_EngineBiases`, `WD_PageviewsPerType`
  - 3 have been updated to run spark2 in fixed-resource mode, so they should 
normally not fail after the migration to the spark3-shuffler: 
`WD_UsageCoverage`, `WD_languagesLandscape`, `NewEditors_comprehensive_report`
  
  With those changes there is no more blocker in migrating to the 
spark3-shuffler from this task :)
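
  The "no more blocker" conclusion can be checked mechanically. This is a 
minimal sketch, assuming the set of Spark jobs is the one given in the job list 
of the previous comment; the variable and function names are made up for 
illustration and are not part of any WMDE or WMF tooling.

```python
# Job names are taken from the thread; the grouping mirrors the three bullets above.
KEPT_AS_IS = {
    "WMDE_BannerImpressions",
    "Wiktionary_CognateDashboard",
    "2021_WMDE_Mitmachen_Bereich_2021_Campaign",
    "WDCM_Sqoop_Clients",
}
STOPPED = {"Qurator_CuriousFacts", "WDCM_EngineBiases", "WD_PageviewsPerType"}
FIXED_RESOURCE = {
    "WD_UsageCoverage",
    "WD_languagesLandscape",
    "NewEditors_comprehensive_report",
}

# Jobs reported as running Spark in the earlier inventory comment.
SPARK_JOBS = {
    "WD_PageviewsPerType",
    "WD_UsageCoverage",
    "WD_languagesLandscape",
    "WDCM_EngineBiases",
    "Qurator_CuriousFacts",
    "NewEditors_comprehensive_report",
}


def remaining_blockers(spark_jobs, stopped, fixed_resource):
    """Return Spark jobs that are neither stopped nor pinned to fixed resources."""
    return spark_jobs - stopped - fixed_resource


if __name__ == "__main__":
    blockers = remaining_blockers(SPARK_JOBS, STOPPED, FIXED_RESOURCE)
    print("remaining blockers:", sorted(blockers))
```

  An empty result means every Spark job is either stopped or running in 
fixed-resource mode, which matches the conclusion stated above.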



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-30 Thread JArguello-WMF
JArguello-WMF edited projects, added Data Engineering and Event Platform Team; 
removed Data Pipelines.
JArguello-WMF moved this task from Data Eng Backlog to Radar on the Data 
Engineering and Event Platform Team board.

TASK DETAIL
  https://phabricator.wikimedia.org/T334951

WORKBOARD
  https://phabricator.wikimedia.org/project/board/6628/



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-30 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Hey @JAllemandou!
  
  Thanks for all your efforts to find these jobs! Really appreciate it. It has 
been a bit difficult to figure out what infrastructure we have and where it 
is. I'll update T340718 with the information you posted above.
  
  Thanks for booking the time! Accepted the meeting and also invited @Manuel :)
  
  Hope you have a nice weekend!



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-30 Thread JAllemandou
JAllemandou added a comment.


  Hi @AndrewTavis_WMDE,
  I've done some investigation, and here is what I have: Goran has 11 CRON jobs 
running from various hosts on our system (1 on `stat1004`, 2 on `stat1007`, 7 on 
`stat1008`).
  
  - `WDCM_Sqoop_Clients` runs on `stat1004` weekly - It doesn't run spark (but 
Sqoop)
  - `2021_WMDE_Mitmachen_Bereich_2021_Campaign` runs on `stat1007` daily - It 
doesn't run spark (but Hive)
  - `WD_PageviewsPerType` runs on `stat1007` daily but has been failing since 
February 17th - It runs a spark job
  - `WD_UsageCoverage` runs on `stat1008` daily - It runs a spark job
  - `WD_languagesLandscape` runs on `stat1008` monthly (30th of the month) - It 
runs a spark job
  - `Wiktionary_CognateDashboard` runs on `stat1008` daily - It doesn't run 
spark
  - `WDCM_EngineBiases` runs on `stat1008` weekly - It runs a spark job
  - `Qurator_CuriousFacts` runs on `stat1008` monthly (10th of the month) - It 
runs a spark job
  - `WMDE_BannerImpressions` runs on `stat1008` hourly - It doesn't run spark 
(but Hive)
  - `NewEditors_comprehensive_report` runs on `stat1008` daily - It runs a 
spark job
  
  We need to meet and talk about your usage of the data generated by those 
scripts, and see what you wish us to try to make work versus stop.
  I'm booking some time on your calendar next Monday :)
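
  The inventory above can be written down as data and tallied. This is a 
minimal sketch covering the ten jobs itemized in the list; the `CronJob` type 
and the `engine` labels are shorthand introduced here, not an existing schema.

```python
from collections import Counter
from typing import List, NamedTuple


class CronJob(NamedTuple):
    name: str
    host: str
    schedule: str
    engine: str  # "spark", "hive", "sqoop", or "other" (shorthand for this sketch)


# One entry per bullet in the list above.
JOBS: List[CronJob] = [
    CronJob("WDCM_Sqoop_Clients", "stat1004", "weekly", "sqoop"),
    CronJob("2021_WMDE_Mitmachen_Bereich_2021_Campaign", "stat1007", "daily", "hive"),
    CronJob("WD_PageviewsPerType", "stat1007", "daily", "spark"),
    CronJob("WD_UsageCoverage", "stat1008", "daily", "spark"),
    CronJob("WD_languagesLandscape", "stat1008", "monthly", "spark"),
    CronJob("Wiktionary_CognateDashboard", "stat1008", "daily", "other"),
    CronJob("WDCM_EngineBiases", "stat1008", "weekly", "spark"),
    CronJob("Qurator_CuriousFacts", "stat1008", "monthly", "spark"),
    CronJob("WMDE_BannerImpressions", "stat1008", "hourly", "hive"),
    CronJob("NewEditors_comprehensive_report", "stat1008", "daily", "spark"),
]


def jobs_per_host(jobs: List[CronJob]) -> Counter:
    """Count how many cron jobs run on each stat host."""
    return Counter(job.host for job in jobs)


def spark_job_names(jobs: List[CronJob]) -> List[str]:
    """Names of the jobs affected by the Spark2 deprecation."""
    return [job.name for job in jobs if job.engine == "spark"]


if __name__ == "__main__":
    print(dict(jobs_per_host(JOBS)))
    print(spark_job_names(JOBS))
```

  Filtering on `engine` gives the subset that needs a decision before Spark2 
is deprecated; the remaining jobs are unaffected by the shuffler change.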



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-29 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  > We can test that :)
  
  @JAllemandou, not sure what the tests entail, but feel free to look into it 
and please let us know what the results are. As long as the tests turn out ok 
and it's not too much of a bother, we're fine with going with option two and 
using fixed-resource for now :)



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-22 Thread JAllemandou
JAllemandou added a comment.


  In T334951#8952583, @AndrewTavis_WMDE wrote:
  
  > - If the answer to the above question of permanently losing some data 
that's being produced by Concepts Monitor and other WMDE jobs is no, then we're 
ok with option one above of stopping the job.
  
  Unfortunately, I am not knowledgeable at all about the data generated by the 
job, which prevents me from assessing whether it generates data that we would 
not be able to regenerate.
  Also, I have not been told about intermediary data stored on the cluster, 
which makes me think that all the data generated by the job is small enough to 
be saved for the reports only.
  But as stated before, those are uninformed ideas :(
  
  > - Aside from this we'd prefer option two of configuring it to use 
fixed-resource.
  
  We can test that :)



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-21 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Checking in with you all on this:
  
  The big question that @Manuel and I have is whether this process is 
currently generating time series data of particular value that would be lost 
and could not be recreated from the data lake. I have been able to get the 
pageviews per namespace time series dashboard up and running locally, but it 
has not updated since February. The other data produced by this process, which 
can be found here, seems to be just current values rather than time series 
(except for propertypairs, which we're investigating further). Do you know of 
any tables within the data lake that are updated by this process and would 
break if we stopped the job?
  
  A broader question: does anyone at WMF know of tables in the data lake that 
Wikimedia Deutschland created/maintained in the past that are still being 
updated? Would be great to be pointed towards them so we can investigate 
further :)
  
  Our thoughts on this:
  
  - If the answer to the above question of permanently losing some data that's 
being produced by Concepts Monitor and other WMDE jobs is no, then we're ok 
with option one above of stopping the job.
  - Aside from this we'd prefer option two of configuring it to use 
fixed-resource.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-19 Thread JAllemandou
JAllemandou added a comment.


  In T334951#8946790, @AndrewTavis_WMDE wrote:
  
  > I'll async with him now and see if we can come to a decision sooner than 
that, but you all will have the answer by Wednesday at the latest 
  
  Awesome, thank you :)



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-19 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  I'll async with him now and see if we can come to a decision sooner than 
that, but you all will have the answer by Wednesday at the latest 



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-19 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thanks for reaching out, @JAllemandou!
  
  We're making progress in getting local copies of the dashboards up and 
running. @Manuel and I will be discussing them on Wednesday and can get back to 
you all then :)
  
  Hope all had a nice weekend!



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-06-19 Thread JAllemandou
JAllemandou added a comment.


  Hi Folks - What is the status on this one?
  
  I'd like Data-Engineering to announce the deprecation of Spark2 for the end 
of this month, but not without knowing how we plan on tackling your job :)
  Here are the 2 possible solutions I can think of:
  
  - Stopping the job while it is revamped to spark3 (Knowing that the dashboard 
is broken, is it a possible solution?)
  - Configure the job not to use DynamicAllocation but to use fixed-resource, 
making the job work in spark2 despite spark2 being deprecated, but using more 
cluster resources than really needed
  - Postpone deprecating spark2 (if we could not do that, I'd be super happy :)
  
  Let me know your thoughts :)
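
  The second option amounts to a handful of Spark configuration overrides. 
This is a minimal sketch, not the configuration actually applied to the WMDE 
jobs: the property names are standard Spark settings, while the executor 
count, cores, memory, the `spark2-submit` launcher name and the script name 
`wdcm_etl.py` are illustrative assumptions.

```python
from typing import Dict, List


def fixed_resource_conf(executors: int, cores: int, memory: str) -> Dict[str, str]:
    """Spark properties that turn off dynamic allocation and pin executor resources."""
    return {
        "spark.dynamicAllocation.enabled": "false",
        "spark.executor.instances": str(executors),
        "spark.executor.cores": str(cores),
        "spark.executor.memory": memory,
    }


def submit_command(launcher: str, script: str, conf: Dict[str, str]) -> List[str]:
    """Build a spark-submit style argument list with one --conf flag per property."""
    cmd = [launcher, "--master", "yarn"]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # Hypothetical sizing; real values depend on the job and the cluster.
    conf = fixed_resource_conf(executors=8, cores=2, memory="4g")
    print(" ".join(submit_command("spark2-submit", "wdcm_etl.py", conf)))
```

  With dynamic allocation disabled, the executors are requested up front and 
held for the whole run, which is the "more cluster resources than really 
needed" trade-off mentioned in the second option.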



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  Thanks all for your willingness to help! We'll be in touch in June once we've 
had the initial meetings with @ItamarWMDE. Those are planned for the 13th and 
14th :)



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread xcollazo
xcollazo reassigned this task from xcollazo to AndrewTavis_WMDE.
xcollazo added a comment.


  (Switching ownership to reflect @Manuel's comment above.)



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread xcollazo
xcollazo added a comment.


  > I'm not sure what plans @xcollazo has for the migration either, so it could 
be that these parts of the process are migrated away from the stats servers to 
an airflow based pipeline.
  > Anyway, I'm happy to try to help if I can, but Xabriel almost certainly 
knows more about the plan than I do.
  
  When I first looked at this, I definitely thought that the code that runs on 
the stat servers could benefit from moving to Airflow, since we are 
discouraging folks from doing production work on stat machines. But I hesitated 
to put it as a target for this task because it is not strictly necessary to 
move this codebase to Spark3.
  
  Having said that, I am super happy to help on that front if folks are 
interested in getting this codebase set up for the long run.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread BTullis
BTullis added a comment.


  In T334951#8836832, @Manuel wrote:
  
  > @AndrewTavis_WMDE is our newly hired Data Analyst for Wikidata. The plan is 
that he will mainly work on this with support from @xcollazo (WMF), @ItamarWMDE 
(Staff Engineer for Wikidata), and me (Analytics Product Manager for Wikidata). 
We plan to first evaluate the situation of the WDCM in June. Ideally, we would 
start the migration only based on that evaluation.
  >
  > Would that fit your plans, or is there already a risk of losing data by 
then?
  
  Great. I don't believe that there is any risk of losing data by then.
  Spark2 will no longer be available when the Hadoop cluster is upgraded to 
Debian bullseye, but that's a little way off yet.
  
  It might also be relevant to look at those parts of the job that currently 
run on the stats servers. By the look of it that's stat1004 and stat1007.
  Both of these run Debian buster. We're just starting to bring in bullseye 
based stats servers, e.g. T336036: Bring stat1009 into service and T336040: 
Bring stat1010 into service with GPU from stat1005, at which point we will 
start on decommissioning some buster based stats servers and upgrading those 
that remain.
  
  I'm not sure what plans @xcollazo has for the migration either, so it could 
be that these parts of the process are migrated away from the stats servers to 
an airflow based pipeline.
  Anyway, I'm happy to try to help if I can, but Xabriel almost certainly knows 
more about the plan than I do.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread Manuel
Manuel added a comment.


  No excuse is needed whatsoever! Wikimedia is now responsible for the WDCM, 
and we will deal with this. There is no question about it. And I am grateful 
that we can still ask you questions in case we get lost. 



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment.


  @Manuel It is not that I am very much involved, but the professional 
situation with me is simply as it is: I can barely find any time besides the 
responsibilities that I carry. However, I will really make an effort in the end 
of May to clear up some space for our work in June. Objective constraints - 
that is all that I can offer as an excuse.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread Manuel
Manuel added a subscriber: ItamarWMDE.
Manuel added a comment.
Restricted Application added a project: User-ItamarWMDE.


  Hi @BTullis, thank you for your offer, we might take you up on that!
  
  The plan is as follows: As @AndrewTavis_WMDE (Data Analyst for Wikidata) 
mentioned already, he will mainly work on this with support from @xcollazo 
(WMF), @ItamarWMDE (Staff Engineer for Wikidata), and me (Analytics Product 
Manager for Wikidata). Our plan is to evaluate the situation of the WDCM in 
June. Ideally, we would start the migration only based on that evaluation.
  
  Would that fit your plans, or is there already a risk of losing data by then?
  
  @GoranSMilovanovic: I am very aware that you are only working on this as a 
volunteer. So, no worries, we will try to solve this as well as we can on our 
own, and only ask you about things where we are otherwise lost. And thank you 
for staying involved!



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread BTullis
BTullis added a comment.


  Thanks both for the input. Let me know if I can help at all.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.


  @Manuel can give a more up to date rundown of our plans for all this. I'll be 
working on the migration with him :)



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread GoranSMilovanovic
GoranSMilovanovic added a comment.


  @BTullis With all the good will I have to help with this, there's no chance 
I'll be able to help before June 2023.



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-05-09 Thread BTullis
BTullis added subscribers: GoranSMilovanovic, BTullis.
BTullis added a comment.


  Is @GoranSMilovanovic available to help steward the migration and test that 
the output is as expected?
  
  Is there any chance that we could get the https://wdcm.wmflabs.org/ site 
working again before we do the migration, so that we may more easily test 
results? Or is there not really any value in that for us?



[Wikidata-bugs] [Maniphest] T334951: Wikidata Concepts Monitor ETL Migration to Spark3

2023-04-26 Thread Maintenance_bot
Maintenance_bot added a project: Wikidata.
