[Wikidata-bugs] [Maniphest] T281267: various weekly and daily dumps run from systemd timers are broken

2023-06-21 Thread ArielGlenn
ArielGlenn added a comment.


  @fgiunchedi I notice that in some cases phab tasks are autocreated when 
systemd units fail. Is that true for systemd jobs on snapshot hosts? Could we 
get tagged on those (Dumps-Generation) or could we get emails from those 
(ops-dumps@wm.o)?

TASK DETAIL
  https://phabricator.wikimedia.org/T281267

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Addshore, Tonina_Zhelyazkova_WMDE, WMDE-leszek, JAllemandou, fgiunchedi, 
jbond, hoo, dcausse, ArielGlenn, Protsack.stephan, Busfault, Astuthiodit_1, 
Atieno, karapayneWMDE, joanna_borun, Invadibot, Devnull, maantietaja, lmata, 
Muchiri124, jannee_e, ItamarWMDE, Akuckartz, holger.knust, Legado_Shulgin, 
ReaperDawn, Nandana, Davinaclare77, Techguru.pc, Lahi, Gq86, herron, 
GoranSMilovanovic, Chicocvenancio, Lunewa, Hfbn0, QZanden, LawExplorer, Zppix, 
Volans, _jensen, rosalieper, Scott_WUaS, Wong128hk, gnosygnu, Wikidata-bugs, 
aude, faidon, Mbch331, Jay8g, Hokwelum
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T68108: [Epic] Store media information for files on Wikimedia Commons as structured data

2023-06-21 Thread ArielGlenn
ArielGlenn closed subtask T226093: Capacity planning for Commons Structured 
Data as Resolved.

TASK DETAIL
  https://phabricator.wikimedia.org/T68108

To: ArielGlenn
Cc: Mholloway, Ladsgroup, MarkTraceur, WMDE-leszek, jcrespo, Marostegui, 
AfroThundr3007730, Stashbot, _jensen, SandraF_WMF, Ramsey-WMF, CCicalese_WMF, 
PokestarFan, Saerdnaer, Juandev, Wesalius, Zppix, NMaia, Mattias_Ostmar-WMSE, 
Sadads, Poyekhali, -jem-, Deskana, Tfinc, Smalyshev, Jheald, LikeLifer, Yann, 
intracer, Spinster, Orofarne, Filceolaire, MZMcBride, bzimport, TheDJ, 
zhuyifei1999, DixonD, Bugreporter, RP88, Aklapper, Matanya, waldyrious, 
El_Grafo, Daniel_Mietchen, Jdforrester-WMF, GPHemsley, Bene, Legoktm, Nemo_bis, 
Lokal_Profil, Tobi_WMDE_SW, He7d3r, Petrb, jayvdb, Kelson, Steinsplitter, 
JeroenDeDauw, iecetcwcpggwqpgciazwvzpfjpwomjxn, revi, JanZerebecki, JeanFred, 
Ricordisamoa, Snowolf, Keegan, Rillke, Bawolff, Fabrice_Florin, Multichill, 
Liuxinyu970226, Ainali, Tgr, Lydia_Pintscher, jeremyb, Stryn, Ltrlg, daniel, 
Dereckson, JohnLewis, Udehb-WMF, Astuthiodit_1, BeautifulBold, Suran38, 
karapayneWMDE, Invadibot, GFontenelle_WMF, maantietaja, Y.ssk, FRomeo_WMF, 
Zblace, Peteosx1x, Muchiri124, NavinRizwi, CBogen, ItamarWMDE, Nintendofan885, 
Akuckartz, Nandana, JKSTNK, Lahi, Gq86, E1presidente, Cparle, 
GoranSMilovanovic, QZanden, Tramullas, Acer, V4switch, LawExplorer, Salgo60, 
Silverfish, rosalieper, Taiwania_Justo, Scott_WUaS, Susannaanas, Ixocactus, 
Wong128hk, Fuzheado, Jane023, Wikidata-bugs, Base, matthiasmullie, aude, 
Dinoguy1000, Raymond, Mbch331


[Wikidata-bugs] [Maniphest] T226093: Capacity planning for Commons Structured Data

2023-06-21 Thread ArielGlenn
ArielGlenn closed this task as "Resolved".
ArielGlenn claimed this task.
ArielGlenn added a comment.


  There's no point in having this open for a once a year check in, so I'll go 
ahead and close it. When capacity planning needs to be done for dbs in the 
regular course of things, this can be discussed.

TASK DETAIL
  https://phabricator.wikimedia.org/T226093

To: ArielGlenn
Cc: LSobanski, Nintendofan885, Ladsgroup, Abit, matthiasmullie, Marostegui, 
Addshore, Ramsey-WMF, jcrespo, Yann, MarkTraceur, ArielGlenn, Aklapper, 
Busfault, Astuthiodit_1, Atieno, karapayneWMDE, Invadibot, GFontenelle_WMF, 
maantietaja, FRomeo_WMF, jannee_e, CBogen, ItamarWMDE, Akuckartz, holger.knust, 
Nandana, JKSTNK, Lahi, Gq86, E1presidente, Cparle, SandraF_WMF, 
GoranSMilovanovic, Lunewa, QZanden, Tramullas, Acer, LawExplorer, Salgo60, 
Silverfish, _jensen, rosalieper, Scott_WUaS, Susannaanas, gnosygnu, Fuzheado, 
Jane023, Wikidata-bugs, Base, aude, Daniel_Mietchen, Ricordisamoa, Wesalius, 
Lydia_Pintscher, Raymond, Steinsplitter, Mbch331, Hokwelum


[Wikidata-bugs] [Maniphest] T226093: Capacity planning for Commons Structured Data

2023-01-10 Thread ArielGlenn
ArielGlenn added a comment.


  In T226093#8512308 <https://phabricator.wikimedia.org/T226093#8512308>, 
@LSobanski wrote:
  
  > The task's original intent was to cover planning "over the next 3 years" 
starting in 2019. @ArielGlenn is the task still relevant, can it be closed, do 
we need a new one?
  
  It depends on whether any tables are expected to grow a fair amount in the 
next three years. @Ladsgroup will have a better handle on that now.

TASK DETAIL
  https://phabricator.wikimedia.org/T226093

To: ArielGlenn
Cc: LSobanski, Nintendofan885, Ladsgroup, Abit, matthiasmullie, Marostegui, 
Addshore, Ramsey-WMF, jcrespo, Yann, MarkTraceur, ArielGlenn, Aklapper, 
Busfault, Astuthiodit_1, Atieno, karapayneWMDE, joanna_borun, Invadibot, 
GFontenelle_WMF, Devnull, maantietaja, FRomeo_WMF, Muchiri124, jannee_e, 
CBogen, ItamarWMDE, Akuckartz, holger.knust, Legado_Shulgin, ReaperDawn, 
Nandana, JKSTNK, Davinaclare77, Techguru.pc, Lahi, Gq86, E1presidente, Cparle, 
SandraF_WMF, GoranSMilovanovic, Lunewa, Hfbn0, QZanden, Tramullas, Acer, 
LawExplorer, Salgo60, Zppix, Silverfish, _jensen, rosalieper, Scott_WUaS, 
Susannaanas, Wong128hk, gnosygnu, Fuzheado, Jane023, Wikidata-bugs, Base, aude, 
Daniel_Mietchen, Ricordisamoa, Wesalius, Lydia_Pintscher, Raymond, faidon, 
Steinsplitter, Mbch331, Jay8g, fgiunchedi, Hokwelum


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-04-11 Thread ArielGlenn
ArielGlenn added a comment.


  In T138208#7844298 <https://phabricator.wikimedia.org/T138208#7844298>, 
@Ladsgroup wrote:
  
  > It's a bit hard to measure but it's probably fixed.
  
  That would be wonderful if true. Let's leave this open for a while yet just 
in case...

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: Kormat, LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, 
daniel, hoo, ArielGlenn, jcrespo, Zppix, Busfault, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, jannee_e, ItamarWMDE, Akuckartz, 
holger.knust, RhinosF1, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, 
aude, Mbch331, Hokwelum


[Wikidata-bugs] [Maniphest] T300240: Missing Wikidata RDF (ttl and nt) dumps for 20220117

2022-03-03 Thread ArielGlenn
ArielGlenn added a comment.


  Hey, just a note that we saw another failure:
  
Output of systemd timer for '/usr/local/bin/dumpwikibaserdf.sh -p wikidata 
-d truthy -f nt'

SYSTEMDTIMER noreply@snapshot1008.eqiad.wmnet via wikimedia.org 

ERROR 2013 (HY000): Lost connection to MySQL server at 'reading 
authorization packet', system error: 104
Failed.
Couldn't get MAX(page_id) from db.
  
  Not sure who can/should undertake to make the script more resilient but there 
it is.
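
  One way the "more resilient" idea could look is a retry with backoff around 
the step that talks to the db. The sketch below is in Python purely for 
illustration and is not taken from dumpwikibaserdf.sh; the names 
`TransientDbError` and `retry` are hypothetical, and the attempt count and 
delays are assumptions rather than tested values.

```python
import time


class TransientDbError(Exception):
    """Hypothetical stand-in for a dropped connection (e.g. MySQL error 2013)."""


def retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on TransientDbError, wait and try again.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientDbError:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

  A caller would wrap only the query that fetches MAX(page_id), so that a 
depooled or briefly unreachable server costs a short wait instead of a failed 
run.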

TASK DETAIL
  https://phabricator.wikimedia.org/T300240

To: ArielGlenn
Cc: ArielGlenn, Aklapper, JAllemandou, AKhatun_WMF, dcausse, karapayneWMDE, 
Invadibot, maantietaja, jannee_e, Akuckartz, holger.knust, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Addshore, Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-02-22 Thread ArielGlenn
ArielGlenn added a comment.


  I am aware of and following this discussion, but right now my responsiveness 
on this task will be slow; most of my time needs to go to getting my teammate 
who will be dumps co-maintainer up to speed. Please bear with us.

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, karapayneWMDE, Invadibot, maantietaja, jannee_e, 
Akuckartz, holger.knust, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, 
aude, Mbch331


[Wikidata-bugs] [Maniphest] T300240: Missing Wikidata RDF (ttl and nt) dumps for 20220117

2022-02-01 Thread ArielGlenn
ArielGlenn added a comment.


  Hm I wonder who we should add that would take on restarting these jobs if 
they deem it useful. Uh. Deferring for now since I have no bright ideas, and 
noting that here. Thanks again!

TASK DETAIL
  https://phabricator.wikimedia.org/T300240

To: ArielGlenn
Cc: ArielGlenn, Aklapper, JAllemandou, AKhatun_WMF, dcausse, Invadibot, 
maantietaja, jannee_e, Akuckartz, holger.knust, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Addshore, Mbch331


[Wikidata-bugs] [Maniphest] T300240: Missing Wikidata RDF (ttl and nt) dumps for 20220117

2022-02-01 Thread ArielGlenn
ArielGlenn added a comment.


  Uh @dcausse Do you want to add someone to the ops-dumps alias so that you can 
be informed in these instances and perhaps schedule a restart of the job(s)? It 
would be easy enough. Sorry to ask after the task is closed!

TASK DETAIL
  https://phabricator.wikimedia.org/T300240

To: ArielGlenn
Cc: ArielGlenn, Aklapper, JAllemandou, AKhatun_WMF, dcausse, Invadibot, 
maantietaja, jannee_e, Akuckartz, holger.knust, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Addshore, Mbch331


[Wikidata-bugs] [Maniphest] T300240: Missing Wikidata RDF (ttl and nt) dumps for 20220117

2022-02-01 Thread ArielGlenn
ArielGlenn added a comment.


  I saw an error from the cron job, it was sent to ops-dumps, which someone 
from WMDE should be on as well I think. The error looked to me like it had to 
do with a db server being depooled or otherwise unavailable:
  
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading 
authorization packet', system error: 104
  
  So, transient indeed. Feel free to close if this info is sufficient followup 
for you, @dcausse :-)

TASK DETAIL
  https://phabricator.wikimedia.org/T300240

To: ArielGlenn
Cc: ArielGlenn, Aklapper, JAllemandou, AKhatun_WMF, dcausse, Invadibot, 
maantietaja, jannee_e, Akuckartz, holger.knust, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Addshore, Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-01-24 Thread ArielGlenn
ArielGlenn added a comment.


  Thanks. I was pretty careful with my testing for the last fix, making sure 
that in production the patch redirected to a vslow/dump server. But I may have 
overlooked something. :-(

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, Invadibot, maantietaja, jannee_e, Akuckartz, 
holger.knust, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-01-24 Thread ArielGlenn
ArielGlenn added a comment.


  I hate to ask but can we capture any queries?

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, Invadibot, maantietaja, jannee_e, Akuckartz, 
holger.knust, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T238972: switch xml/sql (and adds-changes) dumps to use 0.11 schema with content from multiple slots

2022-01-24 Thread ArielGlenn
Restricted Application added a project: wdwb-tech.

TASK DETAIL
  https://phabricator.wikimedia.org/T238972

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1519/

To: ArielGlenn
Cc: Christian75, Schnark, binbot, Johan, Lucas_Werkmeister_WMDE, RhinosF1, 
Benjavalero, hoo, leila, ArielGlenn, Invadibot, R4356th, Bebiezaza, 
EhsanKhandowa, maantietaja, jannee_e, Akuckartz, PatsagornY, holger.knust, 
Viztor, Nandana, Amorymeltzer, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, JJMC89, _jensen, rosalieper, Scott_WUaS, Luke081515, gnosygnu, 
Wikidata-bugs, aude, TheDJ, Addshore, Mbch331, Jay8g, valerio.bozzolan


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-01-24 Thread ArielGlenn
ArielGlenn added a comment.


  The above patch was deployed with the train everywhere, so the specific set 
of queries should no longer be directed to non-vslow/dump db servers. If that's 
the case, we are now back to the harder issue of what to do when a db server is 
depooled, and I think that discussion is happening elsewhere.

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, Invadibot, maantietaja, jannee_e, Akuckartz, 
holger.knust, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T297470: torrent file for Wikidata dumps

2022-01-17 Thread ArielGlenn
ArielGlenn closed this task as "Declined".
ArielGlenn added a comment.


  I'm going to go ahead and close this as declined. Feel free to re-open if 
things change in the future.

TASK DETAIL
  https://phabricator.wikimedia.org/T297470

To: ArielGlenn
Cc: ArielGlenn, Andrawaag, Invadibot, maantietaja, jannee_e, Biaoo, 
Philoserf, Nintendofan885, Akuckartz, Ironie, holger.knust, Nandana, Lahi, 
Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, mys_721tx, Wikidata-bugs, Hydriz, aude, Nemo_bis, 
Addshore, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-01-17 Thread ArielGlenn
ArielGlenn added a comment.


  The patch at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/747455/ is 
tested and ready to go, and in line with the way existing dumps scripts work. 
So I'd like to go ahead with it.

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, 786, Suran38, Biggs657, Invadibot, Lalamarie69, 
maantietaja, Juan90264, Alter-paule, jannee_e, Beast1978, Un1tY, Akuckartz, 
Hook696, Kent7301, holger.knust, joker88john, CucyNoiD, Nandana, Gaboe420, 
Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, 
gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-01-10 Thread ArielGlenn
ArielGlenn added a comment.


  There is a complicated set of python scripts that coordinate the dump jobs 
for each wiki during the two monthly runs. 
https://wikitech.wikimedia.org/wiki/Dumps/Current_Architecture gives an 
overview. 
https://www.mediawiki.org/wiki/SQL/XML_Dumps#Becoming_a_dumps_co-maintainer 
gives rather a lot more. In general for testing you will run the python 
worker.py script, supplying it with the config file, the job name, the run date 
and the wiki; we test in deployment-prep, although I am working on a docker 
container testbed.

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, 786, Suran38, Biggs657, Invadibot, Lalamarie69, 
maantietaja, Juan90264, Alter-paule, jannee_e, Beast1978, Un1tY, Akuckartz, 
Hook696, Kent7301, holger.knust, joker88john, CucyNoiD, Nandana, Gaboe420, 
Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, 
gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-01-10 Thread ArielGlenn
ArielGlenn added a comment.


  In T138208#7611718 <https://phabricator.wikimedia.org/T138208#7611718>, 
@Ladsgroup wrote:
  
  > In T138208#7611712 <https://phabricator.wikimedia.org/T138208#7611712>, 
@ArielGlenn wrote:
  >
  >> Not yet; I need to talk with someone more knowledgeable than me about 
whether this approach is reasonable, before moving forward. I'll bring it up at 
our next meeting (tomorrow).
  >
  > Can I know how dumpers work? Any link to documentation would be 
appreciated. I need it to understand this patch and also finding a way for 
T298485 <https://phabricator.wikimedia.org/T298485>
  
  I don't know of any documentation specifically for the MW maintenance scripts 
for dumps or the modules used for import/export. There are general Manual pages 
for importing and exporting (maintained by volunteers I think) but I don't 
think they have the level of detail you are looking for. I have plenty of 
documentation for the python scripts, the formats, the content, and the various 
servers and how they are set up. But I guess that won't be so helpful here. 
Should we meet? Should I try to write something? If so, how in depth does it 
need to be?

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, 786, Suran38, Biggs657, Invadibot, Lalamarie69, 
maantietaja, Juan90264, Alter-paule, jannee_e, Beast1978, Un1tY, Akuckartz, 
Hook696, Kent7301, holger.knust, joker88john, CucyNoiD, Nandana, Gaboe420, 
Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, 
gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2022-01-10 Thread ArielGlenn
ArielGlenn added a comment.


  In T138208#7611708 <https://phabricator.wikimedia.org/T138208#7611708>, 
@Marostegui wrote:
  
  > In T138208#7571559 <https://phabricator.wikimedia.org/T138208#7571559>, 
@gerritbot wrote:
  >
  >> Change 747455 had a related patch set uploaded (by ArielGlenn; author: 
ArielGlenn):
  >>
  >> [mediawiki/core@master] try to use 'dump' group for db connections for 
dumps of page content
  >>
  >> https://gerrit.wikimedia.org/r/747455
  >
  > Any ETA on when this will be merged? Thanks!
  
  Not yet; I need to talk with someone more knowledgeable than me about whether 
this approach is reasonable, before moving forward. I'll bring it up at our 
next meeting (tomorrow).

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, 786, Suran38, Biggs657, Invadibot, Lalamarie69, 
maantietaja, Juan90264, Alter-paule, jannee_e, Beast1978, Un1tY, Akuckartz, 
Hook696, Kent7301, holger.knust, joker88john, CucyNoiD, Nandana, Gaboe420, 
Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, 
gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T222349: Do not rate limit dumps from internal network

2021-12-16 Thread ArielGlenn
ArielGlenn added a comment.


  Note that the checksum files for those dumps are available for download as 
well, since they are provided along with the main dump output files to all 
mirrors.
  
  Someone from WMCS will probably need to look at this (again) if the 
discussion is being re-opened. They should have insight into the impact on 
existing services from any change.

TASK DETAIL
  https://phabricator.wikimedia.org/T222349

To: Gehel, ArielGlenn
Cc: Volans, ayounsi, cmooney, EBernhardson, Bstorm, ArielGlenn, Gehel, 
Aklapper, joanna_borun, Ramtin2021, Invadibot, MPhamWMF, dcaro, Devnull, 
Slst2020, GeminiAgaloos, maantietaja, nskaggs, lmata, Muchiri124, 
Raymond_Ndibe, CBogen, Nintendofan885, Akuckartz, Phamhi, RhinosF1, 
Legado_Shulgin, ReaperDawn, Nandana, Namenlos314, skpuneethumar, sietec, Zylc, 
Giuliamocci, Davinaclare77, 1978Gage2001, Techguru.pc, Lahi, Operator873, Gq86, 
Bsandipan, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Chicocvenancio, 
Allthingsgo, Hfbn0, QZanden, EBjune, Tbscho, merbst, LawExplorer, Zppix, 
JJMC89, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, Wong128hk, mys_721tx, 
jkroll, Wikidata-bugs, Jdouglas, Jitrixis, aude, Tobias1984, Manybubbles, 
Gryllida, faidon, scfc, Addshore, Mbch331, Jay8g, bd808, Krenair, fgiunchedi


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2021-12-14 Thread ArielGlenn
ArielGlenn added a comment.


  Thanks for this thought, Daniel. I think it's better if I can pass the 
dbgroupdefault parameter to the maintenance script itself, instead of hacking 
something into getBlob(). But I do need to check if that's going to work ok. 
The longer term fix you mentioned, is there a task for that, so I can follow 
along?

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, Invadibot, maantietaja, jannee_e, Akuckartz, 
holger.knust, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2021-12-14 Thread ArielGlenn
ArielGlenn added a comment.


  As I feared, fetchText.php calls 
MediaWikiServices::getInstance()->getBlobStore()->getBlob() which gets a db 
replica connection on its own, with no opportunity for us to ask that it be in 
the vslow/dump group. We might be able to use the -dbgroupdefault dump option 
to this script; I will have to do some testing to see if that has any effect 
and what happens when that group is suddenly not available.
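
  To make the open question concrete, here is a toy model of picking a replica 
by query group, with a fallback when that group has no pooled hosts. It is a 
sketch of one possible behaviour only, not MediaWiki's load balancer logic; 
`pick_replica`, the group names as dictionary keys, and the host names are all 
hypothetical.

```python
def pick_replica(replicas_by_group, group, default_group="default"):
    """Return (host, group_used) for the requested query group.

    If the requested group has no pooled hosts, fall back to default_group.
    Raise LookupError when neither group has a host available.
    """
    for candidate in (group, default_group):
        hosts = replicas_by_group.get(candidate) or []
        if hosts:
            return hosts[0], candidate
    raise LookupError(f"no replica available for {group!r} or {default_group!r}")


# Hypothetical pool: one host reserved for dumps, two general-purpose replicas.
pool = {"dump": ["db-dump-a"], "default": ["db-main-a", "db-main-b"]}
host, used = pick_replica(pool, "dump")
```

  In this model, depooling the only "dump" host silently sends the job to a 
general-purpose replica, which is exactly the case that needs testing before 
relying on the option.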

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, Invadibot, maantietaja, jannee_e, Akuckartz, 
holger.knust, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2021-12-13 Thread ArielGlenn
ArielGlenn added a comment.


  The above is happening from pages-meta-history dumps, and I will look into it 
later today. The snapshot1008 (wikidata entity) dumps will be harder.

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, Invadibot, maantietaja, jannee_e, Akuckartz, 
holger.knust, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2021-12-13 Thread ArielGlenn
ArielGlenn added a comment.


  The reason only those two snapshot hosts are involved is undoubtedly because 
dumps on the others have finished for this run.

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

To: ArielGlenn
Cc: LSobanski, Ladsgroup, Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, 
ArielGlenn, jcrespo, Zppix, Invadibot, maantietaja, jannee_e, Akuckartz, 
holger.knust, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T297470: torrent file for Wikidata dumps

2021-12-11 Thread ArielGlenn
ArielGlenn added a comment.


  We don't provide torrent files from here because this is something that can 
be done by members of the community. I would get in touch with one of the 
people maintaining any of the torrents listed here: 
https://meta.wikimedia.org/wiki/Data_dump_torrents and see if they are willing 
to add Wikidata to the list. There also used to be a toolforge project for 
torrents, https://admin.toolforge.org/tool/dump-torrents but I'm not sure if it 
is still running. Finally, if the speed of the download is the main issue for 
you, you might try one of the mirror sites for downloading, 
https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current_mirrors
 and see if you get faster downloads that way.

TASK DETAIL
  https://phabricator.wikimedia.org/T297470

To: ArielGlenn
Cc: ArielGlenn, Andrawaag, Invadibot, maantietaja, jannee_e, Biaoo, 
Philoserf, Nintendofan885, Akuckartz, Ironie, holger.knust, Nandana, Lahi, 
Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, mys_721tx, Wikidata-bugs, Hydriz, aude, Nemo_bis, 
Addshore, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T222985: Provide wikidata JSON dumps compressed with zstd

2021-06-20 Thread ArielGlenn
ArielGlenn added a comment.


  In T222985#7164049 <https://phabricator.wikimedia.org/T222985#7164049>, 
@Mitar wrote:
  
  > Are you saying that existing wikidata json dumps can be decompressed in 
parallel if using lbzip2, but not pbzip2?
  
  lbzip2 is format-compatible with bzip2 and can read bzip2 or lbzip2 
compressed files and use multiple cores to decompress, indeed. pbzip2 should 
also work for that matter.
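
  For readers who want to try this, a minimal sketch of streaming a dump 
through lbzip2 from Python follows. It assumes lbzip2 is installed and accepts 
the bzip2-style `-d -c` options (decompress to stdout), and that the JSON dump 
is a single array with one entity per line; the helper names `open_dump` and 
`iter_entities` and the file name in the usage note are illustrative, not part 
of any dumps tooling.

```python
import json
import shutil
import subprocess


def iter_entities(lines):
    """Yield one dict per entity from an iterable of byte lines.

    Skips the opening '[' and closing ']' lines and strips the trailing
    comma that separates array elements.
    """
    for raw in lines:
        line = raw.strip().rstrip(b",")
        if line in (b"", b"[", b"]"):
            continue
        yield json.loads(line)


def open_dump(path, tool="lbzip2"):
    """Start `tool -d -c path`; the process's stdout is the decompressed stream."""
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} not found on PATH")
    return subprocess.Popen([tool, "-d", "-c", path], stdout=subprocess.PIPE)
```

  Usage would be along the lines of `proc = open_dump("latest-all.json.bz2")` 
followed by `for entity in iter_entities(proc.stdout): ...`; the decompressor 
uses multiple cores while the Python side consumes one line at a time.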

TASK DETAIL
  https://phabricator.wikimedia.org/T222985

To: ArielGlenn
Cc: Mitar, ImreSamu, hoo, Smalyshev, ArielGlenn, Liuxinyu970226, bennofs, 
Invadibot, maantietaja, jannee_e, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Addshore, Mbch331


[Wikidata-bugs] [Maniphest] T222985: Provide wikidata JSON dumps compressed with zstd

2021-06-20 Thread ArielGlenn
ArielGlenn added a comment.


  lbzip2 decompresses in parallel as well. We use that for compression of the 
SQL/XML dumps.

TASK DETAIL
  https://phabricator.wikimedia.org/T222985

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Mitar, ImreSamu, hoo, Smalyshev, ArielGlenn, Liuxinyu970226, bennofs, 
Invadibot, maantietaja, jannee_e, Akuckartz, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Addshore, Mbch331


[Wikidata-bugs] [Maniphest] T281267: various weekly and daily dumps run from systemd timers are broken

2021-05-05 Thread ArielGlenn
ArielGlenn added a comment.


  What are the next steps on this? Should I be tweaking a manifest someplace?

TASK DETAIL
  https://phabricator.wikimedia.org/T281267

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: jbond, ArielGlenn
Cc: Addshore, Tonina_Zhelyazkova_WMDE, WMDE-leszek, JAllemandou, fgiunchedi, 
jbond, hoo, dcausse, ArielGlenn, Protsack.stephan, Invadibot, Ramtin0071, 
Devnull, maantietaja, lmata, Muchiri124, jannee_e, Akuckartz, RhinosF1, 
Legado_Shulgin, ReaperDawn, Nandana, Davinaclare77, Qtn1293, Techguru.pc, Lahi, 
Gq86, herron, GoranSMilovanovic, Chicocvenancio, Lunewa, Th3d3v1ls, Hfbn0, 
QZanden, LawExplorer, Zppix, Volans, _jensen, rosalieper, Scott_WUaS, 
Wong128hk, gnosygnu, Wikidata-bugs, aude, faidon, Mbch331, Jay8g


[Wikidata-bugs] [Maniphest] T209390: Output some meta data about the wikidata JSON dump

2021-04-28 Thread ArielGlenn
ArielGlenn added a subscriber: hoo.
ArielGlenn added a comment.


  I am proactively adding @hoo as he can provide some insight and perhaps tag 
others as well.

TASK DETAIL
  https://phabricator.wikimedia.org/T209390

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: hoo, Sascha, Mitar, ArielGlenn, Smalyshev, Addshore, Invadibot, 
maantietaja, jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, 
Lunewa, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, 
Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] T279518: Enable automatic JSON dump validation for Wikidata

2021-04-07 Thread ArielGlenn
ArielGlenn added a comment.


  In T279518#6981710 <https://phabricator.wikimedia.org/T279518#6981710>, @hoo 
wrote:
  
  >> Icinga sends alerts, and those would come to me I guess, which is probably 
not the best outcome :-)
  >
  > We could use the `wikidata` contact group for that.
  >
  >> Note that mails for other jobs go to an email alias that includes several 
people on my team; perhaps you can rope a couple others in WMDE or who work on 
Wikidata to sign onto a new alias?
  >
  > We already have a `wikidata-monitoring` alias we use for these Icinga 
alerts, I guess we could nicely use it for this as well.
  >
  > So, I guess both would be fine... while cron is probably easier to wire up, 
Icinga seems more fitting (we don't care about this as long as it succeeds).
  
  Your cron job would only produce output on failure if you set it up 
appropriately, so both are fine indeed.
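  
  A minimal sketch of what "set it up appropriately" could look like, 
assuming the usual cron behaviour of mailing whatever a job prints: the 
wrapper below stays silent when the wrapped command succeeds and replays 
its output only on failure. The name `run_quietly` is hypothetical and 
this is not the actual dumps tooling.

```python
import subprocess
import sys
from typing import Sequence


def run_quietly(cmd: Sequence[str]) -> int:
    """Run cmd; print nothing on success, replay its output on failure."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    if result.returncode != 0:
        # Any output here is what cron would mail to the configured address.
        sys.stdout.write(result.stdout)
        print(f"command failed with exit code {result.returncode}: {' '.join(cmd)}")
    return result.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_quietly(sys.argv[1:]))
```

  Invoked from a crontab entry, such a wrapper would generate mail only 
for failed runs.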

TASK DETAIL
  https://phabricator.wikimedia.org/T279518

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Lydia_Pintscher, ArielGlenn, Aklapper, hoo, Invadibot, maantietaja, 
jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Svick, Addshore, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T279518: Enable automatic JSON dump validation for Wikidata

2021-04-07 Thread ArielGlenn
ArielGlenn added a comment.


  Icinga sends alerts, and those would come to me I guess, which is probably 
not the best outcome :-)
  
  I believe that we use MAILTO for everything in the dumpsgen crontab, but the 
question is whether there's a nice alias to send emails to, or whether we want 
to make you in particular the SPOF for this. I imagine you can figure out my 
opinion on this already :-)  Note that mails for other jobs go to an email 
alias that includes several people on my team; perhaps you can rope a couple 
others in WMDE or who work on Wikidata to sign onto a new alias?

TASK DETAIL
  https://phabricator.wikimedia.org/T279518

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Lydia_Pintscher, ArielGlenn, Aklapper, hoo, Invadibot, maantietaja, 
jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Svick, Addshore, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T279518: Enable automatic JSON dump validation for Wikidata

2021-04-07 Thread ArielGlenn
ArielGlenn added a project: Dumps-Generation.
Restricted Application added a project: wdwb-tech.

TASK DETAIL
  https://phabricator.wikimedia.org/T279518

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Lydia_Pintscher, ArielGlenn, Aklapper, hoo, Invadibot, maantietaja, 
jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Svick, Addshore, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T277300: Lexeme JSON dumps contain invalid JSON

2021-03-23 Thread ArielGlenn
ArielGlenn added a comment.


  This is now deployed and will be in effect for next week's lexeme run.

TASK DETAIL
  https://phabricator.wikimedia.org/T277300

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, hoo, Lydia_Pintscher, Invadibot, maantietaja, Alter-paule, 
jannee_e, Beast1978, Un1tY, Akuckartz, Hook696, Kent7301, joker88john, 
CucyNoiD, Nandana, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, 
Bsandipan, GoranSMilovanovic, Lunewa, Mahir256, QZanden, LawExplorer, 
Lewizho99, Maathavan, _jensen, rosalieper, Bodhisattwa, Scott_WUaS, gnosygnu, 
Wikidata-bugs, aude, Addshore, Mbch331


[Wikidata-bugs] [Maniphest] T278031: Wikibase canonical JSON format is missing "modified" in Wikidata JSON dumps

2021-03-21 Thread ArielGlenn
ArielGlenn added a project: Dumps-Generation.
Restricted Application added a project: wdwb-tech.

TASK DETAIL
  https://phabricator.wikimedia.org/T278031

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Mitar, Aklapper, Invadibot, maantietaja, jannee_e, Akuckartz, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Lydia_Pintscher, Addshore, Mbch331


[Wikidata-bugs] [Maniphest] T276643: Wikidata JSON dump (bz2) no longer imports due to bad JSON format

2021-03-16 Thread ArielGlenn
ArielGlenn closed this task as "Resolved".
ArielGlenn added a comment.


  Since @hoo validated the dump from the past week, verifying that the current 
dump generation process is fixed, we can now close this task. Thanks everyone!

TASK DETAIL
  https://phabricator.wikimedia.org/T276643

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: hoo, Tacsipacsi, Cparle, Palotabarat, LucasWerkmeister, Motagirl2, 
Addshore, Mahir256, ArielGlenn, Ash20001, maantietaja, jannee_e, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, gnosygnu, abian, Wikidata-bugs, aude, 
Lydia_Pintscher, Mbch331


[Wikidata-bugs] [Maniphest] T276643: Wikidata JSON dump (bz2) no longer imports due to bad JSON format

2021-03-07 Thread ArielGlenn
ArielGlenn added a comment.


  I'll leave this open until the run is complete and folks have had time to try 
to use them, so probably through the coming weekend.

TASK DETAIL
  https://phabricator.wikimedia.org/T276643

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: LucasWerkmeister, Motagirl2, Addshore, Mahir256, ArielGlenn, Ash20001, 
maantietaja, jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, 
Lunewa, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, gnosygnu, 
abian, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331


[Wikidata-bugs] [Maniphest] T276643: Wikidata JSON dump (bz2) no longer imports due to bad JSON format

2021-03-07 Thread ArielGlenn
ArielGlenn added a comment.


  In T276643#6890308 <https://phabricator.wikimedia.org/T276643#6890308>, 
@Ash20001 wrote:
  
  > Will this patch be included in the next dump or can be put back in the last 
two dumps (regenerate dump)
  
  This should be in time for the dump that will be produced this week. For the 
previous two weeks you'll need to filter the contents to add in commas, as 
mentioned by Lucas in his earlier comment.
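  
  A minimal sketch of such a filter, assuming the usual layout of the JSON 
dump (an opening `[` line, one entity per line, a closing `]` line) and 
assuming the only damage is a missing comma at the end of some entity 
lines. The function name is made up for illustration; Lucas's earlier 
comment remains the authoritative description of the workaround.

```python
import json
from typing import Iterable, Iterator


def add_missing_commas(lines: Iterable[str]) -> Iterator[str]:
    """Re-emit dump lines so every entity line but the last ends in a comma."""
    previous = None
    for raw in lines:
        line = raw.rstrip("\n")
        if previous is not None:
            # An entity line followed by another entity line needs a comma.
            if (
                previous.startswith("{")
                and line.startswith("{")
                and not previous.endswith(",")
            ):
                previous += ","
            yield previous + "\n"
        previous = line
    if previous is not None:
        yield previous + "\n"


# Tiny made-up sample: the first entity line is missing its comma.
broken = ["[\n", '{"id":"Q1"}\n', '{"id":"Q2"},\n', '{"id":"Q3"}\n', "]\n"]
fixed = "".join(add_missing_commas(broken))
entities = json.loads(fixed)
```

  Because it looks only one line ahead, the filter can be applied to a 
decompressed stream without holding the whole dump in memory.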

TASK DETAIL
  https://phabricator.wikimedia.org/T276643

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: LucasWerkmeister, Motagirl2, Addshore, Mahir256, ArielGlenn, Ash20001, 
maantietaja, jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, 
Lunewa, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, gnosygnu, 
abian, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331


[Wikidata-bugs] [Maniphest] T264883: Prepare deployment of JSON dumps for Lexeme

2021-02-18 Thread ArielGlenn
ArielGlenn added a comment.


  These look fine to me from today, and I've done all the buster-side testing 
so that's ok too. Closing this! Ah, do we want to announce it anywhere though? 
Maybe I won't close it pending that answer. Places it could be announced: 
xmldatadumps-l, wikitech-l, research list, wikidata list.

TASK DETAIL
  https://phabricator.wikimedia.org/T264883

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: hoo, noarave, ArielGlenn, Lucas_Werkmeister_WMDE, Lydia_Pintscher, 
WMDE-leszek, Pablo-WMDE, Alter-paule, Beast1978, Un1tY, Akuckartz, Hook696, 
Iflorez, Kent7301, alaa_wmde, joker88john, CucyNoiD, Nandana, Gaboe420, 
Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, 
Mahir256, QZanden, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, 
Bodhisattwa, Scott_WUaS, Jonas, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T264883: Prepare deployment of JSON dumps for Lexeme

2021-02-11 Thread ArielGlenn
ArielGlenn added a comment.


  I am doing some prep work before I try to test this on buster. Getting close!

TASK DETAIL
  https://phabricator.wikimedia.org/T264883

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hoo, ArielGlenn
Cc: noarave, ArielGlenn, Lucas_Werkmeister_WMDE, Lydia_Pintscher, WMDE-leszek, 
Pablo-WMDE, Alter-paule, Beast1978, Un1tY, Akuckartz, Hook696, Iflorez, 
Kent7301, alaa_wmde, joker88john, CucyNoiD, Nandana, Gaboe420, Giuliamocci, 
Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, Mahir256, QZanden, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Bodhisattwa, 
Scott_WUaS, Jonas, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2021-02-09 Thread ArielGlenn
ArielGlenn added a comment.


  mysql.php, used for wikidata entity dumps, apparently does not handle the 
--group flag correctly. It's unclear to me what it does do; I need to check 
into this sometime later. The queries run by it are extremely short so the 
impact is minimal, but it still needs to be checked.

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, ArielGlenn, jcrespo, 
Zppix, jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, 
aude, Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2021-02-08 Thread ArielGlenn
ArielGlenn added a comment.


  In T138208#6811418 <https://phabricator.wikimedia.org/T138208#6811418>, 
@Addshore wrote:
  
  > In T138208#6809784 <https://phabricator.wikimedia.org/T138208#6809784>, 
@ArielGlenn wrote:
  >
  >> This is because the maintenance scripts that do "small" page ranges take 
several hours to complete. I will keep this in mind for when we can go to 
multiple bz2 streams in the page content history dumps; I'll be able to dump 
much smaller ranges then and concat them together. The other thing I should do 
is check how often we respawn fetchText; that is something I might be able to 
change sooner rather than later.
  >
  > From the sounds of things I can leave this ticket on your plate then 
@ArielGlenn ? :)
  
  Sadly, yes :-P

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, ArielGlenn, jcrespo, 
Zppix, jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, 
aude, Mbch331


[Wikidata-bugs] [Maniphest] T147169: Make sure Wikibase dump maintenance scripts solely use the "dump" db group

2021-02-08 Thread ArielGlenn
ArielGlenn added a comment.


  These are for the weekly wikidata "entity dumps", and so separate from the 
main xml/sql dumps implicated in the other task.

TASK DETAIL
  https://phabricator.wikimedia.org/T147169

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hoo, ArielGlenn
Cc: Marostegui, gerritbot, Lucas_Werkmeister_WMDE, ArielGlenn, Addshore, aaron, 
Aklapper, Lydia_Pintscher, jcrespo, aude, daniel, hoo, Akuckartz, Iflorez, 
alaa_wmde, Nandana, lucamauri, Lahi, Gq86, GoranSMilovanovic, lisong, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, Mbch331


[Wikidata-bugs] [Maniphest] T138208: Connections to all db servers for wikidata as wikiadmin from snapshot, terbium

2021-02-08 Thread ArielGlenn
ArielGlenn added a comment.


  This is because the maintenance scripts that do "small" page ranges take 
several hours to complete. I will keep this in mind for when we can go to 
multiple bz2 streams in the page content history dumps; I'll be able to dump 
much smaller ranges then and concat them together. The other thing I should do 
is check how often we respawn fetchText; that is something I might be able to 
change sooner rather than later.
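  
  To illustrate why multiple bz2 streams make the "dump small ranges and 
concat them" approach possible, here is a small Python sketch. It relies 
on the fact that independently compressed bz2 streams, concatenated byte 
for byte, still form a valid bzip2 file. The page snippets and the 
function name are invented for the example and have nothing to do with 
the real dump code.

```python
import bz2
from typing import Iterable


def concat_compressed_ranges(chunks: Iterable[bytes]) -> bytes:
    """Compress each chunk as its own bz2 stream and join the streams."""
    return b"".join(bz2.compress(chunk) for chunk in chunks)


# Hypothetical output of three small page-range jobs.
page_ranges = [b"<page>1</page>\n", b"<page>2</page>\n", b"<page>3</page>\n"]
combined = concat_compressed_ranges(page_ranges)

# bz2.decompress() handles multi-stream input (Python 3.3 and later).
restored = bz2.decompress(combined)
```

  Each range could therefore be produced by a separate short job, with 
the results joined afterwards without recompressing anything.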

TASK DETAIL
  https://phabricator.wikimedia.org/T138208

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Marostegui, Addshore, Lydia_Pintscher, daniel, hoo, ArielGlenn, jcrespo, 
Zppix, jannee_e, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, 
QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, 
aude, Mbch331


[Wikidata-bugs] [Maniphest] T264883: Prepare deployment of JSON dumps for Lexeme

2021-02-01 Thread ArielGlenn
ArielGlenn added a comment.


  All set. We should check on these again in the middle of next week, as the 
run starts on Monday at ridiculous-o-clock when we are all sleeping.

TASK DETAIL
  https://phabricator.wikimedia.org/T264883

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hoo, ArielGlenn
Cc: ArielGlenn, Lucas_Werkmeister_WMDE, Lydia_Pintscher, WMDE-leszek, 
Pablo-WMDE, Alter-paule, Beast1978, Un1tY, Akuckartz, Hook696, Iflorez, 
Kent7301, alaa_wmde, joker88john, CucyNoiD, Nandana, Gaboe420, Giuliamocci, 
Cpaulf30, Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, Mahir256, QZanden, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Bodhisattwa, 
Scott_WUaS, Jonas, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T264883: Prepare deployment of JSON dumps for Lexeme

2021-01-29 Thread ArielGlenn
ArielGlenn added a comment.


  In T264883#6786811 <https://phabricator.wikimedia.org/T264883#6786811>, 
@Lucas_Werkmeister_WMDE wrote:
  
  > Are you sure they ran? That directory only contains RDF dumps as far as I 
can tell (Turtle and NTriples), we’ve been generating those for a while 
(compare 20210122 
<https://dumps.wikimedia.org/other/wikibase/wikidatawiki/20210122/> with 
20201218 <https://dumps.wikimedia.org/other/wikibase/wikidatawiki/20201218/>). 
I haven’t found any lexeme JSON dumps yet.
  
  Ah crap. Yeah I see that now.
  
  I didn't get any failure emails about it, but when I looked in the log I saw 
this:
  root@snapshot1008:~# more /var/log/wikidatadump/dumpwikidatajson-wikidata-20210127-lexemes-main.log
  File size for shard 0 is only 26086402. Aborting.
  
  I guess those values need to be adjusted for lexemes.
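  
  For illustration, a sketch of a per-dump-type minimum size check in 
Python. The thresholds and names below are invented for the example and 
are not the values used by the actual dump scripts; the point is only 
that a single threshold tuned for the full entity dump will reject a much 
smaller lexeme shard.

```python
# Hypothetical minimum shard sizes in bytes, one per dump type.
MIN_SHARD_BYTES = {
    "all": 1_000_000_000,
    "lexemes": 10_000_000,
}


def shard_size_ok(size: int, dump_type: str, shard: int = 0) -> bool:
    """Return True if a shard of `size` bytes is plausible for dump_type."""
    minimum = MIN_SHARD_BYTES[dump_type]
    if size < minimum:
        print(f"File size for shard {shard} is only {size}. Aborting.")
        return False
    return True
```

  With these made-up numbers, the 26086402-byte shard from the log would 
fail the "all" threshold but pass the "lexemes" one.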

TASK DETAIL
  https://phabricator.wikimedia.org/T264883

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hoo, ArielGlenn
Cc: ArielGlenn, Lucas_Werkmeister_WMDE, Lydia_Pintscher, WMDE-leszek, 
Pablo-WMDE, Akuckartz, Iflorez, alaa_wmde, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Mahir256, QZanden, LawExplorer, _jensen, rosalieper, 
Bodhisattwa, Scott_WUaS, Jonas, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T264883: Prepare deployment of JSON dumps for Lexeme

2021-01-29 Thread ArielGlenn
ArielGlenn added a comment.


  These ran and are available at 
https://dumps.wikimedia.org/other/wikibase/wikidatawiki/20210122/
  
  How do they look?

TASK DETAIL
  https://phabricator.wikimedia.org/T264883

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: hoo, ArielGlenn
Cc: ArielGlenn, Lucas_Werkmeister_WMDE, Lydia_Pintscher, WMDE-leszek, 
Pablo-WMDE, Akuckartz, Iflorez, alaa_wmde, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Mahir256, QZanden, LawExplorer, _jensen, rosalieper, 
Bodhisattwa, Scott_WUaS, Jonas, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T221504: investigate why content history dump of certain wikidata page ranges is so slow

2020-12-15 Thread ArielGlenn
ArielGlenn added a comment.


  Following up on this, has there been any more discussion about making the 
JSON a little less wordy/disk-filly? I don't see any other path forward on this 
in the short to medium term.

TASK DETAIL
  https://phabricator.wikimedia.org/T221504

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Addshore, Smalyshev, Gehel, Mahir256, ArielGlenn, jannee_e, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T246415: Investigate a different db load groups for wikidata / wikibase

2020-11-04 Thread ArielGlenn
ArielGlenn added a project: User-ArielGlenn.

TASK DETAIL
  https://phabricator.wikimedia.org/T246415

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Michael, ArielGlenn
Cc: ArielGlenn, Michael, Marostegui, Ladsgroup, WMDE-leszek, Aklapper, 
Addshore, Alter-paule, Beast1978, Un1tY, Akuckartz, Hook696, Iflorez, Kent7301, 
alaa_wmde, joker88john, CucyNoiD, Nandana, jijiki, Klaas_Z4us_V, Gaboe420, 
Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, Pablo-WMDE, 
GoranSMilovanovic, QZanden, LawExplorer, Lewizho99, Maathavan, elukey, _jensen, 
rosalieper, Scott_WUaS, Jonas, Wikidata-bugs, aude, Lydia_Pintscher, Mbch331, 
Jay8g


[Wikidata-bugs] [Maniphest] T264298: wb_terms is getting removed

2020-10-30 Thread ArielGlenn
ArielGlenn added a comment.


  All of those tables are there: see 
https://gerrit.wikimedia.org/r/c/operations/puppet/+/527505 and current 
https://github.com/wikimedia/puppet/blob/production/modules/snapshot/files/dumps/table_jobs.yaml#L142
  
  Is there anything else needed, @Lucas_Werkmeister_WMDE ?

TASK DETAIL
  https://phabricator.wikimedia.org/T264298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, Lucas_Werkmeister_WMDE, Addshore, toan, jannee_e, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T264298: wb_terms is getting removed

2020-10-30 Thread ArielGlenn
ArielGlenn added a comment.


  In T264298#6511634 <https://phabricator.wikimedia.org/T264298#6511634>, 
@Lucas_Werkmeister_WMDE wrote:
  
  > We also realized that the `tablejobs.yaml` file didn’t mention the new 
tables (the replacement for `wb_terms`: `wbt_{item,property}_terms`, 
`wbt_{term,text}_in_lang`, `wbt_text`, `wbt_type`). If `wb_terms` was worth 
dumping, then presumably the new tables should be dumped too. Is it enough to 
add them to the YAML file or do you need some extra setup for new tables?
  
  Whoops, I missed this comment entirely. Ummm. Let me have a look at that and if 
there are changes needed, I'll push them out TODAY. Otherwise is there anything 
else needed for this task, now that the mw-vagrant patch is merged?

TASK DETAIL
  https://phabricator.wikimedia.org/T264298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, Lucas_Werkmeister_WMDE, Addshore, toan, jannee_e, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T264850: Categorylinks dump might have some problem with the encoding

2020-10-11 Thread ArielGlenn
ArielGlenn removed projects: Wikidata, Wikidata-Query-Service, Analytics.

TASK DETAIL
  https://phabricator.wikimedia.org/T264850

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: JAllemandou, ArielGlenn
Cc: Lucas_Werkmeister_WMDE, ArielGlenn, Milimetric, Aklapper, marcmiquel, 
Strainu, jannee_e, Lunewa, gnosygnu, CBogen, Akuckartz, 4748kitoko, 
darthmon_wmde, Nandana, Namenlos314, Akovalyov, Lahi, Gq86, GoranSMilovanovic, 
QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, terrrydactyl, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T264850: Categorylinks dump might have some problem with the encoding

2020-10-08 Thread ArielGlenn
ArielGlenn added a comment.


  In T264850#6531377 <https://phabricator.wikimedia.org/T264850#6531377>, 
@Milimetric wrote:
  
  > @ArielGlenn is this something you'd know about or know who to point me to?
  
  I think the wdqs folks are going to be your best bet, I've added the project. 
Looks like a simple text encoding error, but I'd like to know exactly what 
tools were used to display the text before saying that for sure.

TASK DETAIL
  https://phabricator.wikimedia.org/T264850

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: JAllemandou, ArielGlenn
Cc: ArielGlenn, Milimetric, Aklapper, marcmiquel, Strainu, jannee_e, CBogen, 
Akuckartz, 4748kitoko, darthmon_wmde, Nandana, Namenlos314, Akovalyov, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Lunewa, QZanden, EBjune, 
merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, gnosygnu, 
JAllemandou, terrrydactyl, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T264850: Categorylinks dump might have some problem with the encoding

2020-10-08 Thread ArielGlenn
ArielGlenn added a comment.


echo -n ânești  | od -t x1
000 c3 a2 6e 65 c8 99 74 69
  
  You appear to be seeing a string representation of the non-ascii characters 
as hex bytes, i.e. xc3 xa2 ne xc8 x99 ti. What command are you using to 
display the text in the file, and on what platform?
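  
  The same check can be reproduced in Python, as a small sketch: encode 
the string as UTF-8, print the bytes as hex, and build the escaped form 
in which only the non-ASCII bytes are written out. The variable names are 
arbitrary; the string is spelled with escapes so that the example does 
not depend on the encoding of this message.

```python
text = "\u00e2ne\u0219ti"  # the string from the od example above
utf8 = text.encode("utf-8")

# Same bytes as reported by `od -t x1`.
print(utf8.hex(" "))  # c3 a2 6e 65 c8 99 74 69

# A tool that escapes non-ASCII bytes instead of decoding them shows this:
escaped = "".join(chr(b) if b < 0x80 else f"\\x{b:02x}" for b in utf8)
print(escaped)  # \xc3\xa2ne\xc8\x99ti

# Decoding the bytes as UTF-8 gives the original text back.
roundtrip = utf8.decode("utf-8")
```

  If the bytes in the file match these, the data is intact and the 
escaping comes from whatever is displaying it.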

TASK DETAIL
  https://phabricator.wikimedia.org/T264850

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: JAllemandou, ArielGlenn
Cc: ArielGlenn, Milimetric, Aklapper, marcmiquel, Strainu, jannee_e, CBogen, 
Akuckartz, 4748kitoko, darthmon_wmde, Nandana, Namenlos314, Akovalyov, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Lunewa, QZanden, EBjune, 
merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, gnosygnu, 
JAllemandou, terrrydactyl, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T264850: Categorylinks dump might have some problem with the encoding

2020-10-08 Thread ArielGlenn
ArielGlenn added projects: Wikidata-Query-Service, Dumps-Generation.
Restricted Application added a project: Wikidata.

TASK DETAIL
  https://phabricator.wikimedia.org/T264850

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: JAllemandou, ArielGlenn
Cc: ArielGlenn, Milimetric, Aklapper, marcmiquel, Strainu, jannee_e, CBogen, 
Akuckartz, 4748kitoko, darthmon_wmde, Nandana, Namenlos314, Akovalyov, Lahi, 
Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Lunewa, QZanden, EBjune, 
merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, gnosygnu, 
JAllemandou, terrrydactyl, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T264164: Cleanup broken dumps in /wikidatawiki/entities/20200921/

2020-10-02 Thread ArielGlenn
ArielGlenn added a comment.


  They are indeed gone from dumpsdata1002; we keep fewer back issues there, 
since we're not serving them anywhere but only rsyncing them off. We keep the 
last 3 wikibase dumps, see 
https://github.com/wikimedia/puppet/blob/production/modules/dumps/manifests/web/cleanups/miscdumps.pp#L14
 (or on the host itself, /etc/dumps/confs/cleanup_misc.conf and the 
"wikibase/wikidatawiki" entry). Now that the runs have specific dates we might 
want to increase that to 6 so we have the last two weeks' worth.
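  
  A rough sketch of the "keep the last N" idea in Python, assuming one 
directory per run named by date (YYYYMMDD). This is not the actual 
cleanup script or its config format; the function name and the directory 
listing are invented for the example.

```python
import re
from typing import Iterable, List

DATE_DIR = re.compile(r"^\d{8}$")


def dirs_to_delete(names: Iterable[str], keep: int) -> List[str]:
    """Return the dated directory names that fall outside the newest `keep`."""
    # YYYYMMDD names sort chronologically as plain strings.
    dated = sorted(name for name in names if DATE_DIR.match(name))
    return dated[:-keep] if keep > 0 else dated


# Hypothetical listing of a dump directory.
listing = ["20200907", "20200914", "20200921", "20200928", "latest"]
old = dirs_to_delete(listing, keep=3)
```

  Raising `keep` only shrinks the list of directories that would be 
removed; entries that are not date-named are never touched.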

TASK DETAIL
  https://phabricator.wikimedia.org/T264164

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, Gehel, dcausse, jannee_e, CBogen, Akuckartz, darthmon_wmde, 
Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, EBjune, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T264298: wb_terms is getting removed

2020-10-02 Thread ArielGlenn
ArielGlenn added a comment.


  No impact. Only tables actually in the database are dumped; each table in the 
list is checked beforehand. The code can be cleaned up anyway, just to be nice.
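  
  The check can be pictured with the following Python sketch. It uses an 
in-memory SQLite database purely as a stand-in so the example is 
self-contained; the production dumps talk to MariaDB and use their own 
code, and the function name here is hypothetical.

```python
import sqlite3
from typing import Iterable, List


def tables_to_dump(conn: sqlite3.Connection, configured: Iterable[str]) -> List[str]:
    """Return the configured tables that actually exist in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    existing = {row[0] for row in rows}
    return [table for table in configured if table in existing]


# Stand-in database in which wb_terms has already been dropped.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wbt_text (wbx_id INTEGER)")
conn.execute("CREATE TABLE wbt_type (wby_id INTEGER)")

selected = tables_to_dump(conn, ["wb_terms", "wbt_text", "wbt_type"])
```

  A table that is still listed in the configuration but has been removed 
from the database is simply skipped.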

TASK DETAIL
  https://phabricator.wikimedia.org/T264298

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, Lucas_Werkmeister_WMDE, Addshore, toan, jannee_e, Akuckartz, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331


[Wikidata-bugs] [Maniphest] T264164: Cleanup broken dumps in /wikidatawiki/entities/20200921/

2020-10-01 Thread ArielGlenn
ArielGlenn added subscribers: Gehel, ArielGlenn.
ArielGlenn added a comment.


  @Gehel was just asking about these yesterday and whether he should clean them 
up. The procedure is: delete first from the appropriate dumpsdata host 
(dumpsdata1002) where they are first written. Then delete them from the 
labstore1006 and 1007 hosts to which they would be rsynced.
  
  On dumpsdata1002, /data/otherdumps is the path to the tree containing all of 
the various datasets unrelated to xml/sql dumps. On the labstore hosts, it is 
/srv/dumps/xmldatadumps/public/other
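
  The ordering above can be sketched in Python as a plan of (host, path) 
pairs, with the host where files are first written coming before the rsync 
targets (presumably so that a later rsync does not copy the files out again). 
The hosts and base paths are the ones named above; the function name 
`cleanup_plan` and the relative directory are hypothetical placeholders, and 
nothing here deletes anything.

```python
# Hosts and base paths as given in the comment above; the relative
# directory passed in is a hypothetical placeholder.
WRITE_HOST = ("dumpsdata1002", "/data/otherdumps")
RSYNC_TARGETS = [
    ("labstore1006", "/srv/dumps/xmldatadumps/public/other"),
    ("labstore1007", "/srv/dumps/xmldatadumps/public/other"),
]


def cleanup_plan(relative_dir):
    """List (host, path) pairs in the order the deletions should happen."""
    ordered_hosts = [WRITE_HOST] + RSYNC_TARGETS
    return [(host, base + "/" + relative_dir.strip("/"))
            for host, base in ordered_hosts]


for host, path in cleanup_plan("some/broken/run"):
    print(host, path)
```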

TASK DETAIL
  https://phabricator.wikimedia.org/T264164

To: ArielGlenn
Cc: ArielGlenn, Gehel, dcausse, jannee_e, Akuckartz, darthmon_wmde, Nandana, 
Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T220883: Wikidata JSON dumps should include Lexemes

2020-09-30 Thread ArielGlenn
ArielGlenn added a comment.


  I renew my question above in T220883#5185999 
<https://phabricator.wikimedia.org/T220883#5185999> and if someone can answer 
this, I can work with them to make these go live.

TASK DETAIL
  https://phabricator.wikimedia.org/T220883

To: hoo, ArielGlenn
Cc: DVrandecic, Addshore, ArielGlenn, VIGNERON, Aklapper, hoo, Lydia_Pintscher, 
Envlh, Akuckartz, darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, 
Mahir256, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, 
Wikidata-bugs, aude, Svick, Mbch331, jeremyb


[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-09-22 Thread ArielGlenn
ArielGlenn closed this task as "Resolved".
ArielGlenn claimed this task.
ArielGlenn added a comment.


  Re-enabled, checked daily runs, they look good, so I'm resolving this. 
Thanks, everybody!

TASK DETAIL
  https://phabricator.wikimedia.org/T260232

To: ArielGlenn
Cc: ArielGlenn, CBogen, Cparle, Umherirrender, DannyS712, Naike, WDoranWMF, 
Krinkle, aaron, Reedy, Ladsgroup, Aklapper, Marostegui, XeroS_SkalibuR, 
Alter-paule, jannee_e, Beast1978, Un1tY, Akuckartz, eprodromou, Hook696, 
Adidsone1, darthmon_wmde, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Phukettaxigroup, Giuliamocci, Cpaulf30, Lahi, Gq86, 
Af420, Ramsey-WMF, Darkminds3113, Bsandipan, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, Jayprakash12345, Lunewa, QZanden, EBjune, merbst, 
LawExplorer, Vali.matei, Lewizho99, Maathavan, _jensen, rosalieper, Agabi10, 
Scott_WUaS, Pchelolo, Jonas, Xmlizer, Volker_E, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, GWicke, Dcljr, Dinoguy1000, 
Manybubbles, Mbch331, Rxy, Jay8g, Ltrlg


[Wikidata-bugs] [Maniphest] T226093: Capacity planning for Commons Structured Data

2020-09-16 Thread ArielGlenn
ArielGlenn added a comment.


  Updated (ouch!) F32352585: commons_slots.png 
<https://phabricator.wikimedia.org/F32352585>

TASK DETAIL
  https://phabricator.wikimedia.org/T226093

To: ArielGlenn
Cc: Ladsgroup, Abit, matthiasmullie, Marostegui, Mholloway, Addshore, 
Ramsey-WMF, jcrespo, Yann, MarkTraceur, ArielGlenn, Aklapper, lmata, jannee_e, 
CBogen, Akuckartz, darthmon_wmde, Legado_Shulgin, Nandana, JKSTNK, 
Davinaclare77, Qtn1293, Techguru.pc, Lahi, PDrouin-WMF, Gq86, E1presidente, 
Cparle, Anooprao, SandraF_WMF, GoranSMilovanovic, Lunewa, Th3d3v1ls, Hfbn0, 
QZanden, Tramullas, Acer, LawExplorer, Salgo60, Zppix, Silverfish, _jensen, 
rosalieper, Scott_WUaS, Susannaanas, Wong128hk, gnosygnu, Jane023, 
Wikidata-bugs, Base, aude, Ricordisamoa, Wesalius, Lydia_Pintscher, 
Fabrice_Florin, Raymond, faidon, Steinsplitter, Mbch331, Rxy, Jay8g, fgiunchedi


[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-09-13 Thread ArielGlenn
ArielGlenn added a comment.


  In T260232#6448382 <https://phabricator.wikimedia.org/T260232#6448382>, 
@gerritbot wrote:
  
  > Change 625642 **merged** by jenkins-bot:
  > [mediawiki/core@master] don't pass null page id to page related queries for 
category change rdf dumps
  >
  > https://gerrit.wikimedia.org/r/625642
  
  When this is deployed on the wikis I'll be able to re-enable category dumps, 
both dailies and weeklies, which should mean at the end of the week.
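
  The patch title quoted above describes not passing null page ids to 
page-related queries. A minimal Python sketch of that idea follows; it is not 
the actual PHP change. It assumes the first query's LEFT JOIN against page can 
return a NULL page_id for a category with no page, and the function names and 
the row without a page are hypothetical.

```python
def page_ids_for_query(rows):
    """Collect page ids, skipping categories that have no page row."""
    return [row["page_id"] for row in rows if row["page_id"] is not None]


def subcat_query(page_ids):
    """Build the follow-up categorylinks query, or None if nothing to ask."""
    if not page_ids:
        return None
    id_list = ",".join(str(int(page_id)) for page_id in page_ids)
    return ("SELECT cl_from,cl_to FROM categorylinks "
            "WHERE cl_type = 'subcat' AND cl_from IN (" + id_list + ") "
            "ORDER BY cl_from ASC,cl_to ASC LIMIT 200")


rows = [
    {"page_id": 8057514, "rc_title": "1910_photographs"},
    # Hypothetical category with no matching page: the LEFT JOIN yields NULL.
    {"page_id": None, "rc_title": "Category_without_a_page"},
    {"page_id": 93176596, "rc_title": "1926-03-27"},
]
print(subcat_query(page_ids_for_query(rows)))
```

  Filtering before the list is built keeps an empty '' item out of the 
integer IN list.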

TASK DETAIL
  https://phabricator.wikimedia.org/T260232

To: ArielGlenn
Cc: ArielGlenn, CBogen, Cparle, Umherirrender, DannyS712, Naike, WDoranWMF, 
Krinkle, aaron, Reedy, Ladsgroup, Aklapper, Marostegui, XeroS_SkalibuR, 
Alter-paule, jannee_e, Beast1978, Un1tY, Akuckartz, eprodromou, Hook696, 
Adidsone1, darthmon_wmde, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Phukettaxigroup, Giuliamocci, Cpaulf30, Lahi, Gq86, 
Af420, Ramsey-WMF, Darkminds3113, Bsandipan, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, Jayprakash12345, Lunewa, QZanden, EBjune, merbst, 
LawExplorer, Vali.matei, Lewizho99, Maathavan, _jensen, rosalieper, Agabi10, 
Scott_WUaS, Pchelolo, Jonas, Xmlizer, Volker_E, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, GWicke, Dcljr, Dinoguy1000, 
Manybubbles, Mbch331, Rxy, Jay8g, Ltrlg


[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-09-07 Thread ArielGlenn
ArielGlenn added a comment.


  In T260232#6390706 <https://phabricator.wikimedia.org/T260232#6390706>, 
@gerritbot wrote:
  
  > Change 620775 had a related patch set uploaded (by ArielGlenn; owner: 
ArielGlenn):
  > [mediawiki/core@master] don't include null page ids in query list for 
category dumps
  >
  > https://gerrit.wikimedia.org/r/620775
  
  I have tested the above patch by doing a manual run of the cron script on 
snapshot1008 as the dumpsgen user:
  
dumpsgen@snapshot1008:~$ /usr/local/bin/dumpcategoriesrdf.sh --config 
/etc/dumps/confs/wikidump.conf.other --list 
/srv/mediawiki/dblists/categories-rdf.dblist
  
  It completed in a little under 4 hours for all wikis. What is needed to get 
the patch merged?

TASK DETAIL
  https://phabricator.wikimedia.org/T260232

To: ArielGlenn
Cc: ArielGlenn, CBogen, Cparle, Umherirrender, DannyS712, Naike, WDoranWMF, 
Krinkle, aaron, Reedy, Ladsgroup, Aklapper, Marostegui, XeroS_SkalibuR, 
Alter-paule, jannee_e, Beast1978, Un1tY, Akuckartz, eprodromou, Hook696, 
Adidsone1, darthmon_wmde, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Phukettaxigroup, Giuliamocci, Cpaulf30, Lahi, Gq86, 
Af420, Ramsey-WMF, Darkminds3113, Bsandipan, Lucas_Werkmeister_WMDE, 
GoranSMilovanovic, Jayprakash12345, Lunewa, QZanden, EBjune, merbst, 
LawExplorer, Vali.matei, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, 
Jonas, Xmlizer, Volker_E, gnosygnu, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, GWicke, Dcljr, Dinoguy1000, Manybubbles, Mbch331, Rxy, Jay8g, Ltrlg


[Wikidata-bugs] [Maniphest] T262187: Wikidata entity dumps didn't start this week

2020-09-07 Thread ArielGlenn
ArielGlenn created this task.
ArielGlenn added projects: Wikidata, Dumps-Generation.

TASK DESCRIPTION
  This change: P12492 <https://phabricator.wikimedia.org/P12492> left the dump 
db group empty, and so any attempts to run wikidata entity dumps failed. The 
host was added back in early on September 7. The dumps for this week should be 
restarted; you'll want to coordinate this with the deployment of 
https://gerrit.wikimedia.org/r/c/operations/puppet/+/622342 which should be 
deployed when no jobs are running.
  
  Wikidata entity dumps use the flag --dbgroupdefault; it would be a good idea 
for that flag to permit fallback to use of any host in the special case that 
the requested group is empty.
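
  A minimal sketch of the suggested fallback, in Python. This is only an 
illustration of the proposal, not how --dbgroupdefault is implemented; the 
function name `pick_db_host`, the group layout and the host names are all 
hypothetical.

```python
def pick_db_host(groups, requested_group, all_hosts):
    """Prefer a host from the requested group; fall back to any host if it is empty."""
    candidates = groups.get(requested_group) or all_hosts
    if not candidates:
        raise LookupError("no database hosts available")
    return candidates[0]


# Hypothetical configuration in which the "dump" group was left empty.
groups = {"dump": [], "vslow": ["db-b"]}
all_hosts = ["db-a", "db-b"]

print(pick_db_host(groups, "dump", all_hosts))
print(pick_db_host(groups, "vslow", all_hosts))
```

  With this behaviour an empty group would slow the run down or shift load 
rather than make every attempt fail.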

TASK DETAIL
  https://phabricator.wikimedia.org/T262187

To: ArielGlenn
Cc: RKemper, ArielGlenn, jannee_e, Akuckartz, darthmon_wmde, Nandana, Lahi, 
Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Mbch331


[Wikidata-bugs] [Maniphest] T261204: Wikidata lexeme ttl dumps should be in a "predictable" folder

2020-09-01 Thread ArielGlenn
ArielGlenn added a comment.


  I think we can just move this through and keep our eyes on it.

TASK DETAIL
  https://phabricator.wikimedia.org/T261204

To: ArielGlenn
Cc: ArielGlenn, dcausse, Alter-paule, jannee_e, Beast1978, CBogen, Un1tY, 
Akuckartz, Hook696, darthmon_wmde, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, Lunewa, QZanden, EBjune, merbst, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, gnosygnu, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
Manybubbles, Mbch331


[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-08-17 Thread ArielGlenn
ArielGlenn added a comment.


  I took the brute force approach of writing all queries to a log file by 
adding the appropriate fopen/fputs/fclose in Database::select (live on 
snapshot1010, testbed host). I then ran:
  
dumpsgen@snapshot1010:/srv/mediawiki$ /usr/bin/php7.2 
/srv/mediawiki/multiversion/MWScript.php maintenance/categoryChangesAsRdf.php 
--wiki=commonswiki -s 2020081521 -e 20200817050001  | gzip > 
/srv/tmp/categories-out.gz
  
  I examined the output and found numerous examples of queries with the ' ' 
string in them (without the space).
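
  For anyone repeating this, a query log like that could be scanned with a 
short Python sketch such as the one below. The regular expression is a 
heuristic assumption (a bare '' standing alone as a list item), the function 
name is hypothetical, and the sample lines are abbreviated stand-ins for 
logged queries.

```python
import re

# Heuristic: an empty string literal standing alone as a list item,
# i.e. '' directly between "(" or "," and "," or ")".
EMPTY_ITEM = re.compile(r"[(,]\s*''\s*[,)]")


def queries_with_empty_item(log_lines):
    """Return the logged queries whose value list contains a bare '' item."""
    return [line for line in log_lines if EMPTY_ITEM.search(line)]


# Abbreviated sample queries, for illustration only.
sample_log = [
    "SELECT cl_from,cl_to FROM `categorylinks` WHERE cl_type = 'subcat' "
    "AND cl_from IN (2255349,6788554,'',62189827,93056961)",
    "SELECT cl_from,cl_to FROM `categorylinks` WHERE cl_type = 'subcat' "
    "AND cl_from IN (16427435,77160905)",
]
for query in queries_with_empty_item(sample_log):
    print(query)
```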
  
  The following two queries were back-to-back, indicating that one was used to 
generate input for the next:
  
SELECT  page_id,cat_title AS 
`rc_title`,pp_propname,cat_pages,cat_subcats,cat_files  FROM `category` LEFT 
JOIN `page` ON ((page_title = cat_title) AND page_namespace = 14) LEFT JOIN 
`page_props` ON (pp_propname = 'hiddencat' AND (pp_page = page_id))   WHERE 
cat_title IN 
('Bridges_over_Kunar_River_(Pakistan)','People_of_the_University_of_Wyoming','University_of_Wyoming','Bus_routes_numbered_144','Churches_in_the_Roman_Catholic_Archdiocese_of_Benevento','August_2020_in_Cardiff','Cardiff_Coach_Station,_Sophia_Gardens','Bus_stations_in_Cardiff','Sophia_Gardens','Logos_of_companies_based_in_Mecklenburg-Vorpommern','Rameswaram','Media_needing_categories_as_of_18_March_2018','All_media_needing_categories_as_of_2018','Pages_with_local_object_coordinates_and_missing_SDC_coordinates','CC-BY-SA-4.0','Self-published_work','Photographs_by_LigaDue','Civitella_Marittima','Pages_with_maps','Scans_from_the_Internet_Archive','CC-PD-Mark','PD_US_Government','FEDLINK_-_United_States_Federal_Collection','Books_uploaded_by_Fæ','Files_with_no_machine-readable_author','Former_bus_lines_in_Budapest','Bus_lines_in_Budapest','Plzeň_1','Plzeň','Plzeň-City_District','Kaufland_Plzeň-Roudná','Epta_Piges_(Rhodes)','PD_US_expired','Books_in_the_Library_of_Congress','Trains_at_Inuyama_Yuen_Station','Inuyama_Yuen_Station','People_in_1910','2_men','OCR_detected_cover_page','1910_photographs','Iwakura_Station_(Aichi)','Unidentified_subjects_in_Japan','名古屋鉄道の画像','駅名板画像','Alumni_of_the_University_of_Wyoming','Lety_memorial','Cultural_buildings_in_Burgos','Iwateken_Kotsu','岩手県交通の画像','Piet_Retief,_Mpumalanga','Quality_images_missing_SDC_source_of_file','Quality_images_missing_SDC_copyright_status','Quality_images_missing_SDC_copyright_license','Quality_images_missing_SDC_inception','Media_requiring_renaming','Media_requiring_renaming_-_rationale_6','Bus_routes_numbered_148','Stained-glass_windows_in_Burgenland','Stained-glass_windows_in_Austria_by_district','Rust_(Burgenland)','PD_NASA','Tropical_Storm_Josephine_(2020)','Quality_images_missing_SDC_Commons_quality_assessment','PD-old-100-expired','Medical_Heritage_Library','Nominated_valued_image_candidates','Iwate_Kyūkō_Bus','バス画像','Bus_routes_numbered_149','Quality_images_missing_SDC_creator','Bus_routes_numbered_150','1926-03-27','Breda,_Netherlands','一関市の画像','Bus_routes_numbered_147','Hernán_Cortés','Augusto_Belvedere','Ichinoseki_Station','1926_photographs','Items_with_OTRS_permission_confirmed','Files_with_PermissionOTRS_template_but_without_P6305_SDC_statement','Stolpersteine_in_Oslo-Gamle','Images_uploaded_by_Donna_Gedenk','Pages_with_local_camera_coordinates_and_missing_SDC_coordinates','1926_photographs_of_the_United_States','Schools_in_Quebec_City','Railway_photographs_by_Geof_Sheppard','Photographs_by_Geof_Sheppard')

SELECT  cl_from,cl_to  FROM `categorylinks`WHERE cl_type = 'subcat' AND 
cl_from IN 
(16427435,77160905,29237265,5273988,93171207,8292833,49598671,48452708,73514884,73514913,93141746,73514933,73514942,5229557,65375295,89119256,49325694,2371050,11740061,71765819,2581799,12178689,16468547,3355416,92207293,56860321,45788180,4127763,47563334,102952,4314089,25108543,93119689,5062995,2255349,6788554,'',62189827,93056961)
   ORDER BY cl_from ASC,cl_to ASC LIMIT 200
  
  And lo and behold, when I run the first query, what do I get:
  

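+----------+----------------------------------------------+-------------+-----------+-------------+-----------+
| page_id  | rc_title                                     | pp_propname | cat_pages | cat_subcats | cat_files |
+----------+----------------------------------------------+-------------+-----------+-------------+-----------+
|  8057514 | 1910_photographs                             | NULL        |       509 |          12 |       497 |
| 93176596 | 1926-03-27                                   | NULL        |         3 |           0 |         3 |
... other normal-looking stuff ...
| 93137742 | Stained-glass_windows_in_Austria_by_district | NULL        |        98 |          98 |         0 |
| 24821400 | Stained-glass_windows_in_Burgenland          | N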

[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-08-14 Thread ArielGlenn
ArielGlenn added a comment.


  Just for completeness, on db2073 I also ran the original query with the 
crap entry. The show explain showed use of a filesort as above, and the 
execution time was... well, it's still going, 330 seconds in. I killed it.

TASK DETAIL
  https://phabricator.wikimedia.org/T260232

To: ArielGlenn
Cc: ArielGlenn, CBogen, Cparle, Umherirrender, DannyS712, Naike, WDoranWMF, 
Krinkle, aaron, Reedy, Ladsgroup, Aklapper, Marostegui, XeroS_SkalibuR, 
jannee_e, Akuckartz, Adidsone1, darthmon_wmde, holger.knust, EvanProdromou, 
Nandana, Namenlos314, Phukettaxigroup, Lahi, Gq86, Darkminds3113, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, Jayprakash12345, Lunewa, QZanden, 
EBjune, merbst, LawExplorer, Vali.matei, _jensen, rosalieper, Agabi10, 
Scott_WUaS, Pchelolo, Jonas, Xmlizer, Volker_E, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, GWicke, Dcljr, Dinoguy1000, 
Manybubbles, Mbch331, Rxy, Jay8g


[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-08-14 Thread ArielGlenn
ArielGlenn added a comment.


  I saw multiple queries with this string in them while camping on the 
production vslow and looking at the processlist.  I don't know how many of the 
queries have this issue.

TASK DETAIL
  https://phabricator.wikimedia.org/T260232

To: ArielGlenn
Cc: ArielGlenn, CBogen, Cparle, Umherirrender, DannyS712, Naike, WDoranWMF, 
Krinkle, aaron, Reedy, Ladsgroup, Aklapper, Marostegui, XeroS_SkalibuR, 
jannee_e, Akuckartz, Adidsone1, darthmon_wmde, holger.knust, EvanProdromou, 
Nandana, Namenlos314, Phukettaxigroup, Lahi, Gq86, Darkminds3113, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, Jayprakash12345, Lunewa, QZanden, 
EBjune, merbst, LawExplorer, Vali.matei, _jensen, rosalieper, Agabi10, 
Scott_WUaS, Pchelolo, Jonas, Xmlizer, Volker_E, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, GWicke, Dcljr, Dinoguy1000, 
Manybubbles, Mbch331, Rxy, Jay8g


[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-08-14 Thread ArielGlenn
ArielGlenn added a comment.


  When I ran the above query on db2073 (codfw dumps and vslow host) without the 
crap ' ' field in there, it returned in 0.00 seconds. Maybe the bad entries are 
a new development?

TASK DETAIL
  https://phabricator.wikimedia.org/T260232

To: ArielGlenn
Cc: ArielGlenn, CBogen, Cparle, Umherirrender, DannyS712, Naike, WDoranWMF, 
Krinkle, aaron, Reedy, Ladsgroup, Aklapper, Marostegui, XeroS_SkalibuR, 
jannee_e, Akuckartz, Adidsone1, darthmon_wmde, holger.knust, EvanProdromou, 
Nandana, Namenlos314, Phukettaxigroup, Lahi, Gq86, Darkminds3113, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, Jayprakash12345, Lunewa, QZanden, 
EBjune, merbst, LawExplorer, Vali.matei, _jensen, rosalieper, Agabi10, 
Scott_WUaS, Pchelolo, Jonas, Xmlizer, Volker_E, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, GWicke, Dcljr, Dinoguy1000, 
Manybubbles, Mbch331, Rxy, Jay8g


[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-08-14 Thread ArielGlenn
ArielGlenn added a comment.


SELECT /* BatchRowIterator::next  */  cl_from,cl_to  FROM `categorylinks`   
 WHERE cl_type = 'subcat' AND cl_from IN 
(92967652,234494,24559020,960551,3007520,76398273,6972234,363488,2257260,4157420,89319925,84920900,41797907,61421859,92055128,9221880,14562,26762776,33298380,65449552,3795363,66235719,42442426,89319828,27708617,2563533,66701920,22548996,108484,25232065,6846286,43665564,2257433,8811984,84203487,3837544,5324927,8645978,'',805218,1078394,81978764,391851)
   ORDER BY cl_from ASC,cl_to ASC LIMIT 200
  
  What is that empty thing in there? I see  ' ' in the list of ids.

TASK DETAIL
  https://phabricator.wikimedia.org/T260232

To: ArielGlenn
Cc: ArielGlenn, CBogen, Cparle, Umherirrender, DannyS712, Naike, WDoranWMF, 
Krinkle, aaron, Reedy, Ladsgroup, Aklapper, Marostegui, XeroS_SkalibuR, 
jannee_e, Akuckartz, Adidsone1, darthmon_wmde, holger.knust, EvanProdromou, 
Nandana, Namenlos314, Phukettaxigroup, Lahi, Gq86, Darkminds3113, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, Jayprakash12345, Lunewa, QZanden, 
EBjune, merbst, LawExplorer, Vali.matei, _jensen, rosalieper, Agabi10, 
Scott_WUaS, Pchelolo, Jonas, Xmlizer, Volker_E, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, GWicke, Dcljr, Dinoguy1000, 
Manybubbles, Mbch331, Rxy, Jay8g


[Wikidata-bugs] [Maniphest] T260232: BatchRowIterator slow query on commonswiki

2020-08-14 Thread ArielGlenn
ArielGlenn added a comment.


  Daily rdf dumps are probably broken until this is resolved, just an FYI for 
folks importing these for search purposes.

TASK DETAIL
  https://phabricator.wikimedia.org/T260232

To: ArielGlenn
Cc: ArielGlenn, CBogen, Cparle, Umherirrender, DannyS712, Naike, WDoranWMF, 
Krinkle, aaron, Reedy, Ladsgroup, Aklapper, Marostegui, XeroS_SkalibuR, 
jannee_e, Akuckartz, Adidsone1, darthmon_wmde, holger.knust, EvanProdromou, 
Nandana, Namenlos314, Phukettaxigroup, Lahi, Gq86, Darkminds3113, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, Jayprakash12345, Lunewa, QZanden, 
EBjune, merbst, LawExplorer, Vali.matei, _jensen, rosalieper, Agabi10, 
Scott_WUaS, Pchelolo, Jonas, Xmlizer, Volker_E, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, GWicke, Dcljr, Dinoguy1000, 
Manybubbles, Mbch331, Rxy, Jay8g


[Wikidata-bugs] [Maniphest] T257876: redirected Q & deleted P Not Consistant in the json dump and web front end

2020-07-15 Thread ArielGlenn
ArielGlenn added a project: Wikidata.

TASK DETAIL
  https://phabricator.wikimedia.org/T257876

To: ArielGlenn
Cc: Alicezou26, jannee_e, Akuckartz, darthmon_wmde, Nandana, Jony, Lahi, Gq86, 
NoohNaeem, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, Guy13949413, 
_jensen, rosalieper, Scott_WUaS, gnosygnu, mys_721tx, Wikidata-bugs, aude, 
Svick, Mbch331, Krenair, jeremyb


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-07-09 Thread ArielGlenn
ArielGlenn added a comment.


  Links latest-full.ttl.bz2 -> 20200116/commons-20200116-full.ttl.bz2 and 
latest-full.ttl.gz -> 20200116/commons-20200116-full.ttl.gz have been cleaned 
up. Thanks for the suggestion!

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

To: ArielGlenn
Cc: DD063520, D063520, CBogen, nettrom_WMF, Mahir256, dcausse, EBernhardson, 
Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Alter-paule, 
jannee_e, Beast1978, Un1tY, Akuckartz, Hook696, darthmon_wmde, Kent7301, 
joker88john, CucyNoiD, Nandana, Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, 
Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, Lunewa, QZanden, EBjune, 
merbst, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Taiwania_Justo, 
Scott_WUaS, Jonas, Xmlizer, Ixocactus, Wong128hk, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, 
Mbch331, Keegan


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-07-09 Thread ArielGlenn
ArielGlenn added a comment.


  It's linked off the 'other datasets' page near the top. But here's the direct 
link: https://dumps.wikimedia.org/other/wikibase/commonswiki/

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

To: ArielGlenn
Cc: DD063520, D063520, CBogen, nettrom_WMF, Mahir256, dcausse, EBernhardson, 
Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Alter-paule, 
jannee_e, Beast1978, Un1tY, Akuckartz, Hook696, darthmon_wmde, Kent7301, 
joker88john, CucyNoiD, Nandana, Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, 
Lahi, Gq86, Af420, Bsandipan, GoranSMilovanovic, Lunewa, QZanden, EBjune, 
merbst, LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Taiwania_Justo, 
Scott_WUaS, Jonas, Xmlizer, Ixocactus, Wong128hk, gnosygnu, jkroll, 
Wikidata-bugs, Jdouglas, aude, Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, 
Mbch331, Keegan


[Wikidata-bugs] [Maniphest] [Commented On] T226093: Capacity planning for Commons Structured Data

2020-07-07 Thread ArielGlenn
ArielGlenn added a comment.


  Updated. F31919691: commons_slots_new.png 
<https://phabricator.wikimedia.org/F31919691>

TASK DETAIL
  https://phabricator.wikimedia.org/T226093

To: ArielGlenn
Cc: Ladsgroup, Abit, matthiasmullie, Marostegui, Mholloway, Addshore, 
Ramsey-WMF, jcrespo, Yann, MarkTraceur, ArielGlenn, Aklapper, lmata, jannee_e, 
CBogen, Akuckartz, darthmon_wmde, Legado_Shulgin, Nandana, JKSTNK, 
Davinaclare77, Qtn1293, Techguru.pc, Lahi, PDrouin-WMF, Gq86, E1presidente, 
Cparle, Anooprao, SandraF_WMF, GoranSMilovanovic, Lunewa, Th3d3v1ls, Hfbn0, 
QZanden, Tramullas, Acer, LawExplorer, Salgo60, Zppix, Silverfish, _jensen, 
rosalieper, Scott_WUaS, Susannaanas, Wong128hk, gnosygnu, Jane023, 
Wikidata-bugs, Base, aude, Ricordisamoa, Wesalius, Lydia_Pintscher, 
Fabrice_Florin, Raymond, faidon, Steinsplitter, Mbch331, Rxy, Jay8g, fgiunchedi


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-05-27 Thread ArielGlenn
ArielGlenn added a comment.


  @dcausse what's your time frame?

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

To: ArielGlenn
Cc: nettrom_WMF, Mahir256, dcausse, EBernhardson, Cparle, Abit, Gehel, jleedev, 
hoo, ArielGlenn, WMDE-leszek, Poyekhali, Steinsplitter, Aklapper, 
Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, 
Lucas_Werkmeister_WMDE, Smalyshev, jannee_e, CBogen, darthmon_wmde, Nandana, 
Namenlos314, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, 
Ixocactus, Wong128hk, gnosygnu, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, Mbch331, Keegan


[Wikidata-bugs] [Maniphest] [Commented On] T238199: SpecialFewestRevisions::reallyDoQuery takes more than 9h to run

2020-05-27 Thread ArielGlenn
ArielGlenn added a comment.


  Unless folks want to keep it open to work on speeding it up in the future?

TASK DETAIL
  https://phabricator.wikimedia.org/T238199

To: ArielGlenn
Cc: SilentSpike, WMDE-leszek, ArielGlenn, Lea_Lacroix_WMDE, jcrespo, Addshore, 
Lydia_Pintscher, Aklapper, Ladsgroup, Marostegui, darthmon_wmde, Nandana, 
jijiki, Amorymeltzer, Imarlier, Lahi, Gq86, Lsherwinforone, GoranSMilovanovic, 
Jayprakash12345, QZanden, LawExplorer, Sethakill, elukey, _jensen, rosalieper, 
Scott_WUaS, Wong128hk, Wikidata-bugs, aude, Bawolff, He7d3r, Jdforrester-WMF, 
Mbch331, Rxy, Jay8g, akosiaris


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-05-16 Thread ArielGlenn
ArielGlenn added a comment.


  I see that we're no longer blocked.  Does this mean that we're good to go for 
weekly runs?

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

To: ArielGlenn
Cc: nettrom_WMF, Mahir256, dcausse, EBernhardson, Cparle, Abit, Gehel, jleedev, 
hoo, ArielGlenn, WMDE-leszek, Poyekhali, Steinsplitter, Aklapper, 
Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, 
Lucas_Werkmeister_WMDE, Smalyshev, jannee_e, CBogen, darthmon_wmde, Nandana, 
Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, EBjune, merbst, LawExplorer, 
_jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, Ixocactus, 
Wong128hk, gnosygnu, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
El_Grafo, Dinoguy1000, Manybubbles, Mbch331, Keegan


[Wikidata-bugs] [Maniphest] [Commented On] T238199: SpecialFewestRevisions::reallyDoQuery takes more than 9h to run

2020-05-13 Thread ArielGlenn
ArielGlenn added a comment.


  In T238199#6135018 <https://phabricator.wikimedia.org/T238199#6135018>, 
@Ladsgroup wrote:
  
  > 
  
  ...
  
  > Anyway, Lydia said it's fine to do it tomorrow when it gets announced by 
our communication manager. Does that work for you?
  
  Anything's fine as long as there's a plan of some sort before next month, so, 
sure!

TASK DETAIL
  https://phabricator.wikimedia.org/T238199

To: ArielGlenn
Cc: WMDE-leszek, ArielGlenn, Lea_Lacroix_WMDE, jcrespo, Addshore, 
Lydia_Pintscher, Aklapper, Ladsgroup, Marostegui, Blissjay007, Oblanco79, 
Alter-paule, Beast1978, Un1tY, Hook696, Daryl-TTMG, RomaAmorRoma, E.S.A-Sheild, 
darthmon_wmde, Kent7301, Meekrab2012, joker88john, CucyNoiD, Nandana, 
NebulousIris, jijiki, Gaboe420, Versusxo, Majesticalreaper22, Amorymeltzer, 
Giuliamocci, Adrian1985, Cpaulf30, Imarlier, Lahi, Gq86, Af420, Darkminds3113, 
Bsandipan, Lordiis, Lsherwinforone, GoranSMilovanovic, Adik2382, 
Jayprakash12345, Th3d3v1ls, Ramalepe, Liugev6, QZanden, LawExplorer, Sethakill, 
WSH1906, Lewizho99, Maathavan, elukey, _jensen, rosalieper, Scott_WUaS, 
Wong128hk, Wikidata-bugs, aude, Bawolff, He7d3r, Jdforrester-WMF, Mbch331, Rxy, 
Jay8g, akosiaris


[Wikidata-bugs] [Maniphest] [Commented On] T238199: SpecialFewestRevisions::reallyDoQuery takes more than 9h to run

2020-05-13 Thread ArielGlenn
ArielGlenn added a comment.


  Can we do this temporarily while the query is being fixed up? It looks like 
it had to be killed in Nov, Feb, Apr, May, so I'd rather temp disable than 
require folks to shoot it (and anything else hung as a side effect).

TASK DETAIL
  https://phabricator.wikimedia.org/T238199

To: ArielGlenn
Cc: ArielGlenn, Lea_Lacroix_WMDE, jcrespo, Addshore, Lydia_Pintscher, Aklapper, 
Ladsgroup, Marostegui, Blissjay007, Oblanco79, Alter-paule, Beast1978, Un1tY, 
Hook696, Daryl-TTMG, RomaAmorRoma, E.S.A-Sheild, darthmon_wmde, Kent7301, 
Meekrab2012, joker88john, CucyNoiD, Nandana, NebulousIris, jijiki, Gaboe420, 
Versusxo, Majesticalreaper22, Amorymeltzer, Giuliamocci, Adrian1985, Cpaulf30, 
Imarlier, Lahi, Gq86, Af420, Darkminds3113, Bsandipan, Lordiis, Lsherwinforone, 
GoranSMilovanovic, Adik2382, Jayprakash12345, Th3d3v1ls, Ramalepe, Liugev6, 
QZanden, LawExplorer, Sethakill, WSH1906, Lewizho99, Maathavan, elukey, 
_jensen, rosalieper, Scott_WUaS, Wong128hk, Wikidata-bugs, aude, Bawolff, 
He7d3r, Jdforrester-WMF, Mbch331, Rxy, Jay8g, akosiaris


[Wikidata-bugs] [Maniphest] [Commented On] T238199: SpecialFewestRevisions::reallyDoQuery takes more than 9h to run

2020-05-13 Thread ArielGlenn
ArielGlenn added a comment.


  Can we just skip the updateSpecialPages.php wikidatawiki --override 
--only=Fewestrevisions  script altogether, instead of shooting it every month?

TASK DETAIL
  https://phabricator.wikimedia.org/T238199

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, Lea_Lacroix_WMDE, jcrespo, Addshore, Lydia_Pintscher, Aklapper, 
Ladsgroup, Marostegui, darthmon_wmde, Nandana, jijiki, Amorymeltzer, Imarlier, 
Lahi, Gq86, Lsherwinforone, GoranSMilovanovic, Jayprakash12345, QZanden, 
LawExplorer, Sethakill, elukey, _jensen, rosalieper, Scott_WUaS, Wong128hk, 
Wikidata-bugs, aude, Bawolff, He7d3r, Jdforrester-WMF, Mbch331, Rxy, Jay8g, 
akosiaris
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T252632: Restart wikidata entity dumps

2020-05-13 Thread ArielGlenn
ArielGlenn added a comment.


  As I understand it, the long-running query comes from a monthly cron job.

TASK DETAIL
  https://phabricator.wikimedia.org/T252632

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: hoo, ArielGlenn, jannee_e, darthmon_wmde, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Created] T252632: Restart wikidata entity dumps

2020-05-13 Thread ArielGlenn
ArielGlenn created this task.
ArielGlenn added projects: Dumps-Generation, Wikidata.

TASK DESCRIPTION
  The weekly run was shot this morning when vslow db connections stalled due to 
an unrelated long-running query, see T238199 
<https://phabricator.wikimedia.org/T238199>
  
  It can be restarted from wherever it died.
  
  Note that we could face this same issue again in the future, as the 
underlying problem with that slow query is not resolved.

TASK DETAIL
  https://phabricator.wikimedia.org/T252632

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: hoo, ArielGlenn, jannee_e, darthmon_wmde, Nandana, Lahi, Gq86, 
GoranSMilovanovic, Lunewa, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, gnosygnu, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-04-21 Thread ArielGlenn
ArielGlenn added a comment.


  Hi, just checking in: any progress on investigating the 'extra' dumps content?

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: nettrom_WMF, Mahir256, dcausse, EBernhardson, Cparle, Abit, Gehel, jleedev, 
hoo, ArielGlenn, WMDE-leszek, Poyekhali, Steinsplitter, Aklapper, 
Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, 
Lucas_Werkmeister_WMDE, Smalyshev, jannee_e, CBogen, darthmon_wmde, Nandana, 
Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, EBjune, merbst, LawExplorer, 
_jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, Ixocactus, 
Wong128hk, gnosygnu, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, 
El_Grafo, Dinoguy1000, Manybubbles, Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T248857: Wikdata entities dump not generated

2020-03-30 Thread ArielGlenn
ArielGlenn added subscribers: hoo, ArielGlenn.
ArielGlenn added a comment.


  See T248612 <https://phabricator.wikimedia.org/T248612> for that, I believe 
@hoo is planning to deploy and restart the week's run today.

TASK DETAIL
  https://phabricator.wikimedia.org/T248857

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, hoo, JAllemandou, dcausse, Aklapper, jannee_e, CBogen, 
darthmon_wmde, Nandana, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, EBjune, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, gnosygnu, Wikidata-bugs, aude, 
Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-02-12 Thread ArielGlenn
ArielGlenn added a comment.


  @Cparle, No blocks on your side, the ball is now in @dcausse 's court. :-)

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: nettrom_WMF, Mahir256, dcausse, EBernhardson, Cparle, Abit, Gehel, jleedev, 
hoo, ArielGlenn, WMDE-leszek, Poyekhali, Steinsplitter, Aklapper, 
Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, 
Lucas_Werkmeister_WMDE, Smalyshev, darthmon_wmde, Nandana, JKSTNK, Lahi, 
PDrouin-WMF, Gq86, E1presidente, Anooprao, SandraF_WMF, GoranSMilovanovic, 
Lunewa, QZanden, EBjune, Tramullas, Acer, merbst, LawExplorer, Silverfish, 
_jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, Susannaanas, 
Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, Jdouglas, Base, 
matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, 
Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, Mbch331, 
Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T238972: switch xml/sql (and adds-changes) dumps to use 0.11 schema with content from multiple slots

2020-02-11 Thread ArielGlenn
ArielGlenn added a comment.


  This is pending https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/556346/ 
and related patches, so we're looking at March 1 if all goes well.

TASK DETAIL
  https://phabricator.wikimedia.org/T238972

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Christian75, Schnark, binbot, Johan, Lucas_Werkmeister_WMDE, RhinosF1, 
Benjavalero, hoo, leila, ArielGlenn, darthmon_wmde, Viztor, Nandana, 
Amorymeltzer, Lahi, Gq86, GoranSMilovanovic, Lunewa, QZanden, LawExplorer, 
Avner, JJMC89, _jensen, rosalieper, Scott_WUaS, Luke081515, gnosygnu, 
Wikidata-bugs, aude, Capt_Swing, TheDJ, Mbch331, Rxy, Jay8g
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T243701: Wikidata maxlag repeatedly over 5s since Jan20, 2020 (primarily caused by the query service)

2020-02-06 Thread ArielGlenn
ArielGlenn added a comment.


  In T243701#5855352 <https://phabricator.wikimedia.org/T243701#5855352>, 
@Lea_Lacroix_WMDE wrote:
  
  > Over the past weeks, we noticed a huge increase of content in Wikidata. 
Maybe that's something worth looking at?
  
  Wikidata content is growing at a fast and steady pace and has been for a few 
years now. For the last few months it's been expanding at a rate of around 
3,500,000 new pages per month. So that seems unlikely to be connected.
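  
  For a sense of scale, the monthly figure above can be converted to daily and 
per-second rates. This is only a rough sketch: the 30-day month is an 
assumption, and the 3,500,000 figure is the approximate one quoted above.

```python
# Rough conversion of the quoted growth rate into smaller units.
PAGES_PER_MONTH = 3_500_000   # approximate figure from the comment above
DAYS_PER_MONTH = 30           # assumption: a 30-day month
SECONDS_PER_DAY = 86_400

per_day = PAGES_PER_MONTH / DAYS_PER_MONTH
per_second = per_day / SECONDS_PER_DAY

print(f"{per_day:,.0f} pages/day, {per_second:.2f} pages/second")
```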

TASK DETAIL
  https://phabricator.wikimedia.org/T243701

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, Ladsgroup, Alicia_Fagerving_WMSE, JeanFred, Pasleim, Gehel, 
Lea_Lacroix_WMDE, ArthurPSmith, Albertvillanovadelmoral, Xqt, 
Lucas_Werkmeister_WMDE, Addshore, jcrespo, Dvorapa, Aklapper, Strainu, 
darthmon_wmde, ET4Eva, Legado_Shulgin, Nandana, Davinaclare77, Qtn1293, 
Techguru.pc, Lahi, Gq86, Darkminds3113, GoranSMilovanovic, Th3d3v1ls, Hfbn0, 
QZanden, EBjune, merbst, LawExplorer, Vali.matei, Avner, Zppix, _jensen, 
rosalieper, Scott_WUaS, Jonas, FloNight, Xmlizer, Volker_E, Wong128hk, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, GWicke, Dinoguy1000, 
Manybubbles, Lydia_Pintscher, faidon, Mbch331, Rxy, Jay8g, fgiunchedi
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T221917: Create RDF dump of structured data on Commons

2020-01-22 Thread ArielGlenn
ArielGlenn added a comment.


  Some unexpected (?) triples are popping up, and @dcausse is looking into 
them, so the dumps will not be turned on in cron until we have the thumbs up 
on that. See T243292 <https://phabricator.wikimedia.org/T243292>
  
  If it turns out the data is all ok, we can move forward.

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Mahir256, dcausse, EBernhardson, Cparle, Abit, Gehel, jleedev, hoo, 
ArielGlenn, WMDE-leszek, Poyekhali, Steinsplitter, Aklapper, Lydia_Pintscher, 
Bugreporter, Tgr, Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, 
Lucas_Werkmeister_WMDE, Smalyshev, darthmon_wmde, Nandana, JKSTNK, Lahi, 
PDrouin-WMF, Gq86, E1presidente, Anooprao, SandraF_WMF, GoranSMilovanovic, 
Lunewa, QZanden, EBjune, Tramullas, Acer, merbst, LawExplorer, Silverfish, 
_jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, Susannaanas, 
Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, Jdouglas, Base, 
matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, 
Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, Mbch331, 
Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T243292: Fix the munger to support commons RDF dump

2020-01-22 Thread ArielGlenn
ArielGlenn added a parent task: T221917: Create RDF dump of structured data on 
Commons.

TASK DETAIL
  https://phabricator.wikimedia.org/T243292

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: ArielGlenn, Aklapper, dcausse, darthmon_wmde, Nandana, Lahi, Gq86, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, 
Smalyshev, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Updated] T221917: Create RDF dump of structured data on Commons

2020-01-22 Thread ArielGlenn
ArielGlenn added a subtask: T243292: Fix the munger to support commons RDF dump.

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Mahir256, dcausse, EBernhardson, Cparle, Abit, Gehel, jleedev, hoo, 
ArielGlenn, WMDE-leszek, Poyekhali, Steinsplitter, Aklapper, Lydia_Pintscher, 
Bugreporter, Tgr, Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, 
Lucas_Werkmeister_WMDE, Smalyshev, darthmon_wmde, Nandana, JKSTNK, Lahi, 
PDrouin-WMF, Gq86, E1presidente, Anooprao, SandraF_WMF, GoranSMilovanovic, 
Lunewa, QZanden, EBjune, Tramullas, Acer, merbst, LawExplorer, Silverfish, 
_jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, Susannaanas, 
Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, Jdouglas, Base, 
matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, 
Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, Mbch331, 
Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Changed Subscribers] T221917: Create RDF dump of structured data on Commons

2020-01-16 Thread ArielGlenn
ArielGlenn added a subscriber: dcausse.
ArielGlenn added a comment.


  @dcausse is going to check over the ttl dump and let me know if it looks ok; 
if so then I'll flip the switch for generation weekly and make sure there's 
cleanup too.

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: dcausse, EBernhardson, Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, 
WMDE-leszek, Poyekhali, Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, 
Tgr, Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, 
Smalyshev, darthmon_wmde, Nandana, JKSTNK, Lahi, PDrouin-WMF, Gq86, 
E1presidente, Anooprao, SandraF_WMF, GoranSMilovanovic, Lunewa, QZanden, 
EBjune, Tramullas, Acer, merbst, LawExplorer, Silverfish, _jensen, rosalieper, 
Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, Susannaanas, Ixocactus, Wong128hk, 
gnosygnu, Jane023, jkroll, Wikidata-bugs, Jdouglas, Base, matthiasmullie, aude, 
Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, Ricordisamoa, Wesalius, 
Fabrice_Florin, Raymond, Jdforrester-WMF, Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-16 Thread ArielGlenn
ArielGlenn added a comment.


  In https://dumps.wikimedia.org/other/wikibase/commonswiki/ there are two ttl 
files, gz and bz2 compressed. Please have a look!
  
  The bash script producing them complained that
  
/usr/local/bin/dumpwikibaserdf.sh: line 224: setDcatConfig: command not 
found
  
  so I'm looking at that.

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: EBernhardson, Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, 
Poyekhali, Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, 
Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, 
darthmon_wmde, Nandana, JKSTNK, Lahi, PDrouin-WMF, Gq86, E1presidente, 
Anooprao, SandraF_WMF, GoranSMilovanovic, Lunewa, QZanden, EBjune, Tramullas, 
Acer, merbst, LawExplorer, Silverfish, _jensen, rosalieper, Taiwania_Justo, 
Scott_WUaS, Jonas, Xmlizer, Susannaanas, Ixocactus, Wong128hk, gnosygnu, 
Jane023, jkroll, Wikidata-bugs, Jdouglas, Base, matthiasmullie, aude, 
Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, Ricordisamoa, Wesalius, 
Fabrice_Florin, Raymond, Jdforrester-WMF, Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-16 Thread ArielGlenn
ArielGlenn added a comment.


  I found a ticket that mentions use of ttl files so I'll run
  
/usr/local/bin/dumpwikibaserdf.sh commons full ttl
  
  and keep an eye on it. Running on snapshot1008 in a screen session. Here we 
go!

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: EBernhardson, Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, 
Poyekhali, Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, 
Ramsey-WMF, Jarekt, Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, 
darthmon_wmde, Nandana, JKSTNK, Lahi, PDrouin-WMF, Gq86, E1presidente, 
Anooprao, SandraF_WMF, GoranSMilovanovic, Lunewa, QZanden, EBjune, Tramullas, 
Acer, merbst, LawExplorer, Silverfish, _jensen, rosalieper, Taiwania_Justo, 
Scott_WUaS, Jonas, Xmlizer, Susannaanas, Ixocactus, Wong128hk, gnosygnu, 
Jane023, jkroll, Wikidata-bugs, Jdouglas, Base, matthiasmullie, aude, 
Tobias1984, El_Grafo, Dinoguy1000, Manybubbles, Ricordisamoa, Wesalius, 
Fabrice_Florin, Raymond, Jdforrester-WMF, Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-13 Thread ArielGlenn
ArielGlenn added a comment.


  I plan to try running
  
/usr/local/bin/dumpwikibaserdf.sh commons full nt
  
  on Thursday morning and see how long it takes with the 8 shards that are 
currently configured. @Abit, is the nt format the one needed for WDQS testing?

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Hook696, Daryl-TTMG, 
RomaAmorRoma, 0010318400, E.S.A-Sheild, darthmon_wmde, Meekrab2012, 
joker88john, CucyNoiD, Nandana, NebulousIris, JKSTNK, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, PDrouin-WMF, Gq86, 
Af420, E1presidente, Darkminds3113, Anooprao, SandraF_WMF, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Lunewa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, 
EBjune, Tramullas, Acer, merbst, LawExplorer, WSH1906, Lewizho99, Maathavan, 
Silverfish, _jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, 
Susannaanas, Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, 
Jdouglas, Base, matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, 
Manybubbles, Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, 
Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-13 Thread ArielGlenn
ArielGlenn added a comment.


  Ran
  
php /srv/mediawiki/multiversion/MWScript.php 
extensions/Wikibase/repo/maintenance/dumpRdf.php --wiki commonswiki 
--batch-size 500 --format nt --flavor full-dump --entity-type mediainfo 
--no-cache --dbgroupdefault dump  --ignore-missing --first-page-id 78846320 
--last-page-id 79046320 --shard 0 --sharding-factor 1  
2>/var/lib/dumpsgen/mediainfo-log-small-shard-oom.txt | gzip > 
/mnt/dumpsdata/temp/dumpsgen/mediainfo-dumps-test-nt-one-shard-small-oom.gz
  
  which should cover the page range where we had the oom; it ran to completion 
fine. I guess that there is some small memory leak that must accumulate over 
batches, which is what did us in earlier. As long as we limit runs to some 
reasonable number of pages each time, we should be fine.
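  
  One way to limit each run to a reasonable number of pages is to split the 
overall page id span into consecutive ranges and start a fresh process for 
each. The sketch below is illustrative only: the helper name page_ranges and 
the chunk size of 50000 are made up for the example, and it assumes 
--last-page-id is treated as inclusive, which is not confirmed here.

```python
def page_ranges(first_page_id, last_page_id, pages_per_run):
    """Yield inclusive (first, last) page id ranges covering the whole span."""
    start = first_page_id
    while start <= last_page_id:
        end = min(start + pages_per_run - 1, last_page_id)
        yield start, end
        start = end + 1


# Print the range options for each run over the span tested above.
# The chunk size is a hypothetical value, not a tuned recommendation.
for first, last in page_ranges(78846320, 79046320, 50000):
    print(f"--first-page-id {first} --last-page-id {last}")
```

  Since each range gets its own php process, any memory that accumulates over 
batches is released when that process exits.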

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Hook696, Daryl-TTMG, 
RomaAmorRoma, 0010318400, E.S.A-Sheild, darthmon_wmde, Meekrab2012, 
joker88john, CucyNoiD, Nandana, NebulousIris, JKSTNK, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, PDrouin-WMF, Gq86, 
Af420, E1presidente, Darkminds3113, Anooprao, SandraF_WMF, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Lunewa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, 
EBjune, Tramullas, Acer, merbst, LawExplorer, WSH1906, Lewizho99, Maathavan, 
Silverfish, _jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, 
Susannaanas, Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, 
Jdouglas, Base, matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, 
Manybubbles, Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, 
Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-13 Thread ArielGlenn
ArielGlenn added a comment.


  Ran
  
php /srv/mediawiki/multiversion/MWScript.php 
extensions/Wikibase/repo/maintenance/dumpRdf.php --wiki commonswiki 
--batch-size 1000 --format nt --flavor full-dump --entity-type mediainfo 
--no-cache --dbgroupdefault dump  --ignore-missing --first-page-id 1 
--last-page-id 21 --shard 1 --sharding-factor 4  
2>/var/lib/dumpsgen/mediainfo-log-small-shard.txt | gzip > 
/mnt/dumpsdata/temp/dumpsgen/mediainfo-dumps-test-nt-one-shard-of-4-small.gz
  
  and it also ran fine.

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Hook696, Daryl-TTMG, 
RomaAmorRoma, 0010318400, E.S.A-Sheild, darthmon_wmde, Meekrab2012, 
joker88john, CucyNoiD, Nandana, NebulousIris, JKSTNK, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, PDrouin-WMF, Gq86, 
Af420, E1presidente, Darkminds3113, Anooprao, SandraF_WMF, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Lunewa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, 
EBjune, Tramullas, Acer, merbst, LawExplorer, WSH1906, Lewizho99, Maathavan, 
Silverfish, _jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, 
Susannaanas, Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, 
Jdouglas, Base, matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, 
Manybubbles, Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, 
Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-13 Thread ArielGlenn
ArielGlenn added a comment.


  Note to self that a run of
  
php /srv/mediawiki/multiversion/MWScript.php 
extensions/Wikibase/repo/maintenance/dumpRdf.php --wiki commonswiki 
--batch-size 250 --format nt --flavor full-dump --entity-type mediainfo 
--no-cache --dbgroupdefault dump  --ignore-missing --first-page-id 1 
--last-page-id 21 --shard 0 --sharding-factor 1  
2>/var/lib/dumpsgen/mediainfo-log-small.txt | gzip > 
/mnt/dumpsdata/temp/dumpsgen/mediainfo-dumps-test-nt-noshard-small.gz
  
  worked fine. Going to run one with a sharding factor of 4 and a batch size 4 
times larger, to see how that is.

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Hook696, Daryl-TTMG, 
RomaAmorRoma, 0010318400, E.S.A-Sheild, darthmon_wmde, Meekrab2012, 
joker88john, CucyNoiD, Nandana, NebulousIris, JKSTNK, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, PDrouin-WMF, Gq86, 
Af420, E1presidente, Darkminds3113, Anooprao, SandraF_WMF, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Lunewa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, 
EBjune, Tramullas, Acer, merbst, LawExplorer, WSH1906, Lewizho99, Maathavan, 
Silverfish, _jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, 
Susannaanas, Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, 
Jdouglas, Base, matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, 
Manybubbles, Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, 
Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-13 Thread ArielGlenn
ArielGlenn added a comment.


  This morning the job was terminated by the oom killer:
  
[4288057.417443] Out of memory: Kill process 117265 (php) score 868 or 
sacrifice child
[4288057.425084] Killed process 117265 (php) total-vm:58241128kB, 
anon-rss:56901636kB, file-rss:0kB, shmem-rss:0kB
  
  It produced a file of size 380M with 2224612 entities in it before being 
shot. One of the last entries in it is the page File:Gerrardina_foliosa_1.jpg 
with page id 78 846 520 and mediainfo entity (Depicts) added on Jan 10th, 2020. 
Also the gz output file is not truncated, so perhaps it is complete. @Abit, 
should I move it somewhere for folks to test with?
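  
  To put the oom killer numbers in perspective, the "Killed process" line 
quoted above can be parsed and its counters converted from kB to GiB. This is 
a minimal sketch; the function name parse_oom_kill is made up for the example, 
and the regular expressions are written only against the line shown here, not 
against every kernel log format.

```python
import re

# The two wrapped kernel log lines quoted above, joined back into one line.
LOG_LINE = (
    "[4288057.425084] Killed process 117265 (php) total-vm:58241128kB, "
    "anon-rss:56901636kB, file-rss:0kB, shmem-rss:0kB"
)


def parse_oom_kill(line):
    """Return pid, process name and memory counters (in kB) from a 'Killed process' line."""
    head = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if head is None:
        return None
    # Collect every "name:<digits>kB" counter, e.g. anon-rss:56901636kB.
    counters = {name: int(value) for name, value in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(head.group(1)), "name": head.group(2), **counters}


info = parse_oom_kill(LOG_LINE)
rss_gib = info["anon-rss"] / 1024 / 1024  # kB -> GiB
print(f"{info['name']} (pid {info['pid']}) was using about {rss_gib:.1f} GiB of anonymous memory")
```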

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Hook696, Daryl-TTMG, 
RomaAmorRoma, 0010318400, E.S.A-Sheild, darthmon_wmde, Meekrab2012, 
joker88john, CucyNoiD, Nandana, NebulousIris, JKSTNK, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, PDrouin-WMF, Gq86, 
Af420, E1presidente, Darkminds3113, Anooprao, SandraF_WMF, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Lunewa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, 
EBjune, Tramullas, Acer, merbst, LawExplorer, WSH1906, Lewizho99, Maathavan, 
Silverfish, _jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, 
Susannaanas, Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, 
Jdouglas, Base, matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, 
Manybubbles, Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, 
Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-10 Thread ArielGlenn
ArielGlenn added a comment.


  A batchsize of 50k turned out to be too large. Same with 5k. I'm now running 
with a batchsize of 500, which will surely be too small, but at least I am 
getting output. I'll check on it tomorrow and see how it's doing.

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Hook696, Daryl-TTMG, 
RomaAmorRoma, 0010318400, E.S.A-Sheild, darthmon_wmde, Meekrab2012, 
joker88john, CucyNoiD, Nandana, NebulousIris, JKSTNK, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, PDrouin-WMF, Gq86, 
Af420, E1presidente, Darkminds3113, Anooprao, SandraF_WMF, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Lunewa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, 
EBjune, Tramullas, Acer, merbst, LawExplorer, WSH1906, Lewizho99, Maathavan, 
Silverfish, _jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, 
Susannaanas, Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, 
Jdouglas, Base, matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, 
Manybubbles, Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, 
Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T221917: Create RDF dump of structured data on Commons

2020-01-10 Thread ArielGlenn
ArielGlenn added a comment.


  Because I've gotten a nice run in beta with the --ignore-missing flag, I'm 
trying a test run on snapshot1008 in a screen session:
  
php /srv/mediawiki/multiversion/MWScript.php 
extensions/Wikibase/repo/maintenance/dumpRdf.php --wiki commonswiki 
--batch-size 5 --format nt --flavor full-dump --entity-type mediainfo 
--no-cache --dbgroupdefault dump  --ignore-missing 
2>>/var/lib/dumpsgen/mediainfo-log.txt | gzip > 
/mnt/dumpsdata/temp/dumpsgen/mediainfo-dumps-test-nt-noshard.gz
  
  If the output looks good, I'll put it somewhere for WDQS testing and move 
forward with making these weekly runs with the appropriate number of parallel 
processes.

TASK DETAIL
  https://phabricator.wikimedia.org/T221917

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: ArielGlenn
Cc: Cparle, Abit, Gehel, jleedev, hoo, ArielGlenn, WMDE-leszek, Poyekhali, 
Steinsplitter, Aklapper, Lydia_Pintscher, Bugreporter, Tgr, Ramsey-WMF, Jarekt, 
Addshore, Tpt, Salgo60, Lucas_Werkmeister_WMDE, Smalyshev, Hook696, Daryl-TTMG, 
RomaAmorRoma, 0010318400, E.S.A-Sheild, darthmon_wmde, Meekrab2012, 
joker88john, CucyNoiD, Nandana, NebulousIris, JKSTNK, Gaboe420, Versusxo, 
Majesticalreaper22, Giuliamocci, Adrian1985, Cpaulf30, Lahi, PDrouin-WMF, Gq86, 
Af420, E1presidente, Darkminds3113, Anooprao, SandraF_WMF, Bsandipan, Lordiis, 
GoranSMilovanovic, Adik2382, Lunewa, Th3d3v1ls, Ramalepe, Liugev6, QZanden, 
EBjune, Tramullas, Acer, merbst, LawExplorer, WSH1906, Lewizho99, Maathavan, 
Silverfish, _jensen, rosalieper, Taiwania_Justo, Scott_WUaS, Jonas, Xmlizer, 
Susannaanas, Ixocactus, Wong128hk, gnosygnu, Jane023, jkroll, Wikidata-bugs, 
Jdouglas, Base, matthiasmullie, aude, Tobias1984, El_Grafo, Dinoguy1000, 
Manybubbles, Ricordisamoa, Wesalius, Fabrice_Florin, Raymond, Jdforrester-WMF, 
Mbch331, Keegan
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs

