AndrewTavis_WMDE added a project: Epic.
TASK DETAIL
https://phabricator.wikimedia.org/T356618
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE, me,
Danny_Benjafield_WMDE
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T356618
To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE
AndrewTavis_WMDE moved this task from In progress to Product verification on
the Wikidata Analytics (Kanban) board.
AndrewTavis_WMDE added a comment.
@Manuel and @Lydia_Pintscher, just shared a folder with the two CSVs on
Wolke. Let me know if there's anything else needed, and I will
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T366621
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE
AndrewTavis_WMDE added a comment.
Hi @MarcoSwart 👋 Thanks for the communication here :) I guess I'm a bit
confused by how the other one would be used. You're roughly talking about:
| word_that_is_missing_from_a_wiktionary |
number_of_wiktionaries_that_do_have_it |
| MOST_MI
AndrewTavis_WMDE added a comment.
@Manuel, my assumption was that you could help any non-analytics PMs or go
through the results with them as you have the needed access. Using Google for
PII is not something we're supposed to do if it can be avoided, but I have no
experience with
AndrewTavis_WMDE added a comment.
Talked further with WMF about this just now. One basic question for the end
users: would it make it more convenient for you all if the exported datasets
were per Wiktionary? There are two options here, with missing entries being
used as an example:
1
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T366621
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T366621
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T366621
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE
AndrewTavis_WMDE added a comment.
I can also prepare a notebook with quick functions to load and explore the
data, if that would make the option I suggested a bit easier.
TASK DETAIL
https://phabricator.wikimedia.org/T366621
AndrewTavis_WMDE added a comment.
> Would it be possible to send us a spreadsheet (and schedule it for deletion
after 90 days)?
I'd prefer to transfer via the servers if possible given the comment here
<https://phabricator.wikimedia.org/T358311#9820450> from WMF Engineer
AndrewTavis_WMDE added a comment.
Base queries for all of this are ready :) Let me know on the above and I'll
finalize them.
Re how to send the files: my suggestion would be that I put them into my
`stat1010` and then @Manuel can migrate them to his. From there I'll delete my
AndrewTavis_WMDE added a comment.
Checking on the numbers here really quick: the request is for the top `1000`
user agents and then a sample of `1000` user agents, but the total is `1221`.
Would an ordered list of all of them make more sense, as we're talking about a
sample of 82%? There r
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T366621
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE
AndrewTavis_WMDE added a comment.
Status is open as T364045 <https://phabricator.wikimedia.org/T364045> has
been resolved :)
TASK DETAIL
https://phabricator.wikimedia.org/T363583
To: AndrewTavi
AndrewTavis_WMDE changed the task status from "Stalled" to "Open".
TASK DETAIL
https://phabricator.wikimedia.org/T363583
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benja
AndrewTavis_WMDE added a comment.
Unstalled as the plan for the data export has been approved in T365699
<https://phabricator.wikimedia.org/T365699> :)
TASK DETAIL
https://phabricator.wikimedia.org/T361203
AndrewTavis_WMDE changed the task status from "Stalled" to "Open".
TASK DETAIL
https://phabricator.wikimedia.org/T361203
To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, D
AndrewTavis_WMDE added a comment.
Unstalled as the table has been created :)
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel
AndrewTavis_WMDE changed the status of subtask T362849: [Analytics] Items that
contain a sitelink to one of the Wikimedia projects over time from
"Stalled" to "Open".
TASK DETAIL
https://phabricator.wikimedia.org/T343019
AndrewTavis_WMDE changed the task status from "Stalled" to "Open".
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WM
AndrewTavis_WMDE added a comment.
Hi @MarcoSwart, sorry for changing the status without explanation. Was in a
meeting and we were moving things around, but obviously context should have
been added. This is stalled for now as we're waiting for WMF to advise us on
the best way forwa
AndrewTavis_WMDE added a comment.
Note, work that will unblock this task is being done in T364045: [Bug?] Can't
find wikidatawiki on wmf.mediawiki_wikitext_history
<https://phabricator.wikimedia.org/T364045>.
TASK DETAIL
https://phabricator.wikimedia.org/T363583
AndrewTavis_WMDE claimed this task.
TASK DETAIL
https://phabricator.wikimedia.org/T366621
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Lydia_Pintscher, Manuel, Danny_Benjafield_WMDE,
S8321414
AndrewTavis_WMDE added a comment.
Quick note on this: in discussion, something else to check would be those
user agents that were present in May 2024 but were not active in April 2024 :)
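A minimal sketch of the check described above, assuming the user agents for each month are available as plain Python collections of strings. The function name and the sample values are hypothetical and are not taken from the task.

```python
def new_user_agents(current_month, previous_month):
    """Return the user agents seen in current_month but not in previous_month, sorted."""
    return sorted(set(current_month) - set(previous_month))


# Hypothetical sample values for illustration only.
april_2024 = ["agent-a", "agent-b"]
may_2024 = ["agent-b", "agent-c", "agent-d"]

# User agents present in May 2024 that were not active in April 2024.
new_in_may = new_user_agents(may_2024, april_2024)
```

A set difference ignores how often each user agent appears, so request counts would need to be joined back in separately if they matter.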
TASK DETAIL
https://phabricator.wikimedia.org/T366621
AndrewTavis_WMDE changed the status of subtask T360296: [Analytics] Implement
data process to identify missing Wiktionary entries from "Open" to
"Stalled".
TASK DETAIL
https://phabricator.wikimedia.org/T332899
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".
TASK DETAIL
https://phabricator.wikimedia.org/T360296
To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pampu
AndrewTavis_WMDE added a comment.
There's now a draft for the DAGs
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/725/diffs#96f15bf21ce9c18b6638c53402e35a2654aeeff6>
open on GitLab. There's still lots to do as WMF wants to sync on suggestio
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T356618
To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T356618
To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.
Thanks so much for the support here, @BTullis! I'll update the epic
<https://phabricator.wikimedia.org/T356618> with this being done. So close to
being finished with all this :)
TASK DETAIL
https://phabricator.wikimedia.org/T358311
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T360296
To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T360296
To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T360296
To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred
AndrewTavis_WMDE added a comment.
wmde/analytics/hql/airflow_jobs/wiktionary_cognate
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/tree/main/hql/airflow_jobs/wiktionary_cognate?ref_type=heads>
on GitLab now has all the needed queries for missing entries, most popular
entri
AndrewTavis_WMDE added a comment.
Table has been updated with the new data from the most recent DAG run. Lots
more user agents - almost a 3x increase. Noting this for now as maybe grounds
for further investigation later, but IPs are also increasing (just not by as
much).
Note that we
AndrewTavis_WMDE renamed this task from "[Analytics] Monthly repeating tasks
(next: June 2024)" to "[Analytics] Monthly repeating tasks (next: July 2024)".
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T342559
AndrewTavis_WMDE closed this task as "Resolved".
AndrewTavis_WMDE claimed this task.
TASK DETAIL
https://phabricator.wikimedia.org/T351072
To: AndrewTavis_WMDE
Cc: Arian_Bozorg, karapayneWMDE
AndrewTavis_WMDE closed subtask T351072: Remove the WDCM clone (stats1007) as
"Resolved".
TASK DETAIL
https://phabricator.wikimedia.org/T351070
To: AndrewTavis_WMDE
Cc: AndrewTavis_WMDE, Micha
AndrewTavis_WMDE closed subtask T351072: Remove the WDCM clone (stats1007) as
"Resolved".
TASK DETAIL
https://phabricator.wikimedia.org/T364965
To: AndrewTavis_WMDE
Cc: Lucas_Werkmeister_WMD
AndrewTavis_WMDE added a comment.
Perfect, @Lucas_Werkmeister_WMDE! Glad to have this all cleared up :)
TASK DETAIL
https://phabricator.wikimedia.org/T351072
To: AndrewTavis_WMDE
Cc: Arian_Bozorg
AndrewTavis_WMDE closed this task as "Resolved".
AndrewTavis_WMDE claimed this task.
AndrewTavis_WMDE added a comment.
Sounds good to me! :) Thanks for the help here, @Lucas_Werkmeister_WMDE and
@BTullis!
TASK DETAIL
https://phabricator.wikimedia.org/T364965
AndrewTavis_WMDE added a comment.
Hi @Bicolino34 👋 Thanks for reaching out :) We are still working on tasks
related to this dashboard - at least bringing back some of the data processes.
TASK DETAIL
https://phabricator.wikimedia.org/T321666
AndrewTavis_WMDE added a comment.
Moving this to verification given the work in T364965
<https://phabricator.wikimedia.org/T364965>. Thanks for all of this,
@Lucas_Werkmeister_WMDE! Maybe we can resolve this and leave T364965
<https://phabricator.wikimedia.org/T364965> until `
AndrewTavis_WMDE added a comment.
None of the files listed in your comment above
<https://phabricator.wikimedia.org/T364965#9838579> look like things we should
worry about, @Lucas_Werkmeister_WMDE. Similarly that there's a different commit
for this, as to my knowledge `stat10
AndrewTavis_WMDE added a comment.
I've been asking around about the data source and connecting the tables and
have yet to get concrete answers. Based on general assumptions of the names of
the tables/columns though, the path forward for getting missing entries for a
Wiktionary will
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T356618
To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WM
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".
TASK DETAIL
https://phabricator.wikimedia.org/T361203
To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, D
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T360296
To: AndrewTavis_WMDE
Cc: Michael, ECohen_WMDE, Aklapper, Pamputt, AndrewTavis_WMDE, JeanFred
AndrewTavis_WMDE added a comment.
Thanks for taking care of this, @Lucas_Werkmeister_WMDE! We'll be able to
close both this and T351072 <https://phabricator.wikimedia.org/T351072> after
Tuesday next week if/when the Puppet change is deployed :)
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T365457
To: AndrewTavis_WMDE
Cc: Aklapper, Manuel, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414
AndrewTavis_WMDE added a comment.
@BTullis, checking in on this as your help in T358311
<https://phabricator.wikimedia.org/T358311> reminded me as it's all related to
the same user. Would you be able to remove the
`statistics/manifests/wmde/wdcm.pp` file and any related processes
AndrewTavis_WMDE added a comment.
Thank you, @BTullis! Ya I wasn't happy with the solution either. Appreciate
your willingness to help!
TASK DETAIL
https://phabricator.wikimedia.org/T358311
AndrewTavis_WMDE added a comment.
I'm realizing also that I don't have admin rights and thus can't move files
to your directory. I'll copy these files over to my directory, download them
and send you a link to a zipped directory on Google Drive once we have the
abov
AndrewTavis_WMDE added a comment.
Hi @Manuel, checking further as it's still not clear what you'd like. The
double except is confusing. I'll only transfer files from `stat1005`, and could
you answer the following questions:
1. Do you want **data files** (.csv, .tsv, etc)
AndrewTavis_WMDE added a comment.
Hi @Manuel - sending along a summary of what I'll be getting for you:
== stat1004 ==
Jul 25 2020 Analytics
Jun 23 2020 Experiments
Jul 25 2020 wdUsagePerPage
== stat1005 ==
All non data
AndrewTavis_WMDE added a comment.
Ok then!
So the check of the files above is complete, as shown by its status. General
summaries of each stat machine and HDFS are provided under the subsections
above. `stat1005` has some files that @Manuel may find interesting given that
they'r
AndrewTavis_WMDE added a comment.
So basically removing the wdcm.pp related file on GitHub and its Puppet
workflows will close both tasks :)
TASK DETAIL
https://phabricator.wikimedia.org/T351072
To
AndrewTavis_WMDE added a comment.
Ah looking at this, I'm realizing I restated myself as the work that's left
in T364965: stat1007 to stat1011 migration pipeline output check
<https://phabricator.wikimedia.org/T364965> is a duplicate of what we want to
do here :)
AndrewTavis_WMDE added a comment.
Hey @Arian_Bozorg 👋 Yes, we do still need to check this out. I was thinking
that @Lucas_Werkmeister_WMDE and I could discuss this when we chat about what
else is needed in T364965: stat1007 to stat1011 migration pipeline output check
<ht
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T365457
To: AndrewTavis_WMDE
Cc: Aklapper, Manuel, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata Analytics (Kanban), Wikidata.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Making this task as a means of saying that there is still work to be done to
close out the Purdue Data Mine program
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T356618
To: AndrewTavis_WMDE
Cc: Michael, karapayneWMDE, Aklapper, Manuel, AndrewTavis_WMDE
AndrewTavis_WMDE added a comment.
⚠️ Currently WIP ⚠️
===
Going through the files sent by @JAllemandou above
<https://phabricator.wikimedia.org/T358311#9648470>. This message will be saved
as I go so that I don't lose my progress 😊 If I do find some
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper,
Danny_Benjafield_WMDE
AndrewTavis_WMDE added a comment.
Note that MR#700
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/700>
has been opened that has the work for this :)
TASK DETAIL
https://phabricator.wikimedia.org/T361203
AndrewTavis_WMDE added a comment.
Note that MR#700
<https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/700>
has been opened that has the work for this :)
TASK DETAIL
https://phabricator.wikimedia.org/T362849
AndrewTavis_WMDE claimed this task.
TASK DETAIL
https://phabricator.wikimedia.org/T358311
To: AndrewTavis_WMDE
Cc: brouberol, JAllemandou, MoritzMuehlenhoff, Manuel, Aklapper,
AndrewTavis_WMDE
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper,
Danny_Benjafield_WMDE
AndrewTavis_WMDE added a comment.
Confirming that data's still coming in as well. @BTullis, what should we do
about statistics/manifests/wmde/wdcm.pp
<https://github.com/wikimedia/operations-puppet/blob/production/modules/statistics/manifests/wmde/wdcm.pp>?
Remove the file? An
AndrewTavis_WMDE added a comment.
Quick note that the word used by @BTullis was `disabled` instead of `removed`
for the stat1007 timers, so apologies if this caused some confusion. I figure
not, but just wanted to be clear :)
@BTullis, would you be able to check the journal for them and
AndrewTavis_WMDE changed the task status from "Open" to "Stalled".
TASK DETAIL
https://phabricator.wikimedia.org/T363583
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benja
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T363583
To: AndrewTavis_WMDE
Cc: Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414, Astuthiodit_1
AndrewTavis_WMDE renamed this task from "stat1007 migration output check" to
"stat1007 to stat1011 migration pipeline output check".
TASK DETAIL
https://phabricator.wikimedia.org/T364965
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata Analytics (Kanban), Wikidata,
Wikidata Dev Team.
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Context
---
Recently WMF has been migrating from legacy stat servers that are being
AndrewTavis_WMDE added a comment.
Sheet updated with the numbers for April. Higher number of user agents, but
lower IPs (but then IPs still much higher than Feb).
TASK DETAIL
https://phabricator.wikimedia.org/T342559
AndrewTavis_WMDE renamed this task from "[Analytics] Monthly repeating tasks
(next: May 2024)" to "[Analytics] Monthly repeating tasks (next: June 2024)".
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T342559
AndrewTavis_WMDE claimed this task.
TASK DETAIL
https://phabricator.wikimedia.org/T361203
To: AndrewTavis_WMDE
Cc: Manuel, Aklapper, AndrewTavis_WMDE, Danny_Benjafield_WMDE, S8321414,
Astuthiodit_1
AndrewTavis_WMDE added a comment.
Hey @brouberol 👋 Just getting back from two weeks off today :) I'll check
into this and get back to you all! Thanks for the ping!
TASK DETAIL
https://phabricator.wikimedia.org/T358311
AndrewTavis_WMDE renamed this task from "Generate historical weekly segments of
Wikidata item sitelinks segmentations" to "Generate historical weekly segments
of Wikidata item sitelink segmentations".
TASK DETAIL
https://phabricator.wikimedia.org/T363583
AndrewTavis_WMDE renamed this task from "Generate weekly historical segments of
Wikidata item sitelinks segmentations" to "Generate historical weekly segments
of Wikidata item sitelinks segmentations".
TASK DETAIL
https://phabricator.wikimedia.org/T363583
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper,
Danny_Benjafield_WMDE
AndrewTavis_WMDE created this task.
AndrewTavis_WMDE added projects: Wikidata, Wikidata Analytics (Kanban).
Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
Purpose
---
In T362849: [Analytics] Segments of Wikidata's data over time
&
AndrewTavis_WMDE added a comment.
See T362849_wd_item_sitelink_segments.ipynb
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/blob/main/tasks/wikidata/2024/T362849_wd_item_sitelink_segments/T362849_wd_item_sitelink_segments.ipynb?ref_type=heads>
for the work to derive the se
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: JAllemandou, mpopov, AndrewTavis_WMDE, Manuel, Aklapper,
Danny_Benjafield_WMDE
AndrewTavis_WMDE added a comment.
Ok, so the new numbers after the change in scope for the max `2024-04-15`
snapshot are:
items_with_sitelinks: 32,231,861
items_items_with_sitelinks_link_to: 2,980,388
all_other_items: 72,910,679
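As a quick sanity check on the three counts above, the following sketch totals them and derives each segment's share. It assumes the three segments are mutually exclusive and together cover every item in the snapshot, which the comment does not state explicitly; the variable names are illustrative.

```python
# Counts copied from the comment above (max `2024-04-15` snapshot).
segment_counts = {
    "items_with_sitelinks": 32_231_861,
    "items_items_with_sitelinks_link_to": 2_980_388,
    "all_other_items": 72_910_679,
}

# Assumption: the segments are disjoint, so their sum is the item total.
total_items = sum(segment_counts.values())

# Share of the total that falls into each segment.
segment_shares = {
    name: count / total_items for name, count in segment_counts.items()
}

for name, share in segment_shares.items():
    print(f"{name}: {share:.1%}")
```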
For documentation, the numbers for the
AndrewTavis_WMDE added a comment.
Moved this to `In progress` as I'm adding the job to export everything to the
published datasets folder to the DAG as I work on the same for T362849
<https://phabricator.wikimedia.org/T362849>.
TASK DETAIL
https://phabricator.wikimedia.org/T36
AndrewTavis_WMDE added a comment.
See {https://phabricator.wikimedia.org/T363451} for the task about bringing
back the partition (hopefully via another job). I added a bit about whether we
want to maybe turn this job on when WMDE needs historical data. Let me know
what you all think on that
AndrewTavis_WMDE added a comment.
Another note on this is: if we don't expect to be needing a Wikidata
partition of `wmf.mediawiki_wikitext_history` for other tasks, then we could
work directly from the XML dump for the data backdate. We wouldn't be able to
leverage PySpark for th
AndrewTavis_WMDE added a subscriber: JAllemandou.
AndrewTavis_WMDE added a comment.
Thanks for all of the information, @mpopov!
I talked this over in my bi-weekly with @JAllemandou, and would like to bring
some further context to this particular situation :)
The go-to table for this
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: mpopov, AndrewTavis_WMDE, Manuel, Aklapper, Danny_Benjafield_WMDE,
S8321414
AndrewTavis_WMDE claimed this task.
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T362849
To: AndrewTavis_WMDE
Cc: mpopov, AndrewTavis_WMDE, Manuel, Aklapper
AndrewTavis_WMDE added a comment.
Summary on your end sounds great, @Ifrahkhanyaree_WMDE! 😊 Let me know if
sending along some empty new item revisions from 2024 would be helpful :)
TASK DETAIL
https://phabricator.wikimedia.org/T360761
AndrewTavis_WMDE added a comment.
Notebook with the work that was done for this is:
wmde/analytics/tasks/product_platform/2024/T360761_empty_wikidata_items/T360761_empty_wikidata_items.ipynb
<https://gitlab.wikimedia.org/repos/wmde/analytics/-/blob/main/tasks/product_platform/2
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T360761
To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414
AndrewTavis_WMDE updated the task description.
TASK DETAIL
https://phabricator.wikimedia.org/T360761
To: AndrewTavis_WMDE
Cc: Aklapper, Ifrahkhanyaree_WMDE, Manuel, Danny_Benjafield_WMDE, S8321414
AndrewTavis_WMDE moved this task from Needs product input to Product
verification on the Wikidata Analytics (Kanban) board.
AndrewTavis_WMDE added a comment.
Further insights on this, and moving it to `Product verification` at this
point :) I've now changed the query to a span of bytes
AndrewTavis_WMDE moved this task from In progress to Needs product input on the
Wikidata Analytics (Kanban) board.
AndrewTavis_WMDE added a comment.
The thread on Mattermost
<https://mattermost.wikimedia.de/swe/pl/gsr9b485x7geby79t4sg151j7c> for
discussing this has a lot of comments