[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-07-11 Thread Ladsgroup
Ladsgroup added a comment. https://github.com/wiki-ai/editquality/pull/165
TASK DETAIL https://phabricator.wikimedia.org/T195701
EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Ladsgroup
Cc: Aklapper, Halfak, Ladsgroup, matej_suchanek, Lydia_Pintscher, Lahi, Gq86,

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-07-11 Thread Ladsgroup
Ladsgroup added a comment. Letting *is bot/was bot* take precedence seems the best approach to me. Will make a patch.
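
The precedence idea in this comment can be sketched roughly as follows. This is only an illustration of the ordering of the two checks, not the actual editquality patch: the function name `flagged_by_block` and its boolean parameters are hypothetical, and the real autolabeling applies further rules that are not shown here.

```python
def flagged_by_block(is_bot: bool, was_bot: bool, user_blocked: bool) -> bool:
    """Decide whether the block check alone should send an edit to review.

    Hypothetical sketch: bot status is evaluated first, so an account that
    is (or was) a bot is not flagged just because it was blocked later.
    """
    if is_bot or was_bot:
        # Bot status takes precedence over the block check.
        return False
    # Only non-bot accounts are flagged on the basis of a block.
    return user_blocked


# A formerly flagged bot that was later blocked is not sent to review
# by this check; a blocked non-bot account still is.
print(flagged_by_block(is_bot=False, was_bot=True, user_blocked=True))
print(flagged_by_block(is_bot=False, was_bot=False, user_blocked=True))
```

With this ordering, the block status of an account only matters once both bot checks have come back negative.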

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-07-10 Thread Halfak
Halfak added a comment. Oohhh. Hmm. Yeah. I wonder if we can adjust for block reason. Or maybe let *is bot/was bot* take precedence.

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-07-10 Thread Ladsgroup
Ladsgroup added a comment. I loaded it into Wiki labels and started labeling, but I ran into a funny problem. Most of the edits are okay and were made by bots of users who got blocked (a case that comes up often is MechQuesterBot). Should we do another round of autolabeling but with ignoring the block co

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-07-09 Thread Halfak
Halfak added a comment. Merged. Ready for loading into Wiki labels.

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-06-28 Thread Halfak
Halfak added a comment. I left some notes on the PR. I think it is more complicated than necessary.

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-06-27 Thread Ladsgroup
Ladsgroup added a comment. https://github.com/wiki-ai/editquality/pull/164

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-06-25 Thread Halfak
Halfak added a comment. I think that should be the plan then. Query for a random sample of 500k. Then select *needs_review* from that set.
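
The plan in this comment (draw a random sample, then keep only the edits that need review) might look something like the sketch below. The function name `sample_then_filter`, its parameters, and the fixed seed are assumptions made for illustration; the thread does not say how the query or the filter is actually implemented.

```python
import random
from typing import Callable, Iterable, List


def sample_then_filter(
    rev_ids: Iterable[int],
    needs_review: Callable[[int], bool],
    sample_size: int = 500_000,  # sample size discussed in the thread
    seed: int = 0,               # assumption: fixed seed for repeatability
) -> List[int]:
    """Draw a random sample of revisions, then keep those needing review."""
    revs = list(rev_ids)
    rng = random.Random(seed)
    # Never ask for more items than are available.
    sample = rng.sample(revs, min(sample_size, len(revs)))
    return [rev for rev in sample if needs_review(rev)]


# Toy usage: pretend that even revision IDs need review.
subset = sample_then_filter(range(100), lambda rev: rev % 2 == 0, sample_size=10)
print(len(subset) <= 10)
```

Because the filter runs after sampling, the size of the resulting review set depends on how common needs_review edits are in the sample.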

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-06-25 Thread Ladsgroup
Ladsgroup added a comment. Last time we did it with 500K. I think that's enough.

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-06-21 Thread Halfak
Halfak added a comment. How big of a sample do you think we would need in order to get enough "needs_review" samples?

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-06-21 Thread Halfak
Halfak added a comment. We don't actually count all edits by people with 1000+ edits as good. We check whether each edit was reverted, and if it was, it is included in the needs_review dataset.
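
The rule described in this comment could be sketched as below. The `Edit` type, the field names, and the `TRUSTED_EDIT_COUNT` constant are hypothetical, and treating exactly 1000 edits as trusted is an assumption (the thread says both "1000+" and "more than 1K"); the real logic lives in the editquality Makefile and autolabel code and may differ.

```python
from dataclasses import dataclass

# Assumption: threshold taken from the "1000+ edits" wording in the thread.
TRUSTED_EDIT_COUNT = 1000


@dataclass
class Edit:
    rev_id: int
    user_edit_count: int
    reverted: bool


def needs_review(edit: Edit) -> bool:
    """Return True when the edit should go into the needs_review set."""
    if edit.reverted:
        # A reverted edit is reviewed even if its author is trusted.
        return True
    # Otherwise only edits by users below the threshold are reviewed.
    return edit.user_edit_count < TRUSTED_EDIT_COUNT


print(needs_review(Edit(rev_id=1, user_edit_count=5000, reverted=False)))  # trusted, kept out
print(needs_review(Edit(rev_id=2, user_edit_count=5000, reverted=True)))   # trusted but reverted
```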

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-06-20 Thread Ladsgroup
Ladsgroup added a comment. To make the dataset (sorta) balanced, we automatically mark edits made by users with more than 1K edits as trusted and not needing review (look at the Makefile), and Wikidata is populated by bots (more than any other wiki), so if we want to achieve a dataset to review we n

[Wikidata-bugs] [Maniphest] [Commented On] T195701: new ORES labeling campaign for Wikidata

2018-06-20 Thread Halfak
Halfak added a comment. What's the purpose of the editcount restriction?