[Wikidata-bugs] [Maniphest] [Commented On] T164862: Train a basic item quality based on edit quality for Wikidata
Ladsgroup added a comment.

2017-05-20 17:01:35,365 INFO:revscoring.utilities.cv_train -- Cross-validating model statistics for 10 folds...
2017-05-20 17:01:35,428 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 1...
2017-05-20 17:01:35,451 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 2...
2017-05-20 17:01:35,462 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 3...
2017-05-20 17:01:35,483 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 5...
2017-05-20 17:01:35,508 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 4...
2017-05-20 17:01:35,517 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 6...
2017-05-20 17:01:35,506 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 7...
2017-05-20 17:01:35,528 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 8...
2017-05-20 17:01:42,878 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 9...
2017-05-20 17:01:42,925 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 10...
2017-05-20 17:01:48,206 INFO:revscoring.utilities.cv_train -- Training model on all data...
ScikitLearnClassifier
 - type: RF
 - params: verbose=0, scale=true, criterion="gini", balanced_sample_weight=false, min_samples_split=2, max_features="log2", n_jobs=1, min_weight_fraction_leaf=0.0, warm_start=false, center=true, balanced_sample=true, oob_score=false, class_weight=null, random_state=null, bootstrap=true, max_leaf_nodes=null, n_estimators=20, max_depth=null, min_samples_leaf=13
 - version: .0
 - trained: 2017-05-20T17:01:48.950545

Table (column boundaries were lost in the archived copy; values are shown as archived):
    ~A ~B ~C ~D ~E
    A2793310 0 0
    B 64 29177 5 1
    C 63 208 141486 2
    D 0 165 89437
    E 0 0 5 103 1361

Accuracy: 0.848

ROC-AUC:
    'A'  0.987
    'B'  0.937
    'C'  0.969
    'D'  0.977
    'E'  0.993

F1:
    E  0.948
    B  0.595
    D  0.858
    A  0.764
    C  0.845

TASK DETAIL
https://phabricator.wikimedia.org/T164862

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ladsgroup
Cc: Lydia_Pintscher, Glorian_WD, Halfak, Glorian_Yapinus, Aklapper, samuwmde, Ladsgroup, GoranSMilovanovic, QZanden, Avner, Izno, Wikidata-bugs, aude, Ricordisamoa, He7d3r, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
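For readers who want to see how figures like the accuracy and per-class F1 in the report above relate to a confusion table, here is a minimal sketch in plain Python. It is not revscoring's implementation, and revscoring may aggregate its statistics differently (for example, across folds). The matrix is a made-up three-class example, not the data from this thread, and the sketch assumes rows are actual labels and columns are predicted labels.

```python
def accuracy(matrix):
    """Fraction of observations on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def f1_per_class(labels, matrix):
    """Per-class F1, assuming rows = actual and columns = predicted."""
    scores = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        actual = sum(matrix[i])                    # row sum
        predicted = sum(row[i] for row in matrix)  # column sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Hypothetical data for illustration only (not taken from the report above).
LABELS = ["A", "B", "C"]
MATRIX = [
    [8, 2, 0],  # actual A
    [1, 6, 3],  # actual B
    [0, 2, 8],  # actual C
]

if __name__ == "__main__":
    print("Accuracy: %.3f" % accuracy(MATRIX))
    for label, score in f1_per_class(LABELS, MATRIX).items():
        print("F1 %s: %.3f" % (label, score))
```

A class can have a high ROC-AUC and still a modest F1, as class B does in the report above, because F1 depends on the hard predictions counted in the table while ROC-AUC depends on how well the scores rank observations.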
[Wikidata-bugs] [Maniphest] [Commented On] T164862: Train a basic item quality based on edit quality for Wikidata
Glorian_WD added a comment.

@Halfak: Oh, I see. Is the Multi-layer Perceptron also among the benchmark models?
[Wikidata-bugs] [Maniphest] [Commented On] T164862: Train a basic item quality based on edit quality for Wikidata
Halfak added a comment.

@Glorian_WD, we use revscoring tune to do estimator and hyperparameter optimization. So, we'll likely test out a set of benchmark models (naive bayes, logistic regression, etc.) as well as a large set of parameters for Random Forest and Gradient Boosting.
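The kind of search described above can be pictured as a loop over estimators and parameter grids. The following is only a rough sketch in plain Python and does not use or reproduce the revscoring tune interface. The names `expand_grid`, `tune`, `SEARCH_SPACE`, and the caller-supplied `evaluate` function are hypothetical, and the parameter values are illustrative rather than the ones actually searched.

```python
from itertools import product


def expand_grid(grid):
    """Yield one dict per combination of the values listed in `grid`."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def tune(estimators, evaluate):
    """Return (score, estimator_name, params) for the best candidate.

    `evaluate(name, params)` is supplied by the caller and is assumed to
    return a cross-validated score where larger is better.
    """
    best = None
    for name, grid in estimators.items():
        for params in expand_grid(grid):
            score = evaluate(name, params)
            if best is None or score > best[0]:
                best = (score, name, params)
    return best


# Hypothetical search space: benchmark models with few or no parameters,
# plus larger grids for Random Forest and Gradient Boosting.
SEARCH_SPACE = {
    "naive_bayes": {},
    "logistic_regression": {"C": [0.1, 1.0, 10.0]},
    "random_forest": {
        "n_estimators": [20, 80, 320],
        "min_samples_leaf": [1, 5, 13],
        "max_features": ["log2", "sqrt"],
    },
    "gradient_boosting": {
        "n_estimators": [100, 300],
        "learning_rate": [0.01, 0.1],
        "max_depth": [3, 5],
    },
}

if __name__ == "__main__":
    for name, grid in SEARCH_SPACE.items():
        print(name, len(list(expand_grid(grid))), "candidate(s)")
```

In practice `evaluate` would train the named estimator with the given parameters under cross-validation and return a statistic such as accuracy or ROC-AUC. An estimator with an empty grid still contributes one candidate (its defaults), which is how the benchmark models take part in the comparison.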
[Wikidata-bugs] [Maniphest] [Commented On] T164862: Train a basic item quality based on edit quality for Wikidata
Glorian_WD added a comment.

@Ladsgroup: Are you going to train a single classifier, or are you going to train multiple classifiers and compare the results to find which one has the best accuracy?