[Wikidata-bugs] [Maniphest] [Commented On] T164862: Train a basic item quality based on edit quality for Wikidata

2017-05-20 Thread Ladsgroup
Ladsgroup added a comment.
2017-05-20 17:01:35,365 INFO:revscoring.utilities.cv_train -- Cross-validating model statistics for 10 folds...
2017-05-20 17:01:35,428 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 1...
2017-05-20 17:01:35,451 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 2...
2017-05-20 17:01:35,462 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 3...
2017-05-20 17:01:35,483 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 5...
2017-05-20 17:01:35,508 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 4...
2017-05-20 17:01:35,517 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 6...
2017-05-20 17:01:35,506 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 7...
2017-05-20 17:01:35,528 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 8...
2017-05-20 17:01:42,878 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 9...
2017-05-20 17:01:42,925 INFO:revscoring.scorer_models.sklearn_classifier -- Performing cross-validation 10...
2017-05-20 17:01:48,206 INFO:revscoring.utilities.cv_train -- Training model on all data...
ScikitLearnClassifier
 - type: RF
 - params: verbose=0, scale=true, criterion="gini", balanced_sample_weight=false, min_samples_split=2, max_features="log2", n_jobs=1, min_weight_fraction_leaf=0.0, warm_start=false, center=true, balanced_sample=true, oob_score=false, class_weight=null, random_state=null, bootstrap=true, max_leaf_nodes=null, n_estimators=20, max_depth=null, min_samples_leaf=13
 - version: .0
 - trained: 2017-05-20T17:01:48.950545

Table:
	      ~A    ~B    ~C    ~D    ~E
	--  ----  ----  ----  ----  ----
	A    279    33    10     0     0
	B     64   291    77     5     1
	C     63   208  1414    86     2
	D      0     1    65   894    37
	E      0     0     5   103  1361

Accuracy: 0.848
ROC-AUC:
	---  -----
	'A'  0.987
	'B'  0.937
	'C'  0.969
	'D'  0.977
	'E'  0.993
	---  -----

F1:
	-  -----
	E  0.948
	B  0.595
	D  0.858
	A  0.764
	C  0.845
	-  -----

TASK DETAIL
https://phabricator.wikimedia.org/T164862

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Ladsgroup
Cc: Lydia_Pintscher, Glorian_WD, Halfak, Glorian_Yapinus, Aklapper, samuwmde, Ladsgroup, GoranSMilovanovic, QZanden, Avner, Izno, Wikidata-bugs, aude, Ricordisamoa, He7d3r, Mbch331
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
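
The accuracy and F1 figures in the report above can be checked against the confusion matrix with a few lines of Python. This is only a sketch: the counts are as read from the table, and it assumes rows are actual classes and columns (~A .. ~E) are predicted classes. The recomputed accuracy rounds to the reported 0.848, and the per-class F1 values land within about 0.005 of the reported ones.

```python
# Confusion matrix counts as read from the table in the report.
# Assumption: rows are actual classes, columns are predicted classes.
labels = ["A", "B", "C", "D", "E"]
matrix = [
    [279, 33, 10, 0, 0],
    [64, 291, 77, 5, 1],
    [63, 208, 1414, 86, 2],
    [0, 1, 65, 894, 37],
    [0, 0, 5, 103, 1361],
]


def accuracy(m):
    """Fraction of observations on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in m)
    correct = sum(m[i][i] for i in range(len(m)))
    return correct / total


def f1_per_class(m, names):
    """Per-class F1 = 2*TP / (actual count + predicted count)."""
    scores = {}
    for i, name in enumerate(names):
        true_positives = m[i][i]
        actual = sum(m[i])                     # row sum
        predicted = sum(row[i] for row in m)   # column sum
        denominator = actual + predicted
        scores[name] = 2 * true_positives / denominator if denominator else 0.0
    return scores


acc = accuracy(matrix)
f1 = f1_per_class(matrix, labels)

print(f"Accuracy: {acc:.3f}")
for name in labels:
    print(f"{name}  {f1[name]:.3f}")
```

Because F1 is symmetric in precision and recall, the per-class F1 values come out the same even if the rows and columns are actually the other way around.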


[Wikidata-bugs] [Maniphest] [Commented On] T164862: Train a basic item quality based on edit quality for Wikidata

2017-05-17 Thread Glorian_WD
Glorian_WD added a comment.
@Halfak: Oh, I see. Is Multi-layer Perceptron also among the benchmark models?


[Wikidata-bugs] [Maniphest] [Commented On] T164862: Train a basic item quality based on edit quality for Wikidata

2017-05-15 Thread Halfak
Halfak added a comment.
@Glorian_WD, we use revscoring tune to do estimator and hyperparameter optimization.  So, we'll likely test out a set of benchmark models (naive bayes, logistic regression, etc.) as well as a large set of parameters for Random Forest and Gradient Boosting.
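
As a rough illustration of that kind of search, here is a minimal sketch using scikit-learn's GridSearchCV. It is not how revscoring tune is implemented or configured: the synthetic dataset, the candidate names, the parameter grids, and the choice of accuracy as the scoring metric are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data (assumption); real features would come from the items.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)

# Hypothetical candidates: two benchmark models plus small grids for
# Random Forest and Gradient Boosting. The grids are illustrative only.
candidates = {
    "naive_bayes": (GaussianNB(), {}),
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [20, 40], "min_samples_leaf": [1, 13]},
    ),
    "gradient_boosting": (
        GradientBoostingClassifier(random_state=0),
        {"n_estimators": [50], "max_depth": [3, 5]},
    ),
}

# Cross-validate every parameter combination for every estimator and
# keep the best mean score and its parameters per estimator.
results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

best_name = max(results, key=lambda n: results[n][0])
for name, (score, params) in results.items():
    print(f"{name}: {score:.3f} {params}")
print("best:", best_name)
```

The same loop extends to other estimators by adding an entry to the candidates dictionary.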


[Wikidata-bugs] [Maniphest] [Commented On] T164862: Train a basic item quality based on edit quality for Wikidata

2017-05-12 Thread Glorian_WD
Glorian_WD added a comment.
@Ladsgroup: Are you going to train a single classifier, or train multiple classifiers and compare the results to find which one has the best accuracy?