[scikit-learn] Label encoding for classifiers and soft targets

Javier López Peña Sat, 11 Mar 2017 05:07:04 -0800

Hi there!

I have recently been experimenting with model regularization through the use of 
soft targets,
and I’d like to be able to do the same from sklearn.


The main idea is as follows: imagine I want to fit a (probabilistic) 
classifier with three possible 
targets: 0, 1, and 2.

If I pass my training set (X, y) to a sklearn classifier, the target vector y 
gets encoded so that
each target becomes a one-hot array: [1, 0, 0], [0, 1, 0], or [0, 0, 1].
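
To make the encoding concrete, here is a minimal sketch using plain NumPy. It 
only illustrates the one-hot layout described above and is not a claim about 
what any particular sklearn classifier does internally; the variable names are 
made up for the example.

```python
import numpy as np

# Integer class labels for four samples, with three possible classes.
y = np.array([0, 2, 1, 0])
n_classes = 3

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with y yields one row per sample.
one_hot = np.eye(n_classes, dtype=int)[y]

print(one_hot)
```

Each row has a single 1 in the column of the sample's class and 0 elsewhere.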

What I would like to do is pass the targets directly in the 
encoded form and avoid
any further encoding. This would let me, for instance, pass a target as 
[0.9, 0.05, 0.05] if I want to prevent
my classifier from becoming too opinionated in its predicted probabilities.
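
As a rough sketch of how such soft targets could be built from ordinary labels, 
assuming each row should remain a probability distribution (sum to 1): put a 
fixed confidence on the true class and spread the remainder evenly over the 
other classes. The helper `smooth_targets` and its `confidence` parameter are 
hypothetical names for this example, not part of sklearn.

```python
import numpy as np

def smooth_targets(y, n_classes, confidence=0.9):
    """Build soft targets: `confidence` on the true class, rest shared evenly.

    Hypothetical helper for illustration; assumes rows must sum to 1.
    """
    y = np.asarray(y)
    # Mass left over after the true class, split among the other classes.
    off_value = (1.0 - confidence) / (n_classes - 1)
    soft = np.full((y.shape[0], n_classes), off_value)
    # Overwrite the true-class column of each row with the confidence.
    soft[np.arange(y.shape[0]), y] = confidence
    return soft

encoded_y = smooth_targets([0, 2, 1], n_classes=3)
print(encoded_y)
```

With three classes and a confidence of 0.9, each of the two remaining classes 
receives 0.05.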

Ideally I would like to do something like this:
```
clf = SomeClassifier(*parameters, encode_targets=False)
```

and then call
```
clf.fit(X, encoded_y)
```
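
To show what fitting on already-encoded targets would amount to, here is a 
minimal sketch of a softmax regression trained by gradient descent on the 
cross-entropy against soft targets. The class `SoftTargetSoftmax` is a 
hypothetical stand-in written in plain NumPy, not an sklearn estimator, and it 
assumes every target row sums to 1.

```python
import numpy as np

class SoftTargetSoftmax:
    """Hypothetical classifier that takes encoded (soft) targets directly."""

    def __init__(self, lr=0.5, n_iter=500):
        self.lr = lr
        self.n_iter = n_iter

    def _proba(self, X):
        logits = X @ self.W_ + self.b_
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        exp = np.exp(logits)
        return exp / exp.sum(axis=1, keepdims=True)

    def fit(self, X, encoded_y):
        X = np.asarray(X, dtype=float)
        T = np.asarray(encoded_y, dtype=float)  # used as-is, no re-encoding
        self.W_ = np.zeros((X.shape[1], T.shape[1]))
        self.b_ = np.zeros(T.shape[1])
        for _ in range(self.n_iter):
            # For targets summing to 1, the gradient of the mean
            # cross-entropy with respect to the logits is (P - T) / n.
            grad = (self._proba(X) - T) / X.shape[0]
            self.W_ -= self.lr * (X.T @ grad)
            self.b_ -= self.lr * grad.sum(axis=0)
        return self

    def predict_proba(self, X):
        return self._proba(np.asarray(X, dtype=float))

# Toy data: three small clusters, one per class.
X = np.array([[0.0, 0.0], [0.1, 0.2], [2.0, 0.0],
              [2.1, 0.1], [0.0, 2.0], [0.2, 2.1]])
y = np.array([0, 0, 1, 1, 2, 2])

# Soft targets: 0.9 on the true class, 0.05 on each of the others.
encoded_y = np.full((6, 3), 0.05)
encoded_y[np.arange(6), y] = 0.9

clf = SoftTargetSoftmax().fit(X, encoded_y)
proba = clf.predict_proba(X)
print(proba.round(2))
```

The only change relative to training on hard labels is that the target matrix 
is used as given instead of being rebuilt from class indices.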

Would it be simple to modify sklearn code to do this, or would it require a lot 
of tinkering 
such as modifying every single classifier under the sun? 

Cheers,
J
