On Sun, Sep 25, 2011 at 7:05 PM, Lars Buitinck <[email protected]> wrote:

> What I'm trying to do is PU learning: fitting a binary classifier on a
> set of positive samples and a set of unlabeled samples; *no
> negatives*. Liu et al. [1, 2, 3] have a method for this called I-EM
> that first assumes all unlabeled examples are negative to fit an
> initial classifier, then iteratively executes roughly the following
> loop body (where unlabeled is a vector of indices and 1 denotes
> positive):
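(The exact loop body from the original message isn't reproduced above. As a rough, hedged sketch of such an I-EM loop — using scikit-learn's LogisticRegression as the base classifier and a hypothetical `i_em` helper, both my own assumptions, not code from the thread — it might look like:)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def i_em(X, y, unlabeled, max_iter=10):
    """Hedged sketch of an I-EM-style loop for PU learning.

    X         : feature matrix
    y         : labels; 1 for known positives, 0 for the unlabeled pool
                (i.e. all unlabeled examples start out assumed negative)
    unlabeled : vector of indices into X/y for the unlabeled examples
    """
    y = y.copy()
    clf = LogisticRegression()
    clf.fit(X, y)  # initial fit: unlabeled treated as negative
    for _ in range(max_iter):
        # E-step-like move: relabel the unlabeled pool with the
        # current classifier's predictions
        new_labels = clf.predict(X[unlabeled])
        if np.array_equal(new_labels, y[unlabeled]):
            break  # labels stabilized; stop iterating
        y[unlabeled] = new_labels
        # M-step-like move: refit on the updated labels
        clf.fit(X, y)
    return clf
```

(The stopping criterion here — labels no longer changing — is one common choice; monitoring the likelihood instead is discussed further down in this thread.)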

That seems very similar to Kamal Nigam's semi-supervised Naive Bayes.

In theory, I think EM guarantees convergence of the likelihood, but not
necessarily of the parameters or the predicted probabilities. In
practice, I don't know (monitoring the likelihood is slow, though...).
Here's a related post on the LingPipe blog (see especially the comments
at the end):

http://lingpipe-blog.com/2011/01/04/monitoring-convergence-of-em-for-map-estimates-with-priors/

Mathieu

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
