On Sun, Sep 25, 2011 at 7:05 PM, Lars Buitinck <[email protected]> wrote:
> What I'm trying to do is PU learning: fitting a binary classifier on a
> set of positive samples and a set of unlabeled samples; *no
> negatives*. Liu et al. [1, 2, 3] have a method for this called I-EM
> that first assumes all unlabeled examples are negative to fit an
> initial classifier, then iteratively executes roughly the following
> loop body (where unlabeled is a vector of indices and 1 denotes
> positive):

That seems very similar to Kamal Nigam's semi-supervised Naive Bayes.
In theory, I think EM guarantees convergence in likelihood, but not in
parameters or probabilities. In practice, I don't know (monitoring
likelihood is slow, though...). Here's a related post on the LingPipe
blog (especially the comments at the end):
http://lingpipe-blog.com/2011/01/04/monitoring-convergence-of-em-for-map-estimates-with-priors/

Mathieu

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
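The loop body itself didn't survive the quoting above, but a minimal sketch of the I-EM idea as Lars describes it, assuming a MultinomialNB base classifier and the (hypothetical) names `X`, `y`, and `unlabeled`, might look like this. This is an illustration of the iterate-and-relabel scheme, not Liu et al.'s exact procedure:

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB


def i_em(X, y, unlabeled, n_iter=10, tol=1e-4):
    """Sketch of an I-EM-style loop (names and stopping rule are assumptions).

    `y` should arrive with every index in `unlabeled` set to 0 (negative),
    per the initialization Lars describes; 1 denotes positive.
    """
    y = y.copy()
    # Initial classifier: all unlabeled samples treated as negatives.
    clf = MultinomialNB().fit(X, y)
    prev = None
    for _ in range(n_iter):
        # E-step: posterior probability of the positive class for the
        # unlabeled samples (column 1 because classes_ == [0, 1]).
        proba = clf.predict_proba(X[unlabeled])[:, 1]
        y[unlabeled] = (proba >= 0.5).astype(int)
        # M-step: refit on the relabeled data.
        clf = MultinomialNB().fit(X, y)
        # Stop when the posteriors stabilize (a cheap proxy for the
        # likelihood monitoring discussed below).
        if prev is not None and np.abs(proba - prev).max() < tol:
            break
        prev = proba
    return clf
```

Monitoring the posteriors rather than the full likelihood is exactly the kind of shortcut the LingPipe post linked below discusses: EM guarantees monotone likelihood, so the posteriors may keep drifting even after the likelihood has flattened.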
