Those two functions are apparently not computing the same thing. The mutual_info_score function is a clustering evaluation tool: it computes the mutual information between two integer cluster label assignments of the same samples. The score is non-zero only when the two labelings share some structure, that is, when knowing a sample's label in one tells you something about its label in the other. The goal is to assess how much two clusterings agree on how to split the data.
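To make this concrete, here is a minimal pure-Python sketch of the quantity such a score computes from two labelings. It is an illustration only, not scikit-learn's implementation; the helper name mutual_information and the toy label lists are made up for this example, and the result is in nats (natural logarithm).

```python
from collections import Counter
from math import log


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two labelings of the same samples.

    Hypothetical helper for illustration; not scikit-learn's code.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    n = len(labels_a)
    count_a = Counter(labels_a)                  # cluster sizes in labeling A
    count_b = Counter(labels_b)                  # cluster sizes in labeling B
    count_ab = Counter(zip(labels_a, labels_b))  # contingency table (non-empty cells)
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), written with raw counts
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


clusters = [0, 0, 1, 1]

# Identical clusterings: the score equals the entropy of the labeling, log(2).
print(mutual_information(clusters, [0, 0, 1, 1]))

# Unrelated clusterings: every cell of the contingency table is what
# independence predicts, so the score is 0.
print(mutual_information(clusters, [0, 1, 0, 1]))

# Continuous feature values are all distinct, so each sample becomes its own
# "cluster" and the score is log(2) again, whatever the values are.
print(mutual_information(clusters, [0.12, 0.57, 0.93, 0.31]))
```

The last call shows why a float-valued column is not a meaningful input here: the values are treated as opaque labels, so their magnitudes play no role in the result.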
As the data in X is a floating point representation of features (e.g. TF-IDF features for text), it does not make sense to pass it as an argument to mutual_info_score.

-- Olivier
