On Sep 29, 2009, at 8:47 AM, Sandra Clover wrote:

Hi,

I'm using Mahout 0.1 for document classification (using the
distributed Bayesian Network) and I'm getting some answers back. I
have noticed one thing that is really bugging me, and I'm wondering
whether you can help.

Problem: There are 2 versions of the Classify() method in the API.
The first returns just one answer (according to the API it returns
"the single best category"). The second says that it will "return the
top numResults, ranked by score". I have compared the results of both.
The single best category does not appear at *all* in the list of
categories returned by the second version! Strange, no? I would have
expected it to come top of the list. I have gone as deep as a
numResults value of 20 and still have not seen the best category.

Has anyone encountered this before? I would appreciate any
comments, suggestions or user experience that you would like to share.

Thanks,
Sandra.


That sounds like a bug. Can you try out the trunk version of Mahout and see if it is still there? A lot of the classification stuff has been reworked recently (at the moment I'm not even sure that those two classify methods are still in the code!)
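
When comparing the two methods, a small consistency check along these lines might help pin down the mismatch. It is only a sketch and does not call Mahout at all: `best` and `top` are hypothetical stand-ins for the single-result and top-numResults classify methods, the scores are made up, and it assumes that a higher score means a better match. The property it tests is the one you expected: the single best category should be the first entry of the ranked list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopNConsistencyCheck {

    // Hypothetical stand-in for the single-result classify method:
    // returns the label with the highest score, or null if there are no scores.
    public static String best(Map<String, Double> scores) {
        String bestLabel = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            if (e.getValue() > bestScore) {
                bestScore = e.getValue();
                bestLabel = e.getKey();
            }
        }
        return bestLabel;
    }

    // Hypothetical stand-in for the top-numResults classify method:
    // returns up to numResults labels, highest score first.
    public static List<String> top(Map<String, Double> scores, int numResults) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < Math.min(numResults, entries.size()); i++) {
            labels.add(entries.get(i).getKey());
        }
        return labels;
    }

    // The property under test: the single best label heads the ranked list.
    public static boolean consistent(Map<String, Double> scores, int numResults) {
        List<String> ranked = top(scores, numResults);
        return !ranked.isEmpty() && ranked.get(0).equals(best(scores));
    }

    public static void main(String[] args) {
        // Made-up scores for three made-up categories.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("sports", 0.1);
        scores.put("politics", 0.7);
        scores.put("science", 0.2);

        System.out.println("best       = " + best(scores));
        System.out.println("top 2      = " + top(scores, 2));
        System.out.println("consistent = " + consistent(scores, 2));
    }
}
```

To use the same idea against the real classifier, replace the two stand-ins with the actual calls and run the check over a handful of documents. Any document for which it returns false would make a good minimal test case for a bug report.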
