On Thu, Sep 16, 2010 at 11:33:48AM +0000, somnath wrote:
> msf <msf <at> kisoku.net> writes:
> 
> > 
> > Hi everyone, 
> > 
> > I've been attempting to use TestClassifier on a directory of roughly
> > 49,000 small text files. When running the following command I receive a
> > NullPointerException in ConfusionMatrix.getCount(). I've attached the
> > full verbose output of the mahout run plus the stacktrace. 
> > 
> > This is on 0.4-SNAPSHOT running today's HEAD plus the small patch to
> > BayesFileFormatter I submitted in MAHOUT-488.
> > 
> > Any pointers on how to go about resolving this problem ?
> > 
> > Thanks, 
> > 
> 
> 
> Hi,
> Please check the labels of the test data. The labels of the test data and the
> training data should be the same for the classifier.

Hi, 

I did eventually figure out what I was doing wrong with the classifier:
I had not been converting my test data to the same format used to create
the model. I worked it out by running the newsgroups example code and
looking at the example source files to see exactly what was being done
to prepare the data. The current documentation for testclassifier is not
clear enough, in my opinion.
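
For anyone hitting the same problem, here is a minimal sketch of the idea in
plain Java. It is not Mahout's code: ConsistentPrep, normalize, toLine and
unseenLabels are hypothetical names, and the tab-separated line layout is
only an assumption for illustration. The point is that training and test
documents go through the same preparation step, and that the test labels
are checked against the training labels before testing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class ConsistentPrep {

    // Hypothetical stand-in for the real formatting step: lowercases the
    // text and splits it on runs of non-alphanumeric characters.
    static List<String> normalize(String rawText) {
        List<String> tokens = new ArrayList<>();
        for (String t : rawText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Assumed layout for illustration: label, a tab, then the tokens.
    // The same method is used for both training and test documents.
    static String toLine(String label, String rawText) {
        return label + "\t" + String.join(" ", normalize(rawText));
    }

    // Labels that occur in the test set but were never seen in training.
    static Set<String> unseenLabels(Set<String> trainLabels, Set<String> testLabels) {
        Set<String> unseen = new TreeSet<>(testLabels);
        unseen.removeAll(trainLabels);
        return unseen;
    }

    public static void main(String[] args) {
        Set<String> train = new TreeSet<>(Arrays.asList("sci.space", "rec.autos"));
        Set<String> test = new TreeSet<>(Arrays.asList("sci.space", "misc.forsale"));

        // Both sets of documents pass through the identical toLine() step.
        System.out.println(toLine("sci.space", "Hello, World!"));

        // A non-empty result means the test data has labels the model never saw.
        System.out.println("Unseen labels: " + unseenLabels(train, test));
    }
}
```

If unseenLabels returns anything, or the test files were not produced by the
same preparation step as the training files, I would fix that before
running the test.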

I'll look into what I can do to improve the doco.

Thanks, 

-- 
Mathieu Sauve-Frankel
