I think Peng is right. It might help to amplify a bit.
The idea is that, in addition to the other predictor variables that you
have, there is also one predictor variable per cluster. The variable for
whichever cluster is closest to the training example is turned on.
On Wikipedia, the term used is "one hot" encoding.
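To make the idea concrete, here is a minimal sketch of that encoding. The centroids, the helper name `add_cluster_features`, and the use of NumPy are all illustrative assumptions, not Mahout code: the point is just that one extra binary predictor per cluster is appended, with a 1 for the nearest cluster.

```python
import numpy as np

# Hypothetical centroids from a prior clustering step (e.g. k-means);
# three clusters in a 2-D feature space. These values are made up.
centroids = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])

def add_cluster_features(x):
    """Append one one-hot predictor per cluster: the entry for the
    nearest centroid is 1.0, all others are 0.0."""
    nearest = int(np.argmin(np.linalg.norm(centroids - x, axis=1)))
    one_hot = np.zeros(len(centroids))
    one_hot[nearest] = 1.0
    return np.concatenate([x, one_hot])

# A point near centroid 1 gets [x0, x1, 0, 1, 0].
print(add_cluster_features(np.array([4.0, 4.5])))
```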
I will check it thanks.
On 06/05/14 09:32, Ted Dunning wrote:
Yes
On Mon, May 5, 2014 at 10:51 PM, Andrew Palumbo wrote:
> Jossef,
> Does your training set have any features with a zero value for all
> instances?
>
> > Date: Mon, 5 May 2014 08:33:37 +0300
> > Subject: RE: Mahout Naive Bayes CSV Classification
> > From: josse...@gmail.com
> > To: user@maho
This would lead to that term not being counted by
NaiveBayesModel.numFeatures(), which returns the number of features (term
counts, if this were a text classification problem) that have a non-zero
count across the entire input set.
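A small sketch of that counting rule, assuming the behavior described above (the matrix and variable names are illustrative, not taken from Mahout's implementation): a feature whose count is zero for every instance does not contribute to the feature count.

```python
import numpy as np

# Toy term-count matrix: rows = instances, columns = features (terms).
# Column 2 is zero for every instance.
X = np.array([
    [2, 0, 0, 1],
    [0, 3, 0, 0],
    [1, 1, 0, 4],
])

# Mirrors the described numFeatures() behavior: only features with a
# non-zero count summed across the entire input set are counted.
num_features = int(np.count_nonzero(X.sum(axis=0)))
print(num_features)  # 3, not 4 -- the all-zero column is not counted
```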
> From: josse...@gmail.com
> Date: Tu