Re: Using clustering output for classification

Angel Luis Scull Tue, 06 May 2014 08:33:38 -0700

I will check it, thanks.
On 06/05/14 09:32, Ted Dunning wrote:

I think Peng is right.  It might help to amplify a bit.


The idea is that in addition to the other predictor variables that you
have, there is also one predictor variable per cluster.  Whichever cluster
is closest to the training example is turned on.

On Wikipedia, the term used is "one hot" encoding.

http://en.wikipedia.org/wiki/One-hot
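
A minimal sketch of the encoding described above, in Python. Nothing here comes from the thread itself: the centroids, the sample point, and the function names are hypothetical, and Euclidean distance is assumed as the measure of "closest".

```python
import math

def nearest_cluster(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def one_hot(index, k):
    """Return a list of k zeros with a single 1 at position index."""
    return [1 if i == index else 0 for i in range(k)]

def add_cluster_features(point, centroids):
    """Append one predictor per cluster to the original predictors."""
    idx = nearest_cluster(point, centroids)
    return list(point) + one_hot(idx, len(centroids))

# Hypothetical centroids, as if produced by an earlier k-means run with k = 3.
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]

# Hypothetical training example with two original predictor variables.
example = (4.0, 6.0)

augmented = add_cluster_features(example, centroids)
print(augmented)  # [4.0, 6.0, 0, 1, 0]
```

The first two values are the original predictors; the last three are the per-cluster predictors, with only the one for the nearest centroid turned on.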




On Tue, May 6, 2014 at 4:02 AM, Peng Zhang <pzhang.x...@gmail.com> wrote:

Angel,

I think Ted means that each example falls into exactly one cluster. If you
have k clusters, each example gets one of the encodings 1, 2, …, k.

On May 6, 2014, at 5:27 AM, Angel Luis Scull <ascu...@facinf.uho.edu.cu>
wrote:

What do you mean by "get a 1 of n encoding..."?

On 05/05/14 16:59, Ted Dunning wrote:

In theory, what you need to do is take your training data for your
classifier and run your clustering to get a 1 of n encoding of the cluster
for each example in the training data.

Then train the classifier using original and new features.
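
The two steps above (cluster the training data, then train on the original plus the new features) might be sketched as follows in Python. This is an illustration under stated assumptions, not the R demo mentioned in this message: the toy data, the helper names, the deterministic k-means initialisation, and the perceptron used as a stand-in classifier are all hypothetical choices.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means. Assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        for i, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def augment(point, centroids):
    """Original features followed by a 1 of n encoding of the nearest cluster."""
    k = len(centroids)
    nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
    return list(point) + [1.0 if i == nearest else 0.0 for i in range(k)]

def train_perceptron(rows, labels, epochs=50):
    """Stand-in classifier (labels are 0 or 1); any classifier could be used here."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = y - predict(x, weights, bias)
            if error:
                weights = [w + error * v for w, v in zip(weights, x)]
                bias += error
    return weights, bias

def predict(x, weights, bias):
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0

# Hypothetical toy training set: two original features per example.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
labels = [0, 0, 0, 1, 1, 1]

centroids = kmeans(points, k=2)                   # step 1: cluster the training data
rows = [augment(p, centroids) for p in points]    # step 2: original + cluster features
weights, bias = train_perceptron(rows, labels)    # step 3: train on the combined features

predictions = [predict(r, weights, bias) for r in rows]
print(predictions)
```

At prediction time a new example would go through the same `augment` call, using the centroids learned from the training data, before being passed to the classifier.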

Does that help?  I have a simple demo of the process in R that I can share
if that would help.




On Mon, May 5, 2014 at 5:53 PM, Angel Luis Scull
<ascu...@facinf.uho.edu.cu> wrote:

Hello to all

I have a document dataset to which I applied k-means and obtained a set of
clusters. Now I want to use the association between the vectors and the
clusters as input for a classification algorithm.

How can I achieve that?

thanks in advance
