Re: Classification beginner questions

2011-06-16 Thread Ted Dunning
A full sort is not usually feasible/desirable. Better to just keep a pool of samples and replace random samples. On Thu, Jun 16, 2011 at 2:41 AM, Lance Norskog wrote: > Use a crypto-hash on the base data as the sorting key. The base data > is the value (payload). That should randomly permute th
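Ted's "keep a pool of samples and replace random samples" is essentially reservoir sampling. A minimal plain-Java sketch of that idea (illustrative only, not Mahout code; the class and method names are made up for this example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ReservoirSample {
    // Keep a fixed-size pool; after the pool fills, each new item replaces
    // a random slot with probability poolSize / itemsSeen, so every item in
    // the stream ends up in the pool with equal probability.
    static <T> List<T> sample(Iterable<T> stream, int poolSize, Random rng) {
        List<T> pool = new ArrayList<>(poolSize);
        long seen = 0;
        for (T item : stream) {
            seen++;
            if (pool.size() < poolSize) {
                pool.add(item);
            } else {
                long j = (long) (rng.nextDouble() * seen);
                if (j < poolSize) {
                    pool.set((int) j, item);
                }
            }
        }
        return pool;
    }
}
```

This gives a uniform random subsample in one pass over the data, with no sort at all.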

Re: Classification beginner questions

2011-06-15 Thread Lance Norskog
Use a crypto-hash on the base data as the sorting key. The base data is the value (payload). That should randomly permute things. On Wed, Jun 15, 2011 at 2:50 PM, Ted Dunning wrote: > It is already in Mahout, I think. > > On Tue, Jun 14, 2011 at 5:48 AM, Lance Norskog wrote: > >> Coding a permut
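Lance's crypto-hash trick works because a cryptographic hash of the payload behaves like a pseudo-random sort key: sorting by it yields a deterministic but effectively random permutation. A small plain-Java sketch (the class name is illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class HashShuffle {
    // SHA-1 of the record payload, rendered as hex, used as the sort key.
    static String hexHash(String payload) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-1")
                    .digest(payload.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : d) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sorting by hash permutes the records in a random-looking but
    // reproducible order -- exactly what SGD wants from its input.
    static List<String> shuffle(List<String> records) {
        List<String> out = new ArrayList<>(records);
        out.sort(Comparator.comparing(HashShuffle::hexHash));
        return out;
    }
}
```

In Map/Reduce the same idea is a mapper that emits the hash as the key; the shuffle phase then does the sort for free.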

Re: Classification beginner questions

2011-06-15 Thread Ted Dunning
It is already in Mahout, I think. On Tue, Jun 14, 2011 at 5:48 AM, Lance Norskog wrote: > Coding a permutation like this in Map/Reduce is a good beginner exercise. > > On Sun, Jun 12, 2011 at 11:34 PM, Ted Dunning > wrote: > > But the key is that you have to have both kinds of samples. Moreove

Re: Classification beginner questions

2011-06-13 Thread Lance Norskog
Coding a permutation like this in Map/Reduce is a good beginner exercise. On Sun, Jun 12, 2011 at 11:34 PM, Ted Dunning wrote: > But the key is that you have to have both kinds of samples.  Moreover, > for all of the stochastic gradient descent work, you need to have them > in a random-ish order.

Re: Classification beginner questions

2011-06-12 Thread Ted Dunning
But the key is that you have to have both kinds of samples. Moreover, for all of the stochastic gradient descent work, you need to have them in a random-ish order. You can't show all of one category and then all of another. It is even worse if you sort your data. On Mon, Jun 13, 2011 at 5:35 AM

Re: Classification beginner questions

2011-06-12 Thread Hector Yee
If you have a much larger background set you can try online passive aggressive in mahout 0.6, as it uses hinge loss and does not update the model if it gets things correct. Log loss, in contrast, will always have a gradient. On Jun 12, 2011 7:54 AM, "Joscha Feth" wrote: > Hi Ted, > > I see. Only for
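Hector's point about the two losses can be made concrete. For a label y in {-1, +1} and a raw score s = w·x, hinge loss is max(0, 1 - y*s), so its gradient is exactly zero once an example is classified correctly with margin; log loss is log(1 + exp(-y*s)), whose gradient is never exactly zero. A small sketch (names are illustrative):

```java
public class LossGradients {
    // hinge loss L = max(0, 1 - y*s);        dL/ds = 0 if y*s >= 1, else -y
    static double hingeGrad(double y, double s) {
        return (y * s >= 1.0) ? 0.0 : -y;
    }

    // log loss   L = log(1 + exp(-y*s));     dL/ds = -y / (1 + exp(y*s))
    // This is never exactly zero, so every example moves the model a little.
    static double logGrad(double y, double s) {
        return -y / (1.0 + Math.exp(y * s));
    }
}
```

This is why a passive-aggressive (hinge-based) learner skips updates on confidently correct examples, which matters when a huge background class would otherwise dominate training.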

Re: Classification beginner questions

2011-06-12 Thread Ted Dunning
An infinite number of samples is fine. It is still true that you need to have training samples from all of the target categories. On Sun, Jun 12, 2011 at 2:53 PM, Joscha Feth wrote: > Hi Ted, > > I see. Only for the OLR or also for any other algorithm? What if my > other category theoretically c

Re: Classification beginner questions

2011-06-12 Thread Joscha Feth
Hi Ted, I see. Only for the OLR or also for any other algorithm? What if my other category theoretically contains an infinite number of samples? Cheers, Joscha On 12.06.2011 at 15:08, Ted Dunning wrote: > Joscha, > > There is no implicit training. you need to give negative examples as > well

Re: Classification beginner questions

2011-06-12 Thread Ted Dunning
Joscha, There is no implicit training. You need to give negative examples as well as positive. On Sat, Jun 11, 2011 at 9:08 AM, Joscha Feth wrote: > Hello Ted, > > thanks for your response! > What I wanted to accomplish is actually quite simple in theory: I have some > sentences which have thi
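Ted's point is that a binary learner only learns from the contrast between the two labels: calling train(0, ...) for every example (as in the OLRTest snippet later in the thread) gives it nothing to separate. A tiny plain-Java logistic-regression sketch of what "give both kinds of samples" means (this is not the Mahout OnlineLogisticRegression API; all names here are illustrative):

```java
public class TinyLogistic {
    double[] w;
    double rate;

    TinyLogistic(int features, double rate) {
        this.w = new double[features];
        this.rate = rate;
    }

    // Sigmoid of the dot product: probability of the positive class.
    double score(double[] x) {
        double s = 0;
        for (int i = 0; i < w.length; i++) s += w[i] * x[i];
        return 1.0 / (1.0 + Math.exp(-s));
    }

    // target is 0 or 1. Both kinds of examples must be supplied; if every
    // call passes the same target, the weights just drift one way and the
    // model learns nothing about the boundary.
    void train(int target, double[] x) {
        double err = target - score(x);
        for (int i = 0; i < w.length; i++) w[i] += rate * err * x[i];
    }
}
```

After alternating positive and negative examples, the model pushes the two scores apart; with only one label it cannot.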

Re: Classification beginner questions

2011-06-11 Thread Joscha Feth
Hello Ted, thanks for your response! What I wanted to accomplish is actually quite simple in theory: I have some sentences which have things in common (like some similar words for example). I want to train my model with these example sentences I have. Once it is trained I want to give an unknown s

Re: Classification beginner questions

2011-06-11 Thread Joscha Feth
Hello Sebastian, Thanks for the hint, I did get the MEAP edition of the ebook already through Manning, however I find myself struggling to translate the newsgroup and Wikipedia examples to my use case. In particular I can't seem to find any code examples which help me with the generation o

Re: Classification beginner questions

2011-06-11 Thread Joscha Feth
Hector, thank you very much for your response, I adapted my example: -- 8< -- public class OLRTest { private static final String[] animals = new String[] { "alligator", "ant", "bear", "bee", "bird", "camel", "cat", "cheetah", "chicken", "chimpanzee", "cow", "crocodile

Re: Classification beginner questions

2011-06-10 Thread Ted Dunning
The target variable here is always zero. Shouldn't it vary? On Fri, Jun 10, 2011 at 9:54 AM, Joscha Feth wrote: >            algorithm.train(0, generateVector(animal)); >

Re: Classification beginner questions

2011-06-10 Thread Sebastian Schelter
Hi Joscha, If you have some money left, I'd recommend getting a copy of Mahout in Action, which features a very readable, detailed introduction to classification with Mahout, including strategies for feature selection. --sebastian On 10.06.2011 17:28, Hector Yee wrote: Oh you have a very

Re: Classification beginner questions

2011-06-10 Thread Hector Yee
Oh, you have a very strange feature: you are using the label as a feature, my bad. I thought the words were the labels. Usually it's something meaningful like weight or height. If it's just the label like you have, you might as well use a hash map; there is no feature to learn! But if you
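For Joscha's sentence-classification use case, a common way to turn text into features the learner can generalize over is feature hashing: each word is mapped into a fixed-size vector by its hash. A minimal plain-Java sketch (illustrative; Mahout has its own encoders, which are not shown here):

```java
public class HashedFeatures {
    // Encode a sentence's words into a fixed-size feature vector via
    // hashing. Different words can collide into the same slot; with a
    // large enough vector the learner tolerates this.
    static double[] encode(String sentence, int dims) {
        double[] v = new double[dims];
        for (String word : sentence.toLowerCase().split("\\s+")) {
            int idx = Math.floorMod(word.hashCode(), dims);
            v[idx] += 1.0;
        }
        return v;
    }
}
```

Unlike using the label itself as the feature, two sentences sharing words now produce overlapping vectors, which is what lets the model generalize to unseen sentences.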

Re: Classification beginner questions

2011-06-10 Thread Hector Yee
It's the one with the highest score. The relative score compared to other classes matters more than the absolute value, especially when you have many classes like you have. Even with logistic regression my personal preference is to use the noLink function and use that score. Sent from my iPad On Jun 10
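"It's the one with the highest score" is just an argmax over the per-class scores; a one-method sketch (the class name is illustrative):

```java
public class ArgMax {
    // Return the index of the class with the highest score. Only the
    // relative ordering of the scores matters, not their absolute values.
    static int best(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) best = i;
        }
        return best;
    }
}
```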

Classification beginner questions

2011-06-10 Thread Joscha Feth
Hello fellow Mahouts, I am trying to grasp Mahout and generated a very simple (but obviously wrong) example which I hoped would help me understand how everything works: -- 8< -- public class OLRTest { private static final int FEATURES = 1; private static final int CATEGORIES = 2; pr