Thanks. We are trying to get a larger dataset, probably over 2000 examples for each class. What do you mean by "the errors on performance estimates"? The confusion matrix?
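If you mean the sampling uncertainty on the accuracy figures rather than the confusion matrix, here is a rough sketch of how I would estimate it. It treats each accuracy as a binomial proportion measured on a held-out test set. The test-set size `n_test` below is a made-up placeholder (I have not said how the data was split), and the function names are mine, not anything from Mahout.

```python
import math


def accuracy_standard_error(accuracy, n_test):
    """Binomial standard error of an accuracy measured on n_test held-out examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)


def confidence_interval(accuracy, n_test, z=1.96):
    """Approximate 95% interval (normal approximation), clipped to [0, 1]."""
    half_width = z * accuracy_standard_error(accuracy, n_test)
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


if __name__ == "__main__":
    # ASSUMPTION: hypothetical held-out set size; replace with the real one.
    n_test = 150

    # Accuracies reported earlier in this thread.
    reported = [("feature A", 0.65), ("feature B", 0.80), ("A + B", 0.72)]

    for name, acc in reported:
        low, high = confidence_interval(acc, n_test)
        print(f"{name}: {acc:.2f} (approx. 95% interval {low:.3f} to {high:.3f})")
```

If the intervals for the different feature sets overlap heavily, the differences between them may just be noise from the small sample, which would be one more reason to get the larger dataset.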
On Jul 11, 2011, at 2:44 PM, Konstantin Shmakov wrote:

> It seems that the training data set is way too small. What are the errors
> on performance estimates?
>
> --
>
> On Mon, Jul 11, 2011 at 2:26 PM, Weihua Zhu <w...@adconion.com> wrote:
>> The target class is whether a user clicks an ad (advertisement), buys
>> through an ad, or does neither; so 3 classes.
>> Feature A is about the advertisement itself;
>> Feature B is about the user's behaviors;
>> Currently I'm only using features A and B.
>> Total training data is 250 for each class;
>>
>> thanks..
>>
>> ________________________________________
>> From: Ted Dunning [ted.dunn...@gmail.com]
>> Sent: Monday, July 11, 2011 2:15 PM
>> To: user@mahout.apache.org
>> Subject: Re: combination of features worsen the performance
>>
>> Can you say a little bit about the data?
>>
>> What are features A and B? What kind of data do they represent?
>>
>> How many other features are there?
>>
>> What is the target variable? How many possible values does it have?
>>
>> How much training data do you have?
>>
>> What sort of training are you doing?
>>
>> On Mon, Jul 11, 2011 at 2:08 PM, Weihua Zhu <w...@adconion.com> wrote:
>>
>>> Hi, Dear all,
>>>
>>> I am using Mahout logistic regression for classification. Interestingly,
>>> features A and B each perform satisfactorily on their own, say 65% and
>>> 80%, but when I combine them (using an encoder), the performance is
>>> about 72%. Shouldn't the performance be better? Any thoughts? Thanks a lot,
>>>
>>> -wz.
>>>
>>
>
> --
> ksh: