Hi, thanks Ted.
I understand that the training dataset is small. The reason is that we
have a very limited number of "action" class events/instances. We also want
each target class to have an equal number of events/instances.
Feature A is the advertisement campaign ID, and Feature B describes the
internet user's behaviors, for example gender: male, country: us, etc.
I set the size of the encoded vector to 10000, which is very large.
I used this setup for OnlineLogisticRegression:
        olr = new OnlineLogisticRegression(3, FEATURES, new L1());
        olr.alpha(1).stepOffset(1000).lambda(3e-5).learningRate(3);
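
To make the encoding step concrete, here is a minimal sketch of how the campaign ID (feature A) and the user attributes (feature B) can be hashed into one vector of size 10000. It is plain Java and only illustrates the hashing idea; it is not Mahout's encoder implementation, and the class name, the slot() helper, and the "campaign"/"user" prefixes are hypothetical names chosen for the example.

```java
import java.util.Arrays;
import java.util.List;

public class HashedEncodingSketch {
    // Encoded vector size mentioned in this message.
    static final int FEATURES = 10000;

    // Hypothetical helper: map a (feature name, value) pair to a slot in [0, FEATURES).
    static int slot(String name, String value) {
        int h = (name + ":" + value).hashCode();
        return Math.floorMod(h, FEATURES);
    }

    // Encode feature A (campaign ID) and feature B (user attributes) into one vector.
    // Each value adds weight 1.0 to its hashed slot; colliding values share a slot.
    static double[] encode(String campaignId, List<String> userAttributes) {
        double[] v = new double[FEATURES];
        v[slot("campaign", campaignId)] += 1.0;
        for (String attr : userAttributes) {
            v[slot("user", attr)] += 1.0;
        }
        return v;
    }

    public static void main(String[] args) {
        // Made-up example values, for illustration only.
        double[] v = encode("campaign-42", Arrays.asList("gender:male", "country:us"));
        double total = 0;
        int nonZero = 0;
        for (double x : v) {
            total += x;
            if (x != 0) {
                nonZero++;
            }
        }
        System.out.println("non-zero slots: " + nonZero + ", total weight: " + total);
    }
}
```

With 250 examples per class (750 in total), a vector like this has far more slots than there are training examples, so only a small fraction of the 10000 weights ever receives a non-zero input.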
 
Thanks.

-wz


On Jul 11, 2011, at 2:49 PM, Ted Dunning wrote:

> This is a tiny amount of data.  The regularization in Mahout's SGD
> implementation is probably not as effective as second order techniques for
> such tiny data.
> 
> Btw... you didn't answer my questions about what kind of data feature A and
> B are.  I understand that you might be shy about this, but without that kind
> of information, I can't help you.
> 
> (and add this additional question)
> 
> What is the size of the encoded vector?
> 
> On Mon, Jul 11, 2011 at 2:26 PM, Weihua Zhu <[email protected]> wrote:
> 
>> The target class is whether a user clicks an ad (advertisement), buys
>> through an ad, or does neither; so 3 classes.
>> Feature A is about the advertisement itself;
>> Feature B is about the user's behaviors;
>> Currently I'm only using features A and B.
>> Total training data is 250 for each class;
>> 
>> thanks..
>> 
>> 
>> ________________________________________
>> From: Ted Dunning [[email protected]]
>> Sent: Monday, July 11, 2011 2:15 PM
>> To: [email protected]
>> Subject: Re: combination of features worsen the performance
>> 
>> Can you say a little bit about the data?
>> 
>> What are features A and B?  What kind of data do they represent?
>> 
>> How many other features are there?
>> 
>> What is the target variable?  How many possible values does it have?
>> 
>> How much training data do you have?
>> 
>> What sort of training are you doing?
>> 
>> 
>> 
>> On Mon, Jul 11, 2011 at 2:08 PM, Weihua Zhu <[email protected]> wrote:
>> 
>>> Hi, Dear all,
>>> 
>>> I am using Mahout logistic regression for classification; interestingly,
>>> features A and B individually each have satisfactory performance, say 65%
>>> and 80%, but when I combine them together (using the encoder), the
>>> performance is about 72%. Shouldn't the performance be better? Any
>>> thoughts? Thanks a lot,
>>> 
>>> 
>>> -wz.
>>> 
>> 
