Re: [R] boosting - second posting

2006-05-30 Thread Weiwei Shi
As I recall, if you use distribution = "bernoulli", you also don't need to
wrap your response in as.factor(); a plain numeric 0/1 response works.
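A minimal sketch of that setup (synthetic data; all object names here are illustrative, not from the original post):

```r
library(gbm)

# Synthetic two-class problem; the response is plain numeric 0/1,
# so no as.factor() is needed with distribution = "bernoulli"
set.seed(1)
train <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
train$y <- as.numeric(train$x1 + train$x2 > 0)

fit <- gbm(y ~ x1 + x2,
           data = train,
           distribution = "bernoulli",  # classification, not "gaussian"
           n.trees = 500,
           shrinkage = 0.01,
           cv.folds = 3,
           verbose = FALSE)

best.iter <- gbm.perf(fit, method = "cv", plot.it = FALSE)

# type = "response" returns probabilities, so everything lands in [0, 1]
p <- predict(fit, train, n.trees = best.iter, type = "response")
range(p)
```

With distribution = "gaussian", predict() instead returns values on the raw
regression scale, which is why the original predictions fall outside [0, 1].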

Weiwei

On 5/30/06, Kuhn, Max <[EMAIL PROTECTED]> wrote:
>
> The distribution argument appears to be the problem. Either "bernoulli"
> or "adaboost" is appropriate for classification problems.
>
> Max
>
> > Perhaps by following the Posting Guide you're likely to get more helpful
> > responses.  You have not shown an example that others can reproduce, nor
> > given version information for R or gbm.  The output you showed does not
> > use type="response", either.
> >
> > Andy
> >
> >   _
> >
> > From: r-help-bounces at stat.math.ethz.ch on behalf of stephenc
> > Sent: Sat 5/27/2006 4:02 PM
> > To: 'R Help'
> > Subject: [R] boosting - second posting [Broadcast]
> >
> >
> >
> > Hi
> >
> > I am using boosting for a classification and prediction problem.
> >
> > For some reason it is giving me an outcome that doesn't fall between 0
> > and 1 for the predictions.  I have tried type="response" but it made no
> > difference.
> >
> > Can anyone see what I am doing wrong?
> >
> > Screen output shown below:
> >
> >
> > > boost.model <- gbm(as.factor(train$simNuance) ~ ., # formula
> > +  data=train,   # dataset
> > +   # +1: monotone increase,
> > +   #  0: no monotone restrictions
> > +  distribution="gaussian", # bernoulli, adaboost, gaussian,
> > +   # poisson, and coxph available
> > +  n.trees=3000,# number of trees
> > +  shrinkage=0.005, # shrinkage or learning rate,
> > +   # 0.001 to 0.1 usually work
> > +  interaction.depth=3, # 1: additive model, 2: two-way interactions, etc.
> > +  bag.fraction = 0.5,  # subsampling fraction, 0.5 is probably best
> > +  train.fraction = 0.5,# fraction of data for training,
> > +   # first train.fraction*N used for training
> > +  n.minobsinnode = 10, # minimum total weight needed in each node
> > +  cv.folds = 5,# do 5-fold cross-validation
> > +  keep.data=TRUE,  # keep a copy of the dataset with the object
> > +  verbose=FALSE)# print out progress
> > >
> > > best.iter = gbm.perf(boost.model, method="cv")
> > > pred = predict.gbm(boost.model, test, best.iter)
> > > summary(pred)
> >    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> >  0.4772  1.5140  1.6760  1.5100  1.7190  1.9420
> --
> LEGAL NOTICE\ Unless expressly stated otherwise, this messag...{{dropped}}
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>



-- 
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III




[R] boosting - second posting

2006-05-30 Thread Kuhn, Max
The distribution argument appears to be the problem. Either "bernoulli" or
"adaboost" is appropriate for classification problems.

Max
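
To make the fix concrete, here is a sketch of the original call with only the distribution argument changed (the poster's train and test objects are assumed; simNuance is assumed to be coded 0/1 so as.factor() can be dropped for "bernoulli"):

```r
library(gbm)

# Same settings as the original post, but with a classification
# distribution instead of "gaussian"; simNuance assumed coded 0/1
boost.model <- gbm(simNuance ~ .,
                   data = train,
                   distribution = "bernoulli",  # or "adaboost"
                   n.trees = 3000,
                   shrinkage = 0.005,
                   interaction.depth = 3,
                   bag.fraction = 0.5,
                   train.fraction = 0.5,
                   n.minobsinnode = 10,
                   cv.folds = 5,
                   keep.data = TRUE,
                   verbose = FALSE)

best.iter <- gbm.perf(boost.model, method = "cv")

# type = "response" maps the predictions onto the probability scale
pred <- predict(boost.model, test, n.trees = best.iter, type = "response")
summary(pred)  # now bounded by 0 and 1
```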

> Perhaps by following the Posting Guide you're likely to get more helpful
> responses.  You have not shown an example that others can reproduce, nor
> given version information for R or gbm.  The output you showed does not
> use type="response", either.
>  
> Andy
> 
>   _  
> 
> From: r-help-bounces at stat.math.ethz.ch on behalf of stephenc
> Sent: Sat 5/27/2006 4:02 PM
> To: 'R Help'
> Subject: [R] boosting - second posting [Broadcast]
> 
> 
> 
> Hi 
>   
> I am using boosting for a classification and prediction problem. 
>   
> For some reason it is giving me an outcome that doesn't fall between 0
> and 1 for the predictions.  I have tried type="response" but it made no
> difference.
>   
> Can anyone see what I am doing wrong? 
>   
> Screen output shown below: 
>   
>   
> > boost.model <- gbm(as.factor(train$simNuance) ~ ., # formula
> +  data=train,   # dataset
> +   # +1: monotone increase,
> +   #  0: no monotone restrictions
> +  distribution="gaussian", # bernoulli, adaboost, gaussian,
> +   # poisson, and coxph available
> +  n.trees=3000,# number of trees
> +  shrinkage=0.005, # shrinkage or learning rate,
> +   # 0.001 to 0.1 usually work
> +  interaction.depth=3, # 1: additive model, 2: two-way interactions, etc.
> +  bag.fraction = 0.5,  # subsampling fraction, 0.5 is probably best
> +  train.fraction = 0.5,# fraction of data for training,
> +   # first train.fraction*N used for training
> +  n.minobsinnode = 10, # minimum total weight needed in each node
> +  cv.folds = 5,# do 5-fold cross-validation
> +  keep.data=TRUE,  # keep a copy of the dataset with the object
> +  verbose=FALSE)# print out progress
> >
> > best.iter = gbm.perf(boost.model, method="cv")
> > pred = predict.gbm(boost.model, test, best.iter)
> > summary(pred)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.4772  1.5140  1.6760  1.5100  1.7190  1.9420



Re: [R] boosting - second posting

2006-05-27 Thread Liaw, Andy
Perhaps by following the Posting Guide you're likely to get more helpful
responses.  You have not shown an example that others can reproduce, nor
given version information for R or gbm.  The output you showed does not use
type="response", either.
 
Andy

  _  

From: [EMAIL PROTECTED] on behalf of stephenc
Sent: Sat 5/27/2006 4:02 PM
To: 'R Help'
Subject: [R] boosting - second posting [Broadcast]



Hi 
  
I am using boosting for a classification and prediction problem. 
  
For some reason it is giving me an outcome that doesn't fall between 0 
and 1 for the predictions.  I have tried type="response" but it made no 
difference. 
  
Can anyone see what I am doing wrong? 
  
Screen output shown below: 
  
  
> boost.model <- gbm(as.factor(train$simNuance) ~ ., # formula
+  data=train,   # dataset
+   # +1: monotone increase,
+   #  0: no monotone restrictions
+  distribution="gaussian", # bernoulli, adaboost, gaussian,
+   # poisson, and coxph available
+  n.trees=3000,# number of trees
+  shrinkage=0.005, # shrinkage or learning rate,
+   # 0.001 to 0.1 usually work
+  interaction.depth=3, # 1: additive model, 2: two-way interactions, etc.
+  bag.fraction = 0.5,  # subsampling fraction, 0.5 is probably best
+  train.fraction = 0.5,# fraction of data for training,
+   # first train.fraction*N used for training
+  n.minobsinnode = 10, # minimum total weight needed in each node
+  cv.folds = 5,# do 5-fold cross-validation
+  keep.data=TRUE,  # keep a copy of the dataset with the object
+  verbose=FALSE)# print out progress
>
> best.iter = gbm.perf(boost.model, method="cv")
> pred = predict.gbm(boost.model, test, best.iter)
> summary(pred)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.4772  1.5140  1.6760  1.5100  1.7190  1.9420



[R] boosting - second posting

2006-05-27 Thread stephenc
Hi
 
I am using boosting for a classification and prediction problem.
 
For some reason it is giving me an outcome that doesn't fall between 0
and 1 for the predictions.  I have tried type="response" but it made no
difference.
 
Can anyone see what I am doing wrong?
 
Screen output shown below:
 
 
> boost.model <- gbm(as.factor(train$simNuance) ~ ., # formula
+  data=train,   # dataset
+   # +1: monotone increase,
+   #  0: no monotone restrictions
+  distribution="gaussian", # bernoulli, adaboost, gaussian,
+   # poisson, and coxph available
+  n.trees=3000,# number of trees
+  shrinkage=0.005, # shrinkage or learning rate,
+   # 0.001 to 0.1 usually work
+  interaction.depth=3, # 1: additive model, 2: two-way interactions, etc.
+  bag.fraction = 0.5,  # subsampling fraction, 0.5 is probably best
+  train.fraction = 0.5,# fraction of data for training,
+   # first train.fraction*N used for training
+  n.minobsinnode = 10, # minimum total weight needed in each node
+  cv.folds = 5,# do 5-fold cross-validation
+  keep.data=TRUE,  # keep a copy of the dataset with the object
+  verbose=FALSE)# print out progress
>
> best.iter = gbm.perf(boost.model, method="cv")
> pred = predict.gbm(boost.model, test, best.iter)
> summary(pred)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.4772  1.5140  1.6760  1.5100  1.7190  1.9420

