Re: [R] boosting - second posting

Weiwei Shi Tue, 30 May 2006 08:28:06 -0700

As I recall, if you use distribution="bernoulli", you don't need to wrap
the response in as.factor() either.
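
A minimal sketch of what that might look like, assuming the gbm package is
installed. The simulated data and the names y, x1, x2 and fit are
hypothetical stand-ins (the original train/test data were not posted), and
the tuning values are illustrative only:

```r
library(gbm)  # assumption: the gbm package is installed

set.seed(1)

# Hypothetical simulated data standing in for the poster's train/test sets.
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(1.5 * x1 - x2))  # numeric 0/1 response
dat   <- data.frame(y = y, x1 = x1, x2 = x2)
train <- dat[1:700, ]
test  <- dat[701:n, ]

# Bernoulli loss, with the response left as numeric 0/1 (no as.factor()).
fit <- gbm(y ~ x1 + x2,
           data = train,
           distribution = "bernoulli",
           n.trees = 200,
           shrinkage = 0.05,
           interaction.depth = 3,
           bag.fraction = 0.5,
           n.minobsinnode = 10,
           verbose = FALSE)

# Without type = "response" the predictions are on the log-odds scale,
# so they are not restricted to [0, 1].
pred_link <- predict(fit, newdata = test, n.trees = 200)

# type = "response" maps the log-odds to probabilities.
pred_prob <- predict(fit, newdata = test, n.trees = 200, type = "response")

summary(pred_prob)
```

If the number of trees is chosen with gbm.perf() as in the quoted call, the
resulting best.iter can be passed as n.trees in both predict() calls.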


Weiwei

On 5/30/06, Kuhn, Max <[EMAIL PROTECTED]> wrote:
>
> The distribution arg appears to be the problem: "gaussian" fits a
> regression. Either bernoulli or adaboost is appropriate for
> classification problems.
>
> Max
>
> > Perhaps by following the Posting Guide you're likely to get more helpful
> > responses.  You have not shown an example that others can reproduce, not
> > given version information for R or gbm.  The output you showed does not
> > use type="response", either.
> >
> > Andy
> >
> >   _____
> >
> > From: r-help-bounces at stat.math.ethz.ch on behalf of stephenc
> > Sent: Sat 5/27/2006 4:02 PM
> > To: 'R Help'
> > Subject: [R] boosting - second posting [Broadcast]
> >
> >
> >
> > Hi
> >
> > I am using boosting for a classification and prediction problem.
> >
> > For some reason it is giving me an outcome that doesn't fall between 0
> > and 1 for the predictions.  I have tried type="response" but it made no
> > difference.
> >
> > Can anyone see what I am doing wrong?
> >
> > Screen output shown below:
> >
> >
> > > boost.model <- gbm(as.factor(train$simNuance) ~ .,  # formula
> > +          data=train,                  # dataset
> > +                                       # +1: monotone increase,
> > +                                       #  0: no monotone restrictions
> > +          distribution="gaussian",     # bernoulli, adaboost, gaussian,
> > +                                       # poisson, and coxph available
> > +          n.trees=3000,                # number of trees
> > +          shrinkage=0.005,             # shrinkage or learning rate,
> > +                                       # 0.001 to 0.1 usually work
> > +          interaction.depth=3,         # 1: additive model, 2: two-way
> > +                                       # interactions, etc.
> > +          bag.fraction = 0.5,          # subsampling fraction, 0.5 is
> > +                                       # probably best
> > +          train.fraction = 0.5,        # fraction of data for training,
> > +                                       # first train.fraction*N used
> > +                                       # for training
> > +          n.minobsinnode = 10,         # minimum total weight needed in
> > +                                       # each node
> > +          cv.folds = 5,                # do 5-fold cross-validation
> > +          keep.data=TRUE,              # keep a copy of the dataset
> > +                                       # with the object
> > +          verbose=FALSE)               # print out progress
> > >
> > > best.iter = gbm.perf(boost.model,method="cv")
> > > pred = predict.gbm(boost.model, test, best.iter)
> > > summary(pred)
> >    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> > 0.4772  1.5140  1.6760  1.5100  1.7190  1.9420
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>



-- 
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
