Thanks Peter and Marc.
I am sorry, I dichotomized the response incorrectly. Thanks for pointing
out my mistake.

However, the corrected dichotomization does not help either.

Also, the link that you provided is very useful, and I am now thinking of not
dichotomizing my values.
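
A small simulation can illustrate the loss of information that comes with
dichotomizing. This is only a sketch: x, y, and n are made-up names and
simulated values, not the data discussed in this thread. With a setup like
this, the association with x is expected to be weaker for the mean-split
response than for the continuous one.

```r
# Illustrative simulation; x, y and n are hypothetical, not data from this thread.
set.seed(42)
n <- 1000
x <- rnorm(n)
y <- x + rnorm(n)                    # continuous response related to x

y_bin <- as.integer(y >= mean(y))    # mean split, comparison evaluated once

cor_cont <- cor(x, y)                # association using the full response
cor_bin  <- cor(x, y_bin)            # association after dichotomizing

c(continuous = cor_cont, dichotomized = cor_bin)
```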

Thanks again




On Fri, Oct 4, 2013 at 3:50 PM, Marc Schwartz <marc_schwa...@me.com> wrote:

>
> On Oct 4, 2013, at 2:35 PM, peter dalgaard <pda...@gmail.com> wrote:
>
> >
> > On Oct 4, 2013, at 21:16 , Mary Kindall wrote:
> >
> >> Y[Y < mean(Y)] = 0   #My edit
> >> Y[Y >= mean(Y)] = 1  #My edit
> >
> > I have no clue about gbm, but I don't think the above does what I think
> > you think it does.
> >
> > Y <- as.integer(Y >= mean(Y))
> >
> > might be closer to the mark.
>
>
> Good catch Peter! I didn't pay attention to that initially.
>
> Here is an example:
>
> set.seed(1)
> Y <- rnorm(10)
>
> > Y
>  [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684
>  [7]  0.4874291  0.7383247  0.5757814 -0.3053884
>
> > mean(Y)
> [1] 0.1322028
>
> Before changing Y:
>
> > Y[Y < mean(Y)]
> [1] -0.6264538 -0.8356286 -0.8204684 -0.3053884
>
> > Y[Y >= mean(Y)]
> [1] 0.1836433 1.5952808 0.3295078 0.4874291 0.7383247 0.5757814
>
>
> However, the incantation that Mary is using, which calculates mean(Y)
> separately in each call, results in:
>
> Y[Y < mean(Y)]  = 0
>
> > Y
>  [1] 0.0000000 0.1836433 0.0000000 1.5952808 0.3295078 0.0000000
>  [7] 0.4874291 0.7383247 0.5757814 0.0000000
>
>
> # mean(Y) is no longer the original value from above
> > mean(Y)
> [1] 0.3909967
>
>
> Thus:
>
> Y[Y >= mean(Y)]  = 1
>
> > Y
>  [1] 0.0000000 0.1836433 0.0000000 1.0000000 0.3295078 0.0000000
>  [7] 1.0000000 1.0000000 1.0000000 0.0000000
>
>
> Some of the values in Y (0.1836433 and 0.3295078) do not change because the
> threshold moved: mean(Y) was recalculated after the first assignment had
> already altered Y, so these values fall below the new mean of 0.3909967. As
> Peter noted, you don't end up with a dichotomous vector.
>
> Using Peter's method on the original Y:
>
> Y <- as.integer(Y >= mean(Y))
> > Y
>  [1] 0 1 0 1 1 0 1 1 1 0
>
>
> That being said, the original viewpoint stands, which is to not do this
> due to loss of information.
>
> Regards,
>
> Marc Schwartz
>
>
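
The pitfall described in the quoted example can be avoided by evaluating the
comparison exactly once, against the untouched vector. The sketch below uses
the ten values printed above; the names above and Y_bin are introduced here
only for illustration.

```r
# Values copied from the printed output of set.seed(1); rnorm(10) quoted above.
Y <- c(-0.6264538,  0.1836433, -0.8356286,  1.5952808,  0.3295078,
       -0.8204684,  0.4874291,  0.7383247,  0.5757814, -0.3053884)

# Evaluate the comparison once, while Y is still unmodified, and keep the mask.
above <- Y >= mean(Y)

# Derive the 0/1 vector from the mask instead of overwriting Y in two steps.
Y_bin <- as.integer(above)
Y_bin
# [1] 0 1 0 1 1 0 1 1 1 0
```

Storing mean(Y) in a variable and then doing the two assignments in sequence
would still be fragile: if the mean were zero or negative, the freshly
assigned zeros would satisfy the second condition and be turned into ones.
Working from a single logical mask sidesteps that.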


-- 
-------------
Mary Kindall
Yorktown Heights, NY
USA


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
