Thanks, Peter and Marc. I am sorry, I was wrong in how I dichotomized the response. Thanks for pointing out my mistake.
However, the correct dichotomization does not help either. The link that you provided is also very useful, and I am now thinking of not dichotomizing my values at all.

Thanks again

On Fri, Oct 4, 2013 at 3:50 PM, Marc Schwartz <marc_schwa...@me.com> wrote:

>
> On Oct 4, 2013, at 2:35 PM, peter dalgaard <pda...@gmail.com> wrote:
>
> >
> > On Oct 4, 2013, at 21:16 , Mary Kindall wrote:
> >
> >> Y[Y < mean(Y)] = 0   #My edit
> >> Y[Y >= mean(Y)] = 1  #My edit
> >
> > I have no clue about gbm, but I don't think the above does what I think
> > you think it does.
> >
> > Y <- as.integer(Y >= mean(Y))
> >
> > might be closer to the mark.
>
>
> Good catch Peter! I didn't pay attention to that initially.
>
> Here is an example:
>
> set.seed(1)
> Y <- rnorm(10)
>
> > Y
>  [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684
>  [7]  0.4874291  0.7383247  0.5757814 -0.3053884
>
> > mean(Y)
> [1] 0.1322028
>
> Before changing Y:
>
> > Y[Y < mean(Y)]
> [1] -0.6264538 -0.8356286 -0.8204684 -0.3053884
>
> > Y[Y >= mean(Y)]
> [1] 0.1836433 1.5952808 0.3295078 0.4874291 0.7383247 0.5757814
>
>
> However, the incantation that Mary is using, which calculates mean(Y)
> separately in each call, results in:
>
> Y[Y < mean(Y)] = 0
>
> > Y
>  [1] 0.0000000 0.1836433 0.0000000 1.5952808 0.3295078 0.0000000
>  [7] 0.4874291 0.7383247 0.5757814 0.0000000
>
>
> # mean(Y) is no longer the original value from above
> > mean(Y)
> [1] 0.3909967
>
>
> Thus:
>
> Y[Y >= mean(Y)] = 1
>
> > Y
>  [1] 0.0000000 0.1836433 0.0000000 1.0000000 0.3295078 0.0000000
>  [7] 1.0000000 1.0000000 1.0000000 0.0000000
>
>
> Some of the values in Y do not change because the threshold for modifying
> the values changed as a result of the recalculation of the mean after the
> first set of values in Y have changed. As Peter noted, you don't end up
> with a dichotomous vector.
>
> Using Peter's method:
>
> Y <- as.integer(Y >= mean(Y))
> > Y
>  [1] 0 1 0 1 1 0 1 1 1 0
>
>
> That being said, the original viewpoint stands, which is to not do this
> due to loss of information.
>
> Regards,
>
> Marc Schwartz
>
>

-- 
-------------
Mary Kindall
Yorktown Heights, NY
USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
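
For reference, here is a minimal sketch of the point made in the quoted thread: compute the threshold once, before Y is modified, so both comparisons use the same value. The names cutoff, Y_bin and Y_two are only illustrative and do not appear in the original messages. The data are the same simulated values used in Marc's example.

```r
set.seed(1)
Y <- rnorm(10)

# Compute the threshold once, while Y still holds the original values
cutoff <- mean(Y)

# One-step version (Peter's suggestion, with the mean stored first)
Y_bin <- as.integer(Y >= cutoff)

# Two-step version written into a new vector; because cutoff is fixed
# and Y itself is never overwritten, the threshold cannot drift
Y_two <- numeric(length(Y))   # starts as all zeros
Y_two[Y >= cutoff] <- 1

# Both versions agree and contain only 0 and 1
stopifnot(identical(Y_bin, as.integer(Y_two)))

print(Y_bin)   # values: 0 1 0 1 1 0 1 1 1 0, as in the quoted output
```

This only shows how to avoid the moving threshold. It does not change the advice in the thread that dichotomizing a continuous response loses information.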