Hi,
I am working with a large data set and need to run a simple linear
regression separately for each group of rows, where the groups are defined
by the values of a particular attribute. For instance, consider three
columns, ID, x and y: I need to regress x on y for each distinct value of
ID. The data below contain 4 distinct values of ID (76, 111, 121, 168), so
linear regression would be invoked 4 times, once per ID. The challenge is
that the length of the ID vector is around 20000, so the regression must be
run automatically for each distinct value of ID.

 ID      x     y
 76  36476  15.8
 76  36493  66.9
 76  36579  65.6
111  35465  10.3
111  35756   4.8
121  38183  16
121  38184  15
121  38254   9.6
121  38255   7
168  37727  21.9
168  37739  29.7
168  37746  97.4

Is there an easy way in R to group the data by the values of ID, so that
a linear regression can be fitted to each group? Or is the only option to
write a 'for' or 'while' loop that scans the entire ID vector and, for each
distinct value of ID, builds a matrix holding the corresponding values of
x and y?
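
To make the question concrete, here is a minimal sketch of the kind of
loop-free grouping I have in mind, using base R's split() and lapply().
The data frame name `dat` and the result names `fits` and `coefs` are
hypothetical, and I am assuming "regress x on y" means x is the response,
i.e. lm(x ~ y). I am not sure this is the best approach, or how well it
scales to the full data.

```r
# Sample data from this message; `dat` is a hypothetical name.
dat <- data.frame(
  ID = c(76, 76, 76, 111, 111, 121, 121, 121, 121, 168, 168, 168),
  x  = c(36476, 36493, 36579, 35465, 35756, 38183, 38184, 38254, 38255,
         37727, 37739, 37746),
  y  = c(15.8, 66.9, 65.6, 10.3, 4.8, 16, 15, 9.6, 7, 21.9, 29.7, 97.4)
)

# split() breaks the data frame into one piece per distinct ID;
# lapply() then fits a separate model to each piece.
# Assumption: x is the response. Use lm(y ~ x, ...) for the reverse.
fits <- lapply(split(dat, dat$ID), function(d) lm(x ~ y, data = d))

# Collect intercept and slope into a matrix with one row per ID.
coefs <- t(sapply(fits, coef))
```

With this, fits[["76"]] would hold the fitted model for ID 76, and coefs
would have one row per ID with the intercept and the slope on y.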

Thanks in advance,

Yasin


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
