On Thu, Mar 15, 2012 at 10:48:48AM -0700, Filoche wrote:
> Hi everyone.
>
> Based on a dependent variable (y), I'm trying to generate some independent
> variables with a specified correlation. For this there's no problems.
> However, I would like that have all my "regressors" to be orthogonal (i.e.
> no correlation among them.
>
> For example,
>
> y = x1 + x2 + x3 where the correlation between y x1 = 0.7, x2 = 0.4 and x3 =
> 0.8. However, x1, x2 and x3 should not be correlated to each other.

Hi.

If the following computation is correct, then there is no solution for
the required correlations. There is one, however, if the vector of the
required correlations is normalized to have sum of squares 1.

Assume that the variables x1, x2, x3 have mean zero, denote

  s1^2 = var(x1), s2^2 = var(x2), s3^2 = var(x3)

and assume zero correlations among x1, x2, x3, hence also zero
covariances. Then

  var(y) = s1^2 + s2^2 + s3^2

  E y x1 = E x1^2 + E x1 x2 + E x1 x3 = E x1^2 = s1^2

and similarly

  E y x2 = var(x2) = s2^2
  E y x3 = var(x3) = s3^2

Since y also has mean zero, E y x1 is the covariance of y and x1.
Dividing it by the standard deviations of x1 and y, the correlation
cor(y, x1) is

  s1^2/s1/sqrt(s1^2 + s2^2 + s3^2) = s1/sqrt(s1^2 + s2^2 + s3^2)

Expressing all the correlations in this way, we get

  cor(y, x1) = s1/sqrt(s1^2 + s2^2 + s3^2)
  cor(y, x2) = s2/sqrt(s1^2 + s2^2 + s3^2)
  cor(y, x3) = s3/sqrt(s1^2 + s2^2 + s3^2)

Clearly, we have

  cor(y, x1)^2 + cor(y, x2)^2 + cor(y, x3)^2 = 1.

For your numbers, we get

  r <- c(0.7, 0.4, 0.8)
  sum(r^2)
  # [1] 1.29

So, for these numbers, the conditions are contradictory. However, a
solution may be found for the vector of correlations

  r/sqrt(1.29)
  # [1] 0.6163156 0.3521804 0.7043607

which are the original correlations normalized to have sum of squares 1.
In this case, independent normal variables with the standard deviations

  (s1, s2, s3) = r/sqrt(1.29)

will satisfy your conditions.

I hope that other members of the list will correct me if I overlooked
something.

Petr Savicky.
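
The claim above can be checked numerically. The following is only a
minimal sketch in base R; the seed, the sample size n and the variable
names are illustrative assumptions, not part of the original post. It
draws independent normal regressors with standard deviations
r/sqrt(sum(r^2)) and compares the sample correlations with the
normalized targets. Sample correlations only approximate the population
values, so a small deviation is expected.

```r
# Correlations requested in the question; their squares sum to 1.29,
# so they are rescaled to have sum of squares 1.
r <- c(0.7, 0.4, 0.8)
s <- r / sqrt(sum(r^2))   # standard deviations for x1, x2, x3

set.seed(1)               # arbitrary seed, for reproducibility only
n <- 100000               # illustrative sample size (assumption)

# Independent normal regressors, one column per standard deviation
x <- sapply(s, function(sdev) rnorm(n, mean = 0, sd = sdev))
colnames(x) <- c("x1", "x2", "x3")
y <- rowSums(x)           # y = x1 + x2 + x3

# Sample correlations of y with each regressor; should be close to s
cor_y <- drop(cor(y, x))
print(round(cbind(target = s, sample = cor_y), 3))

# Sample correlations among the regressors; off-diagonals close to 0
print(round(cor(x), 3))
```

Any standard deviations proportional to r would give the same
correlations, since the common factor cancels in the ratio. If exactly
zero sample correlations among the regressors are needed, rather than
approximately zero ones, the simulated columns would have to be
orthogonalized explicitly, which this sketch does not attempt.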