On 02-Jul-05 Peter Dalgaard wrote:
> "Jim Brennan" <[EMAIL PROTECTED]> writes:
>
>> OK now I am skeptical especially when you say in a weird way:-)
>> This may be OK but look at plot(x,y) and I am suspicious. Is it still
>> alright with this kind of relationship?
> ...
>> N <- 10000
>> rho <- .6
>> x <- runif(N, -.5,.5)
>> y <- x * sample(c(1,-1), N, replace=T, prob=c((1+rho)/2,(1-rho)/2))
>
> Well, the covariance is (everything has mean zero, of course)
>
> E(XY) = (1+rho)/2*EX^2 + (1-rho)/2*E(X*-X) = rho*EX^2
>
> The marginal distribution of Y is a mixture of two identical uniforms
> (X and -X) so is uniform and in particular has the same variance as X.
>
> In summary, EXY/sqrt(EX^2EY^2) == rho
>
> So as I said, it satisfies the formal requirements. X and Y are
> uniformly distributed and their correlation is rho.
>
> If for nothing else, I suppose that this example is good for
> demonstrating that independence and uncorrelatedness is not the same
> thing.
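For readers who want to check the quoted argument numerically, here is a minimal sketch in R. It reuses the quoted construction; the seed and the names `s`, `r_hat` and `r_abs` are illustrative additions, not part of the original post. It compares the sample correlation with rho, looks at the variance of y against the uniform value 1/12, and shows the dependence directly through |y| = |x|.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
N   <- 10000
rho <- 0.6

x <- runif(N, -0.5, 0.5)
# random sign: +1 with probability (1+rho)/2, -1 with probability (1-rho)/2
s <- sample(c(1, -1), N, replace = TRUE,
            prob = c((1 + rho) / 2, (1 - rho) / 2))
y <- x * s

r_hat <- cor(x, y)               # should be close to rho
v_y   <- var(y)                  # should be close to 1/12, as for x

# y is far from independent of x: |y| equals |x| exactly,
# so the correlation of the absolute values is 1 (up to rounding)
r_abs <- cor(abs(x), abs(y))

print(c(r_hat = r_hat, v_y = v_y, r_abs = r_abs))
```

A look at plot(x, y) shows the same thing graphically: all points lie on the two diagonals y = x and y = -x.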
That was a nice sneaky solution! I was toying with something similar,
but less sneaky, until I saw Peter's, on the lines of

  x <- runif(2*N, -0.5, 0.5); ix <- (N-k):(N+k); y <- x; y[ix] <- (-y[ix])

(which makes the same point about independence and correlation).
The larger k is as a fraction of N, the more you swing from rho = 1
towards rho = -1, but you cannot achieve, as Peter did, an arbitrary
correlation coefficient rho, since the value depends on k, which
can only take discrete values.

Another approach, which leads to a less "special" joint distribution, is

  x <- sort(runif(N, -0.5, 0.5)); y <- sort(runif(N, -0.5, 0.5))

followed by a rho-dependent permutation of y. I'm still pondering
a way of choosing the permutation so as to get a desired rho.
The extremes are the identity, which for a given sample will give
as close as you can get to rho = +1, and reversal, which gives
as close as you can get to rho = -1.

However, the maximum theoretical rho which you can get (as opposed
to what is possible for particular samples, which may get arbitrarily
close to +1) depends on N. For instance, with N=3, it looks as though
the theoretical rho is about 0.9 with the "identity" permutation
(for N=1000, however, just about all samples give rho > 0.99).

I smell a source of interesting exam questions ...

Over to you!

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861
Date: 02-Jul-05                                       Time: 12:22:09
------------------------------ XFMail ------------------------------

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
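The two constructions in Ted's reply can be tried out with a short sketch in R. The values of N and k, the seed, and the names `r_flip`, `r_expected`, `r_max` and `r_min` are illustrative assumptions. The expression used for `r_expected` is not from the post: it follows from the same argument as Peter's covariance calculation, treating the flipped fraction p = (2k+1)/(2N) as the probability of a sign change, which gives rho = 1 - 2p = 1 - (2k+1)/N.

```r
set.seed(2)                      # arbitrary seed, only for reproducibility
N <- 5000
k <- 1000

## Construction 1: flip the sign of a block of 2k+1 values out of 2N
x  <- runif(2 * N, -0.5, 0.5)
ix <- (N - k):(N + k)            # 2k+1 indices around the middle
y  <- x
y[ix] <- -y[ix]

r_flip     <- cor(x, y)
r_expected <- 1 - (2 * k + 1) / N   # derived above; an assumption of this sketch

## Construction 2: two sorted uniform samples, extreme permutations of y
xs <- sort(runif(N, -0.5, 0.5))
ys <- sort(runif(N, -0.5, 0.5))

r_max <- cor(xs, ys)             # identity permutation: as close to +1 as possible
r_min <- cor(xs, rev(ys))        # reversal: as close to -1 as possible

print(c(r_flip = r_flip, r_expected = r_expected,
        r_max = r_max, r_min = r_min))
```

Because k is an integer, `r_expected` can only move in steps of 2/N as k changes by one, which is the discreteness Ted mentions. Permutations of `ys` between the identity and the reversal give sample correlations between `r_min` and `r_max`; how to choose one for a target rho is left open here, as in the post.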