On Feb 21, 2011, at 9:53 AM, Erich Neuwirth wrote:

We want to generate a distribution on the unit square with the following
properties
* It is concentrated on a "reasonable" subset of the square,
 and the restricted distribution is uniform on this subset.
* Both marginal distributions are uniform on the unit interval.
* All horizontal and all vertical cross sections are sets of lines
 segments with the same total length

If we find a geometric figure with these properties, we have solved the
problem.

So we define the distribution to be uniform on the following area:
(it is distorted but should give the idea)

x***/-----------------/***x
|**/-----------------/****|
|*/-----------------/*****|
|/-----------------/******|
|-----------------/******/|
|----------------/******/-|
|---------------/******/--|
|--------------/******/---|
|-------------/******/----|
|------------/******/-----|
|-----------/******/------|
|----------/******/-------|
|---------/******/--------|
|--------/******/---------|
|-------/******/----------|
|------/******/-----------|
|-----/******/------------|
|----/******/-------------|
|---/******/--------------|
|--/******/---------------|
|-/******/----------------|
|/******/-----------------|
|******/-----------------/|
|*****/-----------------/*|
|****/-----------------/**|
x***/-----------------/***x

There is the same number of stars in each horizontal row and each
vertical column.

I love it! I plotted the data that your method generated and got a plot with points in the regions you displayed. After pondering its curious pathology (three disjoint regions), I thought I could cure that pathology by flipping quadrants. I proceeded to do so but discovered that the pathology wasn't really cured but only concentrated at the ends and middle. Using your ASCII art, my version wuld look like this where the \\\ regions are "double dense".

y[x>0.5 & y <0.5] = 0.5 +abs(0.5-y[x>0.5 & y <0.5])
y[x<0.5 & y >0.5] = 0.5 -abs(0.5-y[x<0.5 & y >0.5])
plot(x,y)

x---------------------\\\\x
|--------------------/*\\\|
|-------------------/***\\|
|------------------/*****\|
|-----------------/******/|
|----------------/******/-|
|---------------/******/--|
|--------------/******/---|
|-------------/******/----|
|-------------|\\***/-----|
|-------------|\\\*/------|
|-------------|\\\\-------|
|-------------*-----------|
|--------/\\\\------------|
|-------/**\\\------------|
|------/****\\------------|
|-----/******\------------|
|----/******/-------------|
|---/******/--------------|
|--/******/---------------|
|-/******/----------------|
|/******/-----------------|
|\*****/------------------|
|\\***/-------------------|
|\\\\/--------------------|
\\\\\---------------------x

So it not only created a still slightly less pathological counterpart, but the correlation jumped from 0.5 to 0.95. It looks to be a promising basis for homework problems in probability courses, an experience I have never has the ?pleasure? to experience except during self-study to repair my (many) mathematical deficiencies.

> cor(x,y)
[1] 0.9449256

--
David.




So we define
g(x1,x2)= 1 abs(x1-x2) <= a or
           abs(x1-x2+1) <= a or
           abs(x1-x2-1) <= a
         0 elsewhere

The total area of the shape is 2*a.
The admissible range for a is <0,1/2>
therefore
f(x1,x2)=g(x1,x2)/(2*a)
is a density functions.
This is where simple algebra comes in.
This distribution has
expected value 1/2 and variance 1/12 for both margins
(uniform distribution), and it has
covariance = (1-3*a+2*a2)/12
and correlation = 1 - 3*a + 2*a2

The inverse function of 1 - 3*2 + 2*a2 is
(3-sqrt(1+8*r))/4

Therefore we can compute that our distribution with
a=(3-sqrt(1+8*r))/4
will produce a given r.


Ho do we create random numbers from this distribution?
By using conditional densities.
x1 is sampled from the uniform distribution, and for a give x1
we produce x2 by a uniform distribution on the along the vertical cross
cut of the geometrical shape (which is either 1 or 2 intervals).
And which is most easily implemented by using the modulo operator %%.

This mechanism is NOT a convolution. Applying module after the addition
makes it a nonconvolution. Adding independent random variables
without doing anything further is a convolution, by applying a trimming
operation, the convolution property gets lost.


--
David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to