Re: Generate correlated binary sequence

Ian Jermyn Thu, 14 Nov 2002 04:53:01 -0800

I believe, if I understand you, that this problem is very underdetermined,
even before you start to think about sampling. I interpret what you are
saying in the following way.


Your configuration space is all binary sequences of length n, which
therefore has 2^{n} elements. You wish to assign a probability to each of
these configurations. Thus, because of the normalization constraint, you
want to find 2^{n} - 1 numbers.

On the other hand, you want the marginal distributions to be Bernoulli
(which of course they necessarily are) with known parameters p(k), and you
want the covariance to be some known n x n matrix.

The Bernoulli parameters give you n constraints on the distribution.

The covariance matrix is symmetric, and in addition the diagonal elements
are already determined by the Bernoulli parameters, since the variance of
y(k) is p(k)(1 - p(k)) (hence you cannot pick arbitrary variances for the
y(k)). Thus there are n(n-1)/2 constraints resulting from the covariance.

Thus you have n(n + 1)/2 constraints and 2^{n} - 1 numbers to be determined.
For n > 2, you have more numbers to be determined than you have constraints
(for n = 3, 7 numbers against 6 constraints), so the constraints alone do not
pin down a unique probability distribution. The number of undetermined
degrees of freedom, 2^{n} - 1 - n(n + 1)/2, grows exponentially with n.
For n = 1 or 2, the number of constraints and the number of numbers to be
determined are the same, but I guess you are interested in higher n.
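
To make the counting concrete, here is a small Python sketch that tabulates
both numbers for a few values of n. The function name is only illustrative.

```python
def count_unknowns_and_constraints(n):
    """Return (free probabilities, constraints) for binary sequences of length n."""
    unknowns = 2**n - 1                   # 2^n probabilities minus normalization
    constraints = n + n * (n - 1) // 2    # n means plus n(n-1)/2 covariances
    return unknowns, constraints


for n in range(1, 7):
    unknowns, constraints = count_unknowns_and_constraints(n)
    print(n, unknowns, constraints, unknowns - constraints)
```

The last column is the number of degrees of freedom left undetermined; it is
zero for n = 1 and n = 2 and positive from n = 3 onwards.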

To have a well-posed problem, you need some further constraint. If the only
information you have is the Bernoulli parameters and the covariance, then it
would make sense to choose the maximum entropy distribution compatible with
these constraints. This ensures that you assume no information other than
that supplied by the constraints. In this case, your distribution will take
the following form:

Pr(Y) = Z^{-1} exp[-(A*Y + Y*BY)]       (1)

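where Y is the column vector of the y(k), A is a vector of length n, B is an
n x n symmetric matrix with zeros on the diagonal, both to be determined from
the constraints, and * means the transpose. (The diagonal of B can be taken
to be zero because y(k)^2 = y(k), so any diagonal term can be absorbed into
A.) Z is a normalisation constant.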
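
For small n, a distribution of the form (1) can simply be tabulated by
enumerating all 2^{n} configurations. The following Python sketch does this
for given A and B; the function name and the example values are assumptions
made for illustration. With B = 0 the y(k) are independent, with
Pr(y(k) = 1) = 1/(1 + exp(A(k))), which gives a quick sanity check.

```python
import itertools
import math


def tabulate_distribution(A, B):
    """Enumerate all binary Y and return (configs, probabilities) under (1)."""
    n = len(A)
    configs = list(itertools.product((0, 1), repeat=n))
    weights = []
    for y in configs:
        linear = sum(A[i] * y[i] for i in range(n))                 # A*Y
        quadratic = sum(B[i][j] * y[i] * y[j]
                        for i in range(n) for j in range(n))        # Y*BY
        weights.append(math.exp(-(linear + quadratic)))
    Z = sum(weights)                                                # normalisation
    return configs, [w / Z for w in weights]


# Sanity check with B = 0 (illustrative values): independent Bernoullis.
A = [math.log(3.0), 0.0]
B = [[0.0, 0.0], [0.0, 0.0]]
configs, probs = tabulate_distribution(A, B)
marginal_1 = sum(q for q, y in zip(probs, configs) if y[0] == 1)
print(marginal_1, 1.0 / (1.0 + math.exp(A[0])))
```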

At this point it is necessary to impose the various constraints. This means
calculating the means (that is, the Bernoulli parameters) and the covariance
matrix in terms of A and B from equation (1), and then solving for A and B
given the known means and covariance. As far as I can see, this is
difficult. I will think about it and post again if I come up with anything
useful. In the case that n is small, the problem can probably be solved
numerically. For large n, maybe there are approximations you can use.
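
For small n, one possible numerical approach (a sketch, not necessarily the
best method) is to adjust A and B by plain gradient steps until the means and
second moments of (1) match the targets, and then to sample from the
tabulated distribution. All names, the step size, the iteration count and the
example p and R below are assumptions chosen for illustration. Note that not
every pair (p, R) corresponds to a valid binary distribution; for an
infeasible pair the iteration will not settle.

```python
import itertools
import math
import random


def model_probs(A, B, configs):
    """Probabilities of all configurations under equation (1)."""
    n = len(A)
    weights = []
    for y in configs:
        energy = sum(A[i] * y[i] for i in range(n))
        energy += sum(B[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
        weights.append(math.exp(-energy))
    Z = sum(weights)
    return [w / Z for w in weights]


def fit_maxent(p, R, steps=20000, lr=0.5):
    """Fit A, B so that the means equal p and the covariances equal R (i != j)."""
    n = len(p)
    configs = list(itertools.product((0, 1), repeat=n))
    A = [0.0] * n
    B = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        probs = model_probs(A, B, configs)
        for i in range(n):
            mean_i = sum(q * y[i] for q, y in zip(probs, configs))
            # Model mean too high: raise A[i], which lowers Pr(y(i) = 1).
            A[i] += lr * (mean_i - p[i])
        for i in range(n):
            for j in range(i + 1, n):
                m_ij = sum(q * y[i] * y[j] for q, y in zip(probs, configs))
                target = R[i][j] + p[i] * p[j]      # E[y(i) y(j)] from covariance
                # Factor 0.5 because B[i][j] and B[j][i] both enter Y*BY.
                step = 0.5 * lr * (m_ij - target)
                B[i][j] += step
                B[j][i] += step
    return A, B, configs, model_probs(A, B, configs)


# Illustrative targets for n = 3 (assumed values, chosen to be feasible).
p = [0.3, 0.5, 0.6]
R = [[0.21, 0.05, 0.02],
     [0.05, 0.25, -0.03],
     [0.02, -0.03, 0.24]]
A, B, configs, probs = fit_maxent(p, R)

# Draw a few correlated binary sequences from the fitted distribution.
samples = random.choices(configs, weights=probs, k=5)
print(samples)
```

Since every configuration is enumerated, the cost grows like 2^{n}, so this
is only practical for small n.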

Ian.







"Lehemann" <[EMAIL PROTECTED]> wrote in message
news:3DD28E64.7E24AACC@;yahoo.com...
> I need help to do simulation. Can anybody give some suggestion to
> generate
> the following data?
>
> Let Y={y(1), y(2),..., y(n)}, n is known.
> For each component, y(k) is Bernoulli(p(k)). P={p(1), p(2),..., p(n)} is
> known.
> Correlation matrix: Cov(Y)=R is known.
>
> How to generate Y for simulation purpose?
>


