Dear All,

A few weeks ago I posted a question on the R help listserv, and some of you
responded with a great solution; I would like to thank you for that again. I
thought I would reach out to you with the issue I am trying to solve now. I
posted the question a few days ago, but it was probably not clear enough, so I
thought I would try again.

At times I have a multivariate example at hand with known means, SDs and
medians for the variables, and the covariance matrix of those variables.
Occasionally, these parameters have a
strong enough relationship between them that a covariance matrix can be
established. Please see the attached document as an example.

Usually when we (medical people) simulate (and this is not to say that it is
the best approach), we use a lognormal distribution to avoid generating
negative values, because physiologic variables are almost never negative (we
also really do not know better, unfortunately). For the most part I use another
software package that is capable of
reproducing reasonable means and medians and SD if I enter the covariance
matrix, but that is not a free resource (so I cannot share the solutions with
others), nor does it have anything like the Sweave option in R for producing
standard reports that can be distributed for free.

Unfortunately, in R I am having a hard time
figuring the solution out. I have tried to use the multivariate normal
distribution function mvrnorm from the MASS package, or rmvnorm from the
mvtnorm package, but I get negative simulated values, which I cannot afford.
Also, at times the simulated means, medians and SDs are quite different from
what I started with (which may be due to the assumption I make about the
distribution of the data).

I was wondering if anyone would be willing to share some thoughts on how one
should go about simulating in R a multivariate distribution with a given
covariance matrix (using the attached data as an example) that would result in
reasonable means, medians and SDs compared to the original values?
While having a better idea about the actual distribution of the data would
probably be invaluable for accurately reproducing the data (and for choosing a
probability distribution to simulate with), often in the medical literature we
only have information similar to what I have attached (and we assume it is
lognormally distributed, as I mentioned above).

I would greatly appreciate your help.

Sincerely,
Andras
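
P.S. To make the problem concrete, here is a minimal sketch of the kind of call
I have been trying. The means, SDs and correlation below are made-up
placeholder values (assumptions for illustration only, not the attached data),
and the variable names x1 and x2 are hypothetical.

```r
library(MASS)  # provides mvrnorm()

set.seed(123)

# Placeholder summary statistics (assumed values, NOT the attached data)
mu  <- c(x1 = 5, x2 = 50)   # target means
sds <- c(4, 30)             # target SDs
rho <- 0.6                  # assumed correlation between x1 and x2

# Build the covariance matrix from the SDs and the correlation matrix
R     <- matrix(c(1, rho, rho, 1), nrow = 2)
Sigma <- diag(sds) %*% R %*% diag(sds)

# Simulate from a multivariate normal distribution
sim <- mvrnorm(n = 10000, mu = mu, Sigma = Sigma)

# Compare the simulated summaries with the targets
colMeans(sim)
apply(sim, 2, median)
apply(sim, 2, sd)

# Count the negative values in each column
colSums(sim < 0)
```

With these placeholder values a noticeable share of the simulated values fall
below zero, which is exactly what I need to avoid.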
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.