Dear Tom,

In my opinion you should first transform your data to the log scale and then 
calculate the mean and standard deviation of the log-transformed data, because 
mean(log(x)) is not equal to log(mean(x)).
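
As a small illustration (a sketch only, using the x values from the 'dat' table quoted further down in this thread; the variable names are just for this example):

```r
# x values as posted in the 'dat' table in this thread
x <- c(0.134, 0.170, 0.134, 0.170, 0.134, 0.170, 0.134, 0.170,
       0.170, 0.170, 0.170, 0.097, 1.032, 0.803, 1.032)

log_of_mean <- log(mean(x))   # log of the arithmetic mean
mean_of_log <- mean(log(x))   # mean of the log-transformed data
sd_of_log   <- sd(log(x))     # st.dev. of the log-transformed data

# The first two numbers differ; the last two are the log-scale summaries
print(c(log_of_mean = log_of_mean,
        mean_of_log = mean_of_log,
        sd_of_log   = sd_of_log))
```

The mean and st.dev. of log(x) are the quantities that would play the role of meanlog and sdlog, assuming a lognormal is a reasonable model for the data at all.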

HTH,

Thierry


----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology 
and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
[EMAIL PROTECTED] 
www.inbo.be 

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of Tom Cohen
Sent: Tuesday 1 April 2008 14:17
To: [EMAIL PROTECTED]
Subject: [R] set the lower bound of normal distribution to 0 ?



Tom Cohen <[EMAIL PROTECTED]> wrote:
Thanks, Prof Brian, for your suggestion. 
I should have known that for right-skewed data
one should generate the samples from a lognormal. 
  
My problem is that x and y are two instruments that were thought to 
measure the same thing, but the confidence interval of the difference 
between the two instruments is wide. It may be true that
the two instruments measure differently, but it may also be due to the small 
number of observations. So the idea is that if I increase the sample size, 
I may get better precision for the difference between the two instruments, by generating
samples based on the means and standard deviations
of x and y.
  
I am using 'urlnorm', which allows sampling from a 
truncated distribution, since I want the samples 
to take values from 0 to max(x) and max(y), respectively. 
I am unsure how to specify the means and standard deviations
in 'urlnorm'. From the x and y values I have the standard deviations
sd_x=0.3372137, sd_y=0.5120841 and the means mean_x=0.3126667, 
mean_y=0.4223137, which are not on the log scale as required by urlnorm.
  
To convert sd_x, sd_y and mean_x, mean_y to the log scale I computed
sd_logx=sqrt(log(1.3372137))=0.54, sd_logy=sqrt(log(1.5120841))=0.64,
mean_logx=-(0.54^2)/2 and mean_logy=-(0.64^2)/2. Can anyone tell me whether these 
are calculated correctly? Are these the values to be specified in urlnorm?
Do the lower and upper bounds have to be on the log scale as well,
or on which scale?
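
For reference while reading the question above: for an untruncated lognormal, the standard moment relations are sdlog^2 = log(1 + (sd/mean)^2) and meanlog = log(mean) - sdlog^2/2, so the conversion depends on the ratio sd/mean and not on sd alone. A minimal sketch follows; the helper name lnorm_params is hypothetical, and the relations hold only without truncation (truncating at max(x) changes the mean and st.dev. of the resulting samples).

```r
# Hypothetical helper: moment matching for an UNTRUNCATED lognormal.
# m = mean on the original scale, s = st.dev. on the original scale.
lnorm_params <- function(m, s) {
  sdlog   <- sqrt(log(1 + (s / m)^2))
  meanlog <- log(m) - sdlog^2 / 2
  c(meanlog = meanlog, sdlog = sdlog)
}

# Summary values quoted in the post
par_x <- lnorm_params(0.3126667, 0.3372137)
par_y <- lnorm_params(0.4223137, 0.5120841)
print(rbind(x = par_x, y = par_y))

# Round trip: recover the original-scale mean and st.dev. from the parameters
mean_back <- unname(exp(par_x["meanlog"] + par_x["sdlog"]^2 / 2))
sd_back   <- unname(mean_back * sqrt(exp(par_x["sdlog"]^2) - 1))
```

This matches the first two moments only; estimating on the log scale directly from the data is a different (and often preferable) route.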
   
   set.seed(7)
   for (i in 1:len) {
     s1[[i]] <- cbind.data.frame(
       x = urlnorm(n*i, meanlog = mean_logx, sdlog = sd_logx, lb = 0, ub = max(x)),
       y = urlnorm(n*i, meanlog = mean_logy, sdlog = sd_logy, lb = 0, ub = max(y)))
   }
   
  Thanks again for any suggestions.

Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
  On Thu, 27 Mar 2008, Tom Cohen wrote:

>
> Dear list,

> I have a dataset containing values obtained from two different 
> instruments (x and y). I want to generate 5 samples from normal 
> distribution for each instrument based on their means and standard 
> deviations. The problem is values from both instruments are 
> non-negative, so if using rnorm I would get some negative values. Is 
> there any options to determine the lower bound of normal distribution to 
> be 0 or can I simulate the samples in different ways to avoid the 
> negative values?

Well, that would not be a normal distribution.

If you want a _truncated_ normal distribution it is very easy by 
inversion. E.g.

trunc_rnorm <- function(n, mean = 0, sd = 1, lb = 0)
{
lb <- pnorm(lb, mean, sd)
qnorm(runif(n, lb, 1), mean, sd)
}

but I suggest you may rather want samples from a lognormal.
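
The same inversion idea carries over to a lognormal truncated to an interval, using base R's plnorm() and qlnorm(). This is only a sketch: the helper name trunc_rlnorm and the parameter values in the usage line are made up for illustration and are not estimates from the data.

```r
# Hypothetical helper: sample from a lognormal truncated to [lb, ub] by inversion.
# lb and ub are given on the original (not log) scale.
trunc_rlnorm <- function(n, meanlog = 0, sdlog = 1, lb = 0, ub = Inf) {
  p_lo <- plnorm(lb, meanlog, sdlog)   # probability mass below lb
  p_hi <- plnorm(ub, meanlog, sdlog)   # probability mass below ub
  qlnorm(runif(n, p_lo, p_hi), meanlog, sdlog)
}

set.seed(7)
s <- trunc_rlnorm(1000, meanlog = -1.5, sdlog = 0.8, lb = 0, ub = 1.032)
print(range(s))
```

Note that truncation shifts the mean and st.dev. of the samples away from those of the untruncated distribution.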

>
>
> > dat
>     id     x         y
> 75 101 0.134 0.1911315
> 79 102 0.170 0.1610306
> 76 103 0.134 0.1911315
> 84 104 0.170 0.1610306
> 74 105 0.134 0.1911315
> 80 106 0.170 0.1610306
> 77 107 0.134 0.1911315
> 81 108 0.170 0.1610306
> 82 109 0.170 0.1610306
> 78 111 0.170 0.1610306
> 83 112 0.170 0.1610306
> 85 113 0.097 0.2777778
> 2  201 1.032 1.5510434
> 1  202 0.803 1.0631001
> 5  203 1.032 1.5510434
>
> mu<-apply(dat[,-1],2,mean)
> sigma<-apply(dat[,-1],2,sd)
> len<-5
> n<-20
> s1<-vector("list",len)
> set.seed(7)
> for(i in 1:len){
> s1[[i]]<-cbind.data.frame(x=rnorm(n*i,mean=mu[1],sd=sigma[1]),
> y=rnorm(n*i,mean=mu[2],sd=sigma[2]))
> }
>
> Thanks for any help,
> Tom

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

    

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
