I'm trying to program a simple Monte Carlo experiment that after generating 
sample data within each MC rep then runs a GLM within it, accumulating the 
coefficients from each rep in a matrix. As the GLM command needs a 
dataframe I convert all my matrix held simulated data into this form and 
then try and run a for loop (see end of included code below). It does not 
like the $i notation. Can anyone please come up with a workaround? I had a 
quick look in the glm code to see if I could figure out how to create a 
modelmatrix myself and call the glm directly on that - it defeated me (as 
I'm not the greatest of programmers). If anyone could offer advice on that 
too I'd be grateful - would save the glm command having to convert the 
dataframe on each call - which on a large number of MC reps might be worth 
while. 

Any help much appreciated (and apologies for all the code comments - my 
memories terrible so need to include).

Best,

Steve

using DataFrames, GLM, Distributions
reps=100 # number of MC sims
n=50 # sample size in each sim
X0 = rep(1,n) # the constant
nLevelsX1 = 5 # distinct values in continuous X1 - make it a factor of n
X1=gl(nLevelsX1,int(n/nLevelsX1)) # gl() is in DataFrames
X2 = gl(2,1,n) - 1 # the treatmnt dummy
X = [X0 X1 X2] # design matrix
T=1.0 # the treatment effect
B = [10, 1, T] # Beta vector
mu = float(X*B) # varying mu parameter in lognormal distribution
sigma = rep(1.0,n) # constant sigma parameter in lognormal distribution

Obs = Array(Float64, reps,n) # holds data for analysis - each row new 
dataset
# loop below creates the Y (dependent) data - one row for each MC sim
for i = 1:n
  Obs[:,i]= rand(LogNormal(mu[i],sigma[i]),reps)
end


# create dataframe where first two cols hold the unvarying X
# and the remaining cols hold each Y sim
df= convert(DataFrame, [X[:,2:3] Obs'])
rename!(df.colindex, [(symbol("x$i")=>symbol("Y$(i-2)")) for i in 3:(n+2)])
names(df) # x1 x2 Y1 Y2 ...Yreps

GLMresults =  Array(Float64, reps,3) # will hold all coeff
#next loop fails - please help - won't accept $i
for i = 1:reps
  GLMresults[i,:] = coef(glm(Y$i ~ x1 + x2,df,Normal(),LogLink()))
end

Reply via email to