You can just call glm(X, y, Normal(), LogLink()) directly with a design 
matrix X and response vector y, e.g.:

julia> glm(randn(100, 3), rand(100), Normal(), LogLink())
GLM.GeneralizedLinearModel:

Coefficients:
      Estimate Std.Error  z value Pr(>|z|)
x1   0.0179427 0.0574712 0.312202   0.7549
x2    0.025542 0.0627278 0.407187   0.6839
x3   0.0667554 0.0528705  1.26262   0.2067

On Saturday, February 21, 2015 at 6:37:27 AM UTC-5, Steve Kay wrote:
>
> I'm trying to program a simple Monte Carlo experiment that after 
> generating sample data within each MC rep then runs a GLM within it, 
> accumulating the coefficients from each rep in a matrix. As the GLM command 
> needs a dataframe I convert all my matrix held simulated data into this 
> form and then try and run a for loop (see end of included code below). It 
> does not like the $i notation. Can anyone please come up with a workaround? 
> I had a quick look in the glm code to see if I could figure out how to 
> create a modelmatrix myself and call the glm directly on that - it defeated 
> me (as I'm not the greatest of programmers). If anyone could offer advice 
> on that too I'd be grateful - would save the glm command having to convert 
> the dataframe on each call - which on a large number of MC reps might be 
> worth while. 
>
> Any help much appreciated (and apologies for all the code comments - my 
> memories terrible so need to include).
>
> Best,
>
> Steve
>
> using DataFrames, GLM, Distributions
> reps=100 # number of MC sims
> n=50 # sample size in each sim
> X0 = rep(1,n) # the constant
> nLevelsX1 = 5 # distinct values in continuous X1 - make it a factor of n
> X1=gl(nLevelsX1,int(n/nLevelsX1)) # gl() is in DataFrames
> X2 = gl(2,1,n) - 1 # the treatmnt dummy
> X = [X0 X1 X2] # design matrix
> T=1.0 # the treatment effect
> B = [10, 1, T] # Beta vector
> mu = float(X*B) # varying mu parameter in lognormal distribution
> sigma = rep(1.0,n) # constant sigma parameter in lognormal distribution
>
> Obs = Array(Float64, reps,n) # holds data for analysis - each row new 
> dataset
> # loop below creates the Y (dependent) data - one row for each MC sim
> for i = 1:n
>   Obs[:,i]= rand(LogNormal(mu[i],sigma[i]),reps)
> end
>
>
> # create dataframe where first two cols hold the unvarying X
> # and the remaining cols hold each Y sim
> df= convert(DataFrame, [X[:,2:3] Obs'])
> rename!(df.colindex, [(symbol("x$i")=>symbol("Y$(i-2)")) for i in 3:(n+2)])
> names(df) # x1 x2 Y1 Y2 ...Yreps
>
> GLMresults =  Array(Float64, reps,3) # will hold all coeff
> #next loop fails - please help - won't accept $i
> for i = 1:reps
>   GLMresults[i,:] = coef(glm(Y$i ~ x1 + x2,df,Normal(),LogLink()))
> end
>

Reply via email to