[R] Posthoc test for 3-way interaction (log-linear model)
Hi, I have analyzed my data using a log-linear model, as seen below:

> yes.no <- c("Yes","No")
> tk <- c("On","Off")
> ats <- c("S","V","M")
> L <- gl(2,1,12,yes.no)
> T <- gl(2,2,12,tk)
> A <- gl(3,4,12,ats)
> n <- c(1056,4774,22,283,326,2916,27,360,274,1770,15,226)
> library(MASS)
> l.loglm <- data.frame(A,T,L,n)
> l.loglm
    A   T   L    n
1   S  On Yes 1056
2   S  On  No 4774
3   S Off Yes   22
4   S Off  No  283
5   V  On Yes  326
6   V  On  No 2916
7   V Off Yes   27
8   V Off  No  360
9   M  On Yes  274
10  M  On  No 1770
11  M Off Yes   15
12  M Off  No  226

Model comparison based on likelihood-ratio chi-square statistics revealed that the three-way interaction (saturated) model was marginally significantly different from the two-way association model (see below):

> anova(loglm.null, loglm.LA.LT.AT)
LR tests for hierarchical log-linear models

Model 1:
 n ~ T + A + L
Model 2:
 n ~ L:T + L:A + A:T

             Deviance df Delta(Dev) Delta(df) P(> Delta(Dev))
Model 1    305.997600  7
Model 2      4.620979  2 301.376622         5  0.00000
Saturated    0.000000  0   4.620979         2  0.09921

Now, I'd like to run a post-hoc test to see which of the three levels of the variable "A" are significantly different from each other (S vs. V vs. M). I'd greatly appreciate it if anyone could let me know how to run such a post-hoc test. Thank you in advance!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
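One common way to approach such pairwise comparisons (a sketch only, not from the original thread: the Bonferroni adjustment named in the comment is one conventional choice among several) is to refit the association model on each 2 x 2 x 2 sub-table formed by a pair of levels of A:

```r
library(MASS)

## Rebuild the table from the post.
L <- gl(2, 1, 12, c("Yes", "No"))
T <- gl(2, 2, 12, c("On", "Off"))
A <- gl(3, 4, 12, c("S", "V", "M"))
n <- c(1056, 4774, 22, 283, 326, 2916, 27, 360, 274, 1770, 15, 226)
d <- data.frame(A, T, L, n)

## Refit the two-way association model on each pair of levels of A.
## Because three pairwise comparisons are made, the p-values should be
## adjusted (e.g. Bonferroni: compare against alpha / 3).
pairs <- list(c("S", "V"), c("S", "M"), c("V", "M"))
for (p in pairs) {
  sub <- droplevels(subset(d, A %in% p))
  fit <- loglm(n ~ A:T + A:L + T:L, data = sub)
  cat(p[1], "vs", p[2], ":\n")
  print(fit)
}
```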
[R] GEE: estimate of predictor with high time dependency
Hi, I'm analyzing my data using GEE, which looks like below:

> interact <- geeglm(L ~ O + A + O:A,
+                    data = data1, id = id,
+                    family = binomial, corstr = "ar1")
> summary(interact)

Call:
geeglm(formula = lateral ~ ontask + attachment + ontask:attachment,
    family = binomial, data = firstgroupnowalking, id = id, corstr = "ar1")

Coefficients:
            Estimate Std.err  Wald Pr(>|W|)
(Intercept) -1.89133 0.30363 38.80  4.7e-10 ***
O            0.00348 0.00100 12.03  0.00052 ***
A1          -0.21729 0.37350  0.34  0.56073
A2          -0.14151 0.43483  0.11  0.74486
O:A1        -0.37540 0.16596  5.12  0.02370 *
O:A2        -0.27626 0.16651  2.75  0.09708 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Estimated Scale Parameters:
            Estimate Std.err
(Intercept)     1.27   0.369

Correlation: Structure = ar1  Link = identity
Estimated Correlation Parameters:
      Estimate Std.err
alpha    0.979 0.00586

Number of clusters:   49  Maximum cluster size: 533

I decided to use auto-regression (AR-1) as the correlation structure because of the high autocorrelation of the dependent variable, "L". However, because one of the predictors, "O", also has high time dependency (high autocorrelation), the estimate for "O" (0.00348) seems to be too small. In this case, how should I interpret the parameter? Should I be using another analysis, such as loglm? Thank you in advance for your help!

Sachi
[R] Significance in GEE
Hi, I'm using 'geepack' and have a question about how to determine which variables are significant. Is the Wald test the one to use for determining significance? If so, how is it calculated from the estimate and the standard error? Is there another test for significance? Thank you,

Sachi
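For reference, the Wald statistic that geepack's summary() prints for each coefficient is the squared ratio of the estimate to its robust standard error, referred to a chi-square distribution with 1 degree of freedom. A minimal sketch, using the intercept values from the GEE post above as illustration:

```r
## Wald test for a single GEE coefficient, as reported by summary():
##   W = (estimate / std.err)^2,  p = P(chisq_1 > W)
est <- -1.89133   # intercept estimate from the GEE post above
se  <-  0.30363   # its robust standard error

wald <- (est / se)^2
p    <- pchisq(wald, df = 1, lower.tail = FALSE)

round(wald, 2)    # close to the 38.80 printed by summary()
```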
[R] QIC for GEE
Hi, I'm using 'geepack' to run generalized estimating equations. I'm aware that I can use anova() to compare two models, but is it possible to compute QIC in R? It seems that similar questions were asked a couple of years ago, but they have not been answered yet. I'd appreciate it if someone could show me the code! Thank you,

Sachi
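Recent versions of geepack export a QIC() function for geeglm fits; for older versions, a hand-rolled computation following Pan (2001) can be sketched as below. This is a sketch only: the `vbeta`/`vbeta.naiv` slot names are the ones geepack's geese objects use, the quasi-likelihood is written for the binomial case with a 0/1 response, and the independence refit is passed in explicitly.

```r
library(geepack)

## QIC (Pan 2001) for a binomial geeglm fit.
## fit       : the working-correlation model of interest
## fit.indep : the same model refitted with corstr = "independence"
## y         : the 0/1 response vector used in the fit
qic.binomial <- function(fit, fit.indep, y) {
  mu <- fitted(fit)
  ## quasi-likelihood under the binomial variance function
  ql <- sum(y * log(mu) + (1 - y) * log(1 - mu))
  ## penalty: trace(A^-1 V), A = naive variance under independence,
  ## V = robust (sandwich) variance of the working model
  A.inv   <- solve(fit.indep$geese$vbeta.naiv)
  V.rob   <- fit$geese$vbeta
  penalty <- sum(diag(A.inv %*% V.rob))
  -2 * ql + 2 * penalty
}

## Illustrative use on simulated data:
set.seed(1)
d <- data.frame(id = rep(1:25, each = 8), x = rnorm(200))
d$y <- rbinom(200, 1, plogis(0.4 * d$x))
m.ar  <- geeglm(y ~ x, id = id, data = d, family = binomial, corstr = "ar1")
m.ind <- geeglm(y ~ x, id = id, data = d, family = binomial,
                corstr = "independence")
qic.binomial(m.ar, m.ind, d$y)
```

As with AIC, smaller QIC is better when comparing models fitted to the same data.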
[R] Generalized Estimating Equation (GEE): Why is Link = Identity?
Hi, I'm running GEE using geepack. I set corstr = "ar1" as below:

> m.ar <- geeglm(L ~ O + A,
+               data = firstgrouptxt, id = id,
+               family = binomial, corstr = "ar1")
> summary(m.ar)

Call:
geeglm(formula = L ~ O + A, family = binomial, data = firstgrouptxt,
    id = id, corstr = "ar1")

Coefficients:
            Estimate Std.err    Wald Pr(>|W|)
(Intercept) -2.62516 0.21154 154.001   <2e-16 ***
ontask       0.00498 0.12143   0.002   0.9673
attachmentB  0.73216 0.35381   4.282   0.0385 *
attachmentC  0.25960 0.33579   0.598   0.4395
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Estimated Scale Parameters:
            Estimate Std.err
(Intercept)    1.277  0.3538

Correlation: Structure = ar1  Link = identity
Estimated Correlation Parameters:
      Estimate  Std.err
alpha    0.978 0.005725

Number of clusters:   49  Maximum cluster size: 533

Then, it shows that:

Correlation: Link = identity

Why is it not Link = logit? Thank you,

Sachi
[R] Changing from 32-bit builds to 64-bit builds
Hi, This is probably a very simple question for the programmers out there, but how do you change from 32-bit builds (the default) to 64-bit builds? I've been trying to run anova() to compare two models, but I get the following error message:

Error: cannot allocate vector of size 1.2 Gb
R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

I suppose it's a problem with memory allocation because of the large data size, so I thought I should use a 64-bit build instead of a 32-bit one. As recommended in a manual, I entered the following:

> CC='gcc -arch x86_64'
> CXX='g++ -arch x86_64'
> F77='gfortran -arch x86_64'
> FC='gfortran -arch x86_64'
> OBJC='gcc -arch x86_64'

But it still gave me an error. I'd greatly appreciate it if someone could answer this question! Thank you,

Sachi
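A note on those five lines: CC, CXX, F77, FC, and OBJC are variables for R's configure script, set in a shell before building R from source, not commands typed at the R prompt, which is why entering them in R has no effect. A sketch of the recipe described in the R Installation and Administration manual, assuming an unpacked R source tree on Mac OS X (the `r_arch` name is the documented convention for multi-architecture builds):

```shell
# Shell commands (not R commands): configure and build a 64-bit R
# from an unpacked source tree.
export CC='gcc -arch x86_64'
export CXX='g++ -arch x86_64'
export F77='gfortran -arch x86_64'
export FC='gfortran -arch x86_64'
export OBJC='gcc -arch x86_64'

./configure r_arch=x86_64
make
```

With a CRAN binary that already contains both sub-architectures, starting R with `R --arch x86_64` selects the 64-bit build without recompiling anything.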
[R] Sig differences in Loglinear Models for Three-Way Tables
Hi all, I've been running log-linear models for three-way tables: one of the variables has three levels, and the other two have two levels each. An example looks like below:

> yes.no <- c("Yes","No")
> switch <- c("On","Off")
> att <- c("BB","AA","CC")
> L <- gl(2,1,12,yes.no)
> T <- gl(2,2,12,switch)
> A <- gl(3,4,12,att)
> n <- c(1136,4998,25,339,305,2752,31,692,251,1677,17,557)
> d.table <- data.frame(A,T,L,n)
> d.table
    A   T   L    n
1  BB  On Yes 1136
2  BB  On  No 4998
3  BB Off Yes   25
4  BB Off  No  339
5  AA  On Yes  305
6  AA  On  No 2752
7  AA Off Yes   31
8  AA Off  No  692
9  CC  On Yes  251
10 CC  On  No 1677
11 CC Off Yes   17
12 CC Off  No  557

First, I ran the independence model and found a poor fit:

> library(MASS)
> loglm(n~A+T+L)
Call:
loglm(formula = n ~ A + T + L)

Statistics:
                      X^2 df P(> X^2)
Likelihood Ratio 1001.431  7        0
Pearson          1006.287  7        0

Then I went on and ran the two-way association model and found a good fit:

> loglm(n~A:T+A:L+T:L)
Call:
loglm(formula = n ~ A:T + A:L + T:L)

Statistics:
                      X^2 df   P(> X^2)
Likelihood Ratio 4.827261  2 0.08948981
Pearson          4.680124  2 0.09632168

I compared the independence model (Model 1), the two-way association model (Model 2), and the three-way interaction model (Saturated), and found that the two-way association model was the most parsimonious one:

> ind <- loglm(n~A+T+L)
> twoway <- loglm(n~A:T+A:L+T:L)
> anova(ind,twoway)
LR tests for hierarchical log-linear models

Model 1:
 n ~ T + A + L
Model 2:
 n ~ A:L + A:T + T:L

              Deviance df Delta(Dev) Delta(df) P(> Delta(Dev))
Model 1    1001.430955  7
Model 2       4.827261  2 996.603694         5  0.00000
Saturated     0.000000  0   4.827261         2  0.08949

By running a chi-square test, I found that all three of the two-way associations are significant:

> drop1(twoway,test="Chisq")
Single term deletions

Model:
n ~ A:T + A:L + T:L
       Df    AIC    LRT   Pr(Chi)
<none>      24.83
A:T     2 645.91 625.08 < 2.2e-16 ***
A:L     2 152.93 132.10 < 2.2e-16 ***
T:L     1 143.60 120.77 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Then, I got the coefficients:

> coef(twoway)
$`(Intercept)`
[1] 5.866527

$A
         BB          AA          CC
 0.27277069 -0.01475892 -0.25801177

$T
       On       Off
 1.156143 -1.156143

$L
      Yes        No
-1.225228  1.225228

$A.T
    T
A            On        Off
  BB  0.4809533 -0.4809533
  AA -0.1783651  0.1783651
  CC -0.3025882  0.3025882

$A.L
    L
A            Yes          No
  BB  0.19166429 -0.19166429
  AA -0.15632604  0.15632604
  CC -0.03533825  0.03533825

$T.L
     L
T           Yes         No
  On   0.2933774 -0.2933774
  Off -0.2933774  0.2933774

I then hand-calculated the odds ratios for T x L, A x L, and A x T:

T x L: theta_TL = e^(4(.2934)) = 3.23

A x L: theta_AL(BB vs. AA) = e^(2(.19166) + 2(.15633)) = 2.01
       theta_AL(BB vs. CC) = e^(2(.19166) + 2(.03534)) = 1.57

A x T: theta_AT(BB vs. AA) = e^(2(.48095) + 2(.17837)) = 3.74
       theta_AT(BB vs. CC) = e^(2(.48095) + 2(.30259)) = 4.79

Now, I'd like to know whether BB and AA (or BB and CC) are significantly different from each other (i.e., whether the odds for BB being 2.01 times those for AA is significant, whether the odds for BB being 1.57 times those for CC is significant, etc.). I'd really appreciate it if someone could answer this question! Thank you,

Sachi
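The hand calculations above can be reproduced directly from the printed coefficients (pure arithmetic, using the values from coef(twoway)):

```r
## Odds ratios from the log-linear coefficients printed above.
## With sum-to-zero coding, the log odds ratio between two levels is the
## difference of their +/- coefficients, hence the factors of 2 (and 4
## for the 2 x 2 case, where both margins flip sign).
tl <- 0.2933774
or.TL <- exp(4 * tl)                               # T x L

al <- c(BB = 0.19166429, AA = -0.15632604, CC = -0.03533825)
or.AL.BB.AA <- exp(2 * al["BB"] - 2 * al["AA"])    # A x L, BB vs. AA
or.AL.BB.CC <- exp(2 * al["BB"] - 2 * al["CC"])    # A x L, BB vs. CC

at <- c(BB = 0.4809533, AA = -0.1783651, CC = -0.3025882)
or.AT.BB.AA <- exp(2 * at["BB"] - 2 * at["AA"])    # A x T, BB vs. AA
or.AT.BB.CC <- exp(2 * at["BB"] - 2 * at["CC"])    # A x T, BB vs. CC

round(c(or.TL, or.AL.BB.AA, or.AL.BB.CC, or.AT.BB.AA, or.AT.BB.CC), 2)
## approximately 3.23, 2.01, 1.57, 3.74, 4.79 -- matching the hand values
```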
[R] HLM - centering level 2 predictor
Dear R-helpers, I'm analyzing data with a hierarchical linear model. I have one level-1 predictor and one level-2 predictor, which looks like below:

fm1 <- lmer(y ~ 1 + x1 + x2 + x1:x2 + (1 + x1 | id.full))

where:
y is the outcome variable,
x1 is the level-1 predictor variable,
x2 is the level-2 predictor variable, and
id.full is the grouping variable.

It runs beautifully when only x1 is centered (I subtracted the mean from each value). However, when I also center the x2 variable with the same procedure, it gives me the following message:

Warning message:
In mer_finalize(ans) : singular convergence (7)

I'd appreciate it if someone could explain what this means. One of the differences between the non-centered values and the centered values is that the centered values include negative values. Could that be the reason? If so, what shall I do? Thank you!
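For reference, grand-mean centering (the procedure the post describes) necessarily produces negative values, since the centered variable has mean zero; that by itself is not an error. A self-contained sketch with made-up data, including the rescaling to unit SD that is often tried when lmer reports convergence trouble:

```r
set.seed(1)
x2 <- rnorm(49, mean = 10, sd = 2)   # made-up level-2 predictor values

x2c <- x2 - mean(x2)                 # grand-mean centered: mean is ~0,
mean(x2c)                            # so roughly half the values are negative

x2s <- as.numeric(scale(x2))         # centered *and* scaled to unit SD;
sd(x2s)                              # rescaling sometimes helps convergence
```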
[R] coefficient graph
Dear R list members, I'd like to make a graph of the coefficients of the intercept, variable 1, and variable 2 (and possibly the interaction between variable 1 and variable 2). When I use the lmList function as shown below, it produces a nice coefficient graph:

> PACRP.lis <- lmList(PAffect ~ CRPC + CRPT + CINT | ID, redinteract)
> PACRPlis.coef <- coef(PACRP.lis)
> plot(PACRPlis.coef)

However, when I try to make the graph from an lmer fit, I get the error message below:

> PACRP <- lmer(PAffect ~ CRPC + CRPT + CINT + (1 + CRPC + CRPT + CINT | ID), redinteract)
> PACRP.coef <- coef(PACRP)
> plot(PACRP.coef)
Error in eval(expr, envir, enclos) : object ".grp" not found

Does this mean that I cannot make a graph from an lmer fit? Or do I need to install another package? I'd greatly appreciate any suggestions. Thank you so much for your help in advance,

Sachi
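One possible workaround (a sketch using lme4's built-in sleepstudy data rather than the poster's data): coef() on an lmer fit returns a list with one data frame per grouping factor, and that data frame can be plotted directly, e.g. as a scatterplot matrix:

```r
library(lme4)

## Fit a model with per-subject random intercepts and slopes.
sleep.fit <- lmer(Reaction ~ Days + (1 + Days | Subject), sleepstudy)

## coef() returns a list; the $Subject element is a plain data.frame
## with one row per subject and one column per coefficient.
fit.coef <- coef(sleep.fit)$Subject

pairs(fit.coef)   # scatterplot matrix of the per-subject coefficients
```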
[R] xyplot of a specific ID number
Dear R list members, I have a question regarding xyplot. I managed to make an xyplot for all the IDs using the syntax below:

xyplot(PA ~ CRPC + CRPT | ID, data = redinteract)

Now, I'd like to make a graph for one specific ID (e.g., only ID number 301). I thought I could use "subset", but it does not seem to work. Could anyone let me know how I can make a graph for a specific ID? Thank you for your help in advance!
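Two ways this is commonly done (sketched with a made-up data frame standing in for redinteract, whose real columns are not shown in the post):

```r
library(lattice)

## Made-up stand-in for 'redinteract'.
set.seed(1)
redinteract <- data.frame(ID   = factor(rep(c(301, 302), each = 10)),
                          PA   = rnorm(20),
                          CRPC = rnorm(20),
                          CRPT = rnorm(20))

## 1. The subset argument of xyplot:
p1 <- xyplot(PA ~ CRPC + CRPT, data = redinteract, subset = ID == 301)

## 2. Subsetting the data frame first:
p2 <- xyplot(PA ~ CRPC + CRPT, data = subset(redinteract, ID == 301))

## print(p1) or print(p2) to draw the plot.
```

Note that the conditioning term `| ID` is dropped here: with a single ID there is nothing to condition on.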
[R] How can we predict differences in a slope, given that the random component was significant?
Dear R users, Using R's lme function, I found that both the fixed and random effects of variable A on variable B are significant. Now, I'd like to analyze which variables predict differences in the slope. In other words, I'd like to know which variables (e.g., variable C) predict individual differences in the effect of A on B. I have many data points for A and B for each individual, whereas I have only one data point for C per individual. I'd appreciate it if anyone could answer this question. Thank you for your attention.
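One standard way to model this kind of question (a sketch with made-up names and simulated data, using lmer rather than lme): treat C as a level-2 predictor of the A slope by adding the cross-level interaction A:C to the fixed effects while keeping the random slope for A. A significant A:C term then indicates that C predicts individual differences in the effect of A on B.

```r
library(lme4)
set.seed(2)

## Made-up data: many (A, B) observations per individual, one C value each.
id <- factor(rep(1:30, each = 20))
A  <- rnorm(600)
C  <- rep(rnorm(30), each = 20)          # level-2: constant within individual
B  <- 1 + 0.5 * A + 0.3 * A * C + rnorm(600)

## Cross-level interaction: does C predict the A slope?
fit <- lmer(B ~ A * C + (1 + A | id))
summary(fit)                             # look at the A:C row
```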