[R] Posthoc test for 3-way interaction (log-linear model)

2010-08-30 Thread Sachi Ito
Hi,

I have analyzed my data using a log-linear model, as seen below:


> yes.no <- c("Yes","No")
> tk <- c("On","Off")
> ats <- c("S","V","M")

> L <- gl(2,1,12,yes.no)
> T <- gl(2,2,12,tk)
> A <- gl(3,4,12,ats)
> n <- c(1056,4774,22,283,326,2916,27,360,274,1770,15,226)

> library(MASS)
> l.loglm <- data.frame(A,T,L,n)
> l.loglm

   A   T   L    n
1  S  On Yes 1056
2  S  On  No 4774
3  S Off Yes   22
4  S Off  No  283
5  V  On Yes  326
6  V  On  No 2916
7  V Off Yes   27
8  V Off  No  360
9  M  On Yes  274
10 M  On  No 1770
11 M Off Yes   15
12 M Off  No  226


Model comparison based on likelihood ratio chi-square statistics revealed that 
the 3-way interaction (saturated) model was marginally significantly different 
from the 2-way association model (see below):


> anova(loglm.null,loglm.LA.LT.AT)
LR tests for hierarchical log-linear models

Model 1:
 n ~ T + A + L 
Model 2:
 n ~ L:T + L:A + A:T 

             Deviance df Delta(Dev) Delta(df) P(> Delta(Dev))
Model 1    305.997600  7
Model 2      4.620979  2 301.376622         5         0.00000
Saturated    0.000000  0   4.620979         2         0.09921


Now, I'd like to run a post-hoc test to see which of the 3 levels of the
variable "A" differ significantly from each other (S vs. V vs. M).

I'd greatly appreciate it if anyone could let me know how to run the post-hoc test.

Thank you in advance!
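One possible approach (a sketch, not from the original thread; it assumes the l.loglm data frame built above): refit the two-way association model on two levels of A at a time and use a likelihood-ratio test for the terms involving A, correcting for the three pairwise comparisons (e.g. Bonferroni).

```r
# Sketch only: pairwise post-hoc comparison of the levels of A by
# subsetting the table to two levels at a time.
library(MASS)

# S vs. V: keep those two levels and test the A:L association
sub.SV <- droplevels(subset(l.loglm, A %in% c("S", "V")))
m.full <- loglm(n ~ A:T + A:L + T:L, data = sub.SV)
m.noAL <- loglm(n ~ A:T + T:L, data = sub.SV)
anova(m.noAL, m.full)  # LR test; repeat for S vs. M and V vs. M,
                       # e.g. at a Bonferroni-adjusted alpha / 3
```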

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] GEE: estimate of predictor with high time dependency

2010-06-08 Thread Sachi Ito
Hi,

I'm analyzing my data using GEE, which looks like below:

> interact <- geeglm(L ~ O + A + O:A,
+ data = data1, id = id,
+ family = binomial, corstr = "ar1")

> summary(interact)

Call:
geeglm(formula = L ~ O + A + O:A, family = binomial,
    data = data1, id = id, corstr = "ar1")

 Coefficients:
             Estimate  Std.err  Wald Pr(>|W|)
(Intercept) -1.89133  0.30363 38.80  4.7e-10 ***
O            0.00348  0.00100 12.03  0.00052 ***
A1          -0.21729  0.37350  0.34  0.56073
A2          -0.14151  0.43483  0.11  0.74486
O:A1        -0.37540  0.16596  5.12  0.02370 *
O:A2        -0.27626  0.16651  2.75  0.09708 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Estimated Scale Parameters:
Estimate Std.err
(Intercept) 1.27   0.369

Correlation: Structure = ar1  Link = identity

Estimated Correlation Parameters:
  Estimate Std.err
alpha0.979 0.00586
Number of clusters:   49   Maximum cluster size: 533



I decided to use auto-regression as the correlation structure because of the
high autocorrelation of the dependent variable, "L".  However, because one
of the predictors, "O", also has high time dependency (high
autocorrelation), the estimate of "O" (0.00348) seems too small.  In
this case, how should I interpret the parameter?  Should I be using another
analysis, such as loglm?
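As a side note on the magnitude issue (a suggestion of mine, not from the thread): 0.00348 is the log-odds change per single unit of "O", so if O spans hundreds of units the coefficient will look tiny even when the effect is substantial. Rescaling O changes only the units of the coefficient, not the fit:

```r
# Sketch only: rescale O so its coefficient reads per standard deviation.
# Assumes the data1 / id objects from the call above.
library(geepack)
data1$O.sd <- as.numeric(scale(data1$O))  # center and divide by SD
interact.sd <- geeglm(L ~ O.sd + A + O.sd:A,
                      data = data1, id = id,
                      family = binomial, corstr = "ar1")
summary(interact.sd)  # log-odds per SD of O; Wald tests are unchanged
```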

Thank you in advance for your help!

Sachi



[R] Significance in GEE

2010-05-28 Thread Sachi Ito
Hi,

I'm using 'geepack' and have a question regarding how to determine which
variables are significant.  Is the Wald test the one used to determine
significance?  If so, how is it calculated from the estimate and standard
error?  Is there another test to show significance?
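For reference (this matches my understanding of geepack's summary output, so treat it as a hedged note): the Wald column is the squared ratio of the estimate to its standard error, referred to a chi-square distribution with 1 degree of freedom, and Pr(>|W|) is its upper tail probability:

```r
# Sketch: how the Wald statistic relates to Estimate and Std.err.
est <- 0.50   # illustrative numbers, not from a real fit
se  <- 0.20
W   <- (est / se)^2                          # Wald chi-square, 1 df
p   <- pchisq(W, df = 1, lower.tail = FALSE) # upper tail probability
c(Wald = W, p.value = p)                     # Wald = 6.25, p ~ 0.0124
```

For a multi-parameter test (e.g. a whole factor), anova() on nested geeglm fits gives a Wald test with the appropriate degrees of freedom.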

Thank you,
Sachi



[R] QIC for GEE

2010-04-30 Thread Sachi Ito
Hi,

I'm using 'geepack' to run Generalized Estimating Equations.  I'm aware that
I can use anova to compare two models, but is it possible to compute QIC
in R?  Similar questions were asked a couple of years ago, but they appear
to have gone unanswered.

I'd appreciate it if someone could show me the code!
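As a hedged pointer (package versions are an assumption here, since the original question predates them): later geepack releases export a QIC() method, and the MuMIn package offers one as well, so no hand-written code may be needed:

```r
# Sketch: QIC for geeglm fits, assuming a geepack version that exports QIC()
library(geepack)
data(dietox)  # example data shipped with geepack
m1 <- geeglm(Weight ~ Time,      id = Pig, data = dietox, corstr = "ar1")
m2 <- geeglm(Weight ~ Time + Cu, id = Pig, data = dietox, corstr = "ar1")
QIC(m1)
QIC(m2)  # smaller QIC is preferred when comparing working models
```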

Thank you,
Sachi



[R] Generalized Estimating Equation (GEE): Why is Link = Identity?

2010-04-29 Thread Sachi Ito
Hi,

I'm running GEE using geepack.

I set corstr = "ar1" as below:

> m.ar <- geeglm(L ~ O + A,
 + data = firstgrouptxt, id = id,
 + family = binomial, corstr = "ar1")


> summary(m.ar)

Call:
geeglm(formula = L ~ O + A, family = binomial,
data = firstgrouptxt, id = id, corstr = "ar1")

 Coefficients:
            Estimate  Std.err    Wald Pr(>|W|)
(Intercept) -2.62516  0.21154 154.001   <2e-16 ***
O            0.00498  0.12143   0.002   0.9673
AB           0.73216  0.35381   4.282   0.0385 *
AC           0.25960  0.33579   0.598   0.4395
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Estimated Scale Parameters:
Estimate Std.err
(Intercept)1.277  0.3538

Correlation: Structure = ar1  Link = identity

Estimated Correlation Parameters:
  Estimate  Std.err
alpha0.978 0.005725
Number of clusters:   49   Maximum cluster size: 533


Then, it shows:
Correlation: Link = identity

Why is it not Link = logit?


Thank you,
Sachi



[R] Changing from 32-bit builds to 64-bit builds

2010-04-29 Thread Sachi Ito
Hi,

Probably this is a very simple question for the programmers out there, but how do
you change from 32-bit builds (the default) to 64-bit builds?

I've been trying to run anova() to compare two models, but I get the following
error message:

Error: cannot allocate vector of size 1.2 Gb
R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(3122,0xa0ab44e0) malloc: *** mmap(size=1337688064) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug


I suppose it's a memory-allocation problem caused by the large data size,
so I thought I should use a 64-bit build instead of a 32-bit one.

As recommended in a manual, I entered the following:

> CC='gcc -arch x86_64'
> CXX='g++ -arch x86_64'
> F77='gfortran -arch x86_64'
> FC='gfortran -arch x86_64'
> OBJC='gcc -arch x86_64'

But it still gave me the error.  I'd greatly appreciate it if someone can
answer this question!
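A likely issue (my reading of the transcript, not confirmed in the thread): typed at the R prompt, lines like CC='gcc -arch x86_64' merely create character vectors inside R. They are configure variables and belong in the shell when building R from source, roughly:

```shell
# Sketch only: pass the 64-bit flags to configure in a shell, not in R.
# "R-source-dir" is a placeholder for wherever the R sources were unpacked.
cd R-source-dir
./configure CC='gcc -arch x86_64' \
            CXX='g++ -arch x86_64' \
            F77='gfortran -arch x86_64' \
            FC='gfortran -arch x86_64' \
            OBJC='gcc -arch x86_64'
make
```

Alternatively, a prebuilt 64-bit binary from CRAN (when one is available for the platform) avoids compiling altogether.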

Thank you,
Sachi



[R] Sig differences in Loglinear Models for Three-Way Tables

2010-04-14 Thread Sachi Ito
Hi all,

I've been running log-linear models for three-way tables: one of the
variables has three levels, and the other two have two levels each.

An example looks like below:

> yes.no <- c("Yes","No")
> switch <- c("On","Off")
> att <- c("BB","AA","CC")

> L <- gl(2,1,12,yes.no)
> T <- gl(2,2,12,switch)
> A <- gl(3,4,12,att)

> n <- c(1136,4998,25,339,305,2752,31,692,251,1677,17,557)

> d.table <- data.frame(A,T,L,n)
> d.table
   A   T   L    n
1  BB  On Yes 1136
2  BB  On  No 4998
3  BB Off Yes   25
4  BB Off  No  339
5  AA  On Yes  305
6  AA  On  No 2752
7  AA Off Yes   31
8  AA Off  No  692
9  CC  On Yes  251
10 CC  On  No 1677
11 CC Off Yes   17
12 CC Off  No  557



First, I ran the independence model and found a poor fit:

> library(MASS)
> loglm(n~A+T+L)
Call:
loglm(formula = n ~ A + T + L)

Statistics:
                      X^2 df P(> X^2)
Likelihood Ratio 1001.431  7        0
Pearson          1006.287  7        0



Thus, I went on and ran the two-way association model and found a good fit:

> loglm(n~A:T+A:L+T:L)
Call:
loglm(formula = n ~ A:T + A:L + T:L)

Statistics:
  X^2 df   P(> X^2)
Likelihood Ratio 4.827261  2 0.08948981
Pearson  4.680124  2 0.09632168


I compared the independence model (Model 1), the two-way association model
(Model 2), and the three-way interaction model (Saturated), and found that
the two-way association model was the most parsimonious one:

> ind <- loglm(n~A+T+L)
> twoway <- loglm(n~A:T+A:L+T:L)
> anova(ind,twoway)
LR tests for hierarchical log-linear models

Model 1:
 n ~ T + A + L
Model 2:
 n ~ A:L + A:T + T:L

              Deviance df Delta(Dev) Delta(df) P(> Delta(Dev))
Model 1    1001.430955  7
Model 2       4.827261  2 996.603694         5         0.00000
Saturated     0.000000  0   4.827261         2         0.08949


By running a chi-square test, I found that all three of the two-way
associations are significant.
> drop1(twoway,test="Chisq")
Single term deletions

Model:
n ~ A:T + A:L + T:L
       Df    AIC    LRT   Pr(Chi)
<none>      24.83
A:T     2 645.91 625.08 < 2.2e-16 ***
A:L     2 152.93 132.10 < 2.2e-16 ***
T:L     1 143.60 120.77 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Then, I got the coefficients:
> coef(twoway)
$`(Intercept)`
[1] 5.866527

$A
 BB  AA  CC
 0.27277069 -0.01475892 -0.25801177

$T
   On   Off
 1.156143 -1.156143

$L
  YesNo
-1.225228  1.225228

$A.T
T
AOnOff
  BB  0.4809533 -0.4809533
  AA -0.1783651  0.1783651
  CC -0.3025882  0.3025882

$A.L
L
AYes  No
  BB  0.19166429 -0.19166429
  AA -0.15632604  0.15632604
  CC -0.03533825  0.03533825

$T.L
 L
TYes No
  On   0.2933774 -0.2933774
  Off -0.2933774  0.2933774


I then hand-calculated odds ratios for A x T, A x L, and T x L.

T x L:
θ_TL = exp(4(0.293)) = 3.23

A x L:
θ_AL(BB vs. AA) = exp(2(0.19166) + 2(0.1563)) = 2.01
θ_AL(BB vs. CC) = exp(2(0.19166) + 2(0.03533)) = 1.57

A x T:
θ_AT(BB vs. AA) = exp(2(0.48095) + 2(0.17837)) = 3.74
θ_AT(BB vs. CC) = exp(2(0.48095) + 2(0.30259)) = 4.79



Now, I'd like to know whether BB and AA (or BB and CC) are significantly
different from each other (i.e., whether the odds for BB being 2.01 times
those for AA is significant, whether the odds for BB being 1.57 times those
for CC is significant, and so on).


I'd really appreciate it if someone could answer this question!
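One route to standard errors for those pairwise contrasts (a suggestion of mine, not from the thread): the same log-linear model can be refit as a Poisson GLM, which gives an equivalent fit for these counts but, unlike loglm, returns a covariance matrix. With R's default treatment contrasts and BB as the baseline level of A, the A:L coefficients are log odds-ratio contrasts against BB, so their Wald z tests address the BB-vs-AA and BB-vs-CC questions directly:

```r
# Sketch: the two-way association model as a Poisson GLM, using the
# A, T, L, n objects defined above.
pois <- glm(n ~ A + T + L + A:T + A:L + T:L, family = poisson)
summary(pois)       # the A:L rows test BB vs. AA and BB vs. CC
                    # (coefficient names depend on the contrasts in use)
exp(confint(pois))  # confidence intervals on the multiplicative scale
# For AA vs. CC, refit after relevel(A, ref = "AA")
```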

Thank you,
Sachi



[R] HLM - centering level 2 predictor

2009-06-28 Thread Sachi Ito
Dear R-helpers,

I'm analyzing a data with hierarchical linear model.  I have one level 1
predictor and one level 2 predictor, which looks like below:

fm1 <- lmer(y ~ 1 + x1 + x2 + x1:x2 + (1 + x1 | id.full))

where:
y is the outcome variable.
x1 is the level 1 predictor variable.
x2 is the level 2 predictor variable.
id.full is the conditioned variable.

It runs beautifully when only x1 is centered (I subtracted the mean from
each value).  However, when I also center the x2 variable with the same
procedure, it gives me the following warning message:

Warning message:
In mer_finalize(ans) : singular convergence (7)

I'd appreciate it if someone could explain to me what it means.

One of the differences between the non-centered and the centered values
is that the centered values include negative values.  Could that be the
reason?  If so, what should I do?

Thank you!
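For reference, a common centering idiom plus a diagnostic (a sketch under my assumptions; the singular-convergence warning often signals a variance component estimated at or near zero, rather than a problem with the negative values that centering normally produces):

```r
# Sketch: grand-mean centering with scale(), then inspect the variance
# components of the refit model for near-zero terms.
library(lme4)
x1.c <- as.numeric(scale(x1, center = TRUE, scale = FALSE))
x2.c <- as.numeric(scale(x2, center = TRUE, scale = FALSE))
fm1c <- lmer(y ~ x1.c * x2.c + (1 + x1.c | id.full))
VarCorr(fm1c)  # a random-slope variance near 0 can trigger the warning
```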



[R] coefficient graph

2009-03-17 Thread Sachi Ito
Dear R list members,

I'd like to make a graph of the coefficients of the intercept, variable 1,
and variable 2 (and possibly the interaction between variable 1 and variable
2).  When I use the lmList function as shown below, it produces a nice
coefficient graph.

> PACRP.lis <- lmList(PAffect ~ CRPC + CRPT + CINT | ID, redinteract)
> coef(PACRP.lis)
> PACRPlis.coef <- coef(PACRP.lis)
> plot(PACRPlis.coef)



However, when I'd like to make a graph using the lmer function, it shows the
error message below.

> PACRP <- lmer(PAffect ~ CRPC + CRPT + CINT + (1 + CRPC + CRPT + CINT |ID),
redinteract)
> coef(PACRP)
> PACRP.coef <- coef(PACRP)
> plot(PACRP.coef)
Error in eval(expr, envir, enclos) : object ".grp" not found

Does this mean that I cannot make a graph using the lmer function?  Or do I
need to install another package?  I'd greatly appreciate it if someone could
give me suggestions.
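A workaround that I believe works (hedged; the object layout may differ across lme4 versions): coef() on an lmer fit returns a list with one data frame per grouping factor, which plot() cannot handle directly, but the extracted data frame plots fine:

```r
# Sketch: pull the per-ID coefficient data frame out of the coef() list
# and plot it with base graphics.
PACRP.coef <- coef(PACRP)$ID     # one row per ID, one column per term
pairs(PACRP.coef)                # scatterplot matrix of the coefficients
plot(PACRP.coef$CRPC, PACRP.coef$CRPT,
     xlab = "CRPC slope", ylab = "CRPT slope")
```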


Thank you so much for your help in advance,
Sachi



[R] xyplot of a specific ID number

2009-03-15 Thread Sachi Ito
Dear R list members,
I have a question regarding xyplot.  I managed to make an xyplot of all the
IDs by using the syntax below:

xyplot(PA ~ CRPC + CRPT | ID, data = redinteract)

Now, I'd like to make a graph for a specific ID number (e.g., only ID number
301).  I thought I could use "subset", but it does not seem to work.
Could anyone let me know how I can make a graph for a specific ID?

Thank you for your help in advance!
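Two idioms that should do this (a sketch against the same redinteract data as above):

```r
library(lattice)
# Option 1: subset the data first; droplevels() avoids empty panels
# when ID is a factor with many levels
xyplot(PA ~ CRPC + CRPT, data = droplevels(subset(redinteract, ID == 301)))
# Option 2: xyplot's own subset argument -- an unquoted expression,
# not a character string (a common source of "subset not working")
xyplot(PA ~ CRPC + CRPT, data = redinteract, subset = ID == 301)
```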



[R] How can we predict differences in a slope, given that the random component was significant?

2008-12-12 Thread Sachi Ito
Dear R users,

Using R's lme function, I found that both the fixed and random effects of
variable A on variable B are significant.  Now, I'd like to analyze which
variables predict differences in the slope.  In other words, I'd like to
know which variables (e.g., variable C) predict individual differences in
the effect of A on B.  I have many data points for A and B for each
individual, whereas I have only one data point for C.
I'd appreciate it if anyone could answer the question.
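The standard device for this question (a sketch under my assumptions about the data layout; dat, id, and the variable names are placeholders) is a cross-level interaction: enter C as a fixed effect interacting with A while keeping the random slope for A. Since C has one value per person, it must be copied onto every row belonging to that person:

```r
# Sketch: cross-level interaction in nlme; dat and id are placeholders.
library(nlme)
fm <- lme(B ~ A * C, random = ~ A | id, data = dat)
summary(fm)  # the A:C row tests whether C moderates the slope of A on B
```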

Thank you for your attention.
