Thanks a lot, Simon. That was a silly mistake on my part. After correcting, there is still a slight difference between the p-values, but this is so small that I guess it is just numerics.
Best, Øystein On Tue, Oct 2, 2018 at 2:26 PM Simon Wood <simon.w...@bath.edu> wrote: > Dear Øystein, > > In your code 'id' is set up to be numeric. In your first gamm call it > gets coerced to a factor (since nothing else would make sense). In your > second gamm call it is simply treated as numeric, since that could makes > sense. To make the two models equivalent you just need to substitute: > > id <- as.factor(rep(1:ng, n/ng)) > > into your loop. > > best, > Simon > > On 01/10/18 08:15, Øystein Sørensen wrote: > > I have longitudinal data where a response y_{ij} has been measured > > repeatedly for i=1,...,n individuals, j=1,...,m times for each > individual. > > Let x_{ij} be the age of individual i at her/his jth measurement. I wish > to > > see how the response varies with age, and have reason to believe that a > > nonlinear fit is needed. I hence wish to model the relationship using an > > additive mixed model of the form y_{ij} = f(x_{ij}) + b_{i} + > > \epsilon_{ij}, where f() is some smooth function to be estimated, b_{i} > > \sim N(0, \sigma_{b}^{2}) is the random effect for individual i, > > \epsilon_{ij} \sim N(0, \sigma^{2}) is the residual. > > > > Reading the documentation to the mgcv package, I see that such models can > > be set up by calling the mgcv::gamm two different ways. One way shown to > > set up such a model is with the call > > b1 <- gamm(y ~ s(x), random = list(id =~ 1)), > > where id is an indicator of the specific individual. In this way, the > > random effect is specified in a list. The other way is to set up the > random > > effect with a smooth of the re: > > b2 <- gamm(y ~ s(x) + s(id, bs = "re")). > > > > As far as I can understand, these two setups should create the same > > underlying model matrices. However, when running them on my data, the > > p-values for the smooth terms, as well as their estimated degrees of > > freedom, are different. > > > > Below is a reproducible example on simulated data, which show that these > > two types of specifying the models give different p-values and estimated > > degrees of freedom. On my real data, these differences are sometimes even > > more exaggerated. > > > > My question is: Are not these two calls to gamm suppose to estimate the > > same model, or have I misunderstood? > > > > Here is a reproducible example: > > > > library(mgcv) > > set.seed(100) > > n_sim <- 100 > > df <- data.frame( > > model1_pval = numeric(n_sim), > > model1_edf = numeric(n_sim), > > model2_pval = numeric(n_sim), > > model2_edf = numeric(n_sim) > > ) > > > > # Number of observations > > n <- 500 > > # Number of groups > > ng <- 100 > > > > for(i in 1:n_sim){ > > # Draw the fixed effect covariate > > x <- rnorm(n) > > # Set up the group > > id <- rep(1:ng, n/ng) > > # Draw the random effect > > z <- rnorm(ng) > > # Draw the response > > y <- 0.1 * x + 0.2 * x^3 + z[id] + rnorm(n) > > > > # Fit the two different models > > b1 <- gamm(y ~ s(x, k = 5), random = list(id =~ 1)) > > b2 <- gamm(y ~ s(x, k = 5) + s(id, bs = "re")) > > > > df[i, "model1_pval"] <- anova(b1$gam)$s.pv > > df[i, "model1_edf"] <- anova(b1$gam)$edf > > df[i, "model2_pval"] <- anova(b2$gam)$s.pv[[1]] > > df[i, "model2_edf"] <- anova(b2$gam)$edf[[1]] > > } > > > > # Differences between model 1 and model 2 p values > > mean(df$model1_pval) > > #> [1] 6.790546e-21 > > mean(df$model2_pval) > > #> [1] 3.090694e-14 > > max(abs(df$model1_pval - df$model2_pval)) > > #> [1] 2.770545e-12 > > > > # Differences between model1 and model 2 estimated degrees of freedom > > mean(df$model1_edf) > > #> [1] 3.829992 > > mean(df$model2_edf) > > #> [1] 3.731438 > > max(abs(df$model1_edf - df$model2_edf)) > > #> [1] 0.6320281 > > > > > > Thanks in advance, > > Øystein Sørensen > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.