Re: [R] GAM model with interactions between continuous variables and factors
Just to clarify: gam.1 has wealth both inside the smooths and as a fixed-effect predictor, while gam.2 only has wealth inside the smooths.

Thanks

On Mon, Mar 25, 2013 at 6:09 PM, Antonio P. Ramos ramos.grad.stud...@gmail.com wrote:

Hi all,

I am not sure how to handle interactions with categorical predictors in GAM models. For example, what is the difference between the two models below? Tests indicate that they are different, but their predictions are essentially the same.

Thanks a bunch,

    gam.1 <- gam(mortality.under.2 ~ maternal_age_c + I(maternal_age_c^2) +
                   s(birth_year, by = wealth) +
                   + wealth + sex +
                   residence + maternal_educ + birth_order,
                 data = rwanda2, family = binomial)

    gam.2 <- gam(mortality.under.2 ~ maternal_age_c + I(maternal_age_c^2) +
                   s(birth_year, by = wealth) +
                   + sex +
                   residence + maternal_educ + birth_order,
                 data = rwanda2, family = binomial)

    anova(gam.1, gam.2, test = "Chi")

    Analysis of Deviance Table

    Model 1: mortality.under.2 ~ maternal_age_c + I(maternal_age_c^2) +
        s(birth_year, by = wealth) + +wealth + sex + residence +
        maternal_educ + birth_order
    Model 2: mortality.under.2 ~ maternal_age_c + I(maternal_age_c^2) +
        s(birth_year, by = wealth) + +sex + residence + maternal_educ +
        birth_order
      Resid. Df Resid. Dev      Df Deviance  Pr(>Chi)
    1     28986      24175
    2     28989      24196 -3.6952  -21.378 0.0001938 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    str(rwanda2)
    'data.frame': 29027 obs. of 18 variables:
     $ CASEID            : Factor w/ 10718 levels 1 5 2,..: 289 2243 7475 9982 6689 10137 7426 428 8415 10426 ...
     $ mortality.under.2 : int 0 1 0 0 0 0 0 0 1 0 ...
     $ maternal_age_disct: Factor w/ 3 levels -25,+35,25-35: 1 1 1 1 1 1 3 1 3 1 ...
     $ maternal_age      : int 18 21 21 23 21 22 26 18 27 21 ...
     $ time              : int 3 3 3 3 3 3 3 3 3 3 ...
     $ child_mortality   : num 0.232 0.232 0.232 0.232 0.232 ...
     $ democracy         : Factor w/ 1 level dictatorship: 1 1 1 1 1 1 1 1 1 1 ...
     $ wealth            : Factor w/ 5 levels Lowest quintile,..: 2 4 1 4 5 1 4 1 4 5 ...
     $ birth_year        : int 1970 1970 1970 1970 1970 1970 1970 1970 1970 1970 ...
     $ residence         : Factor w/ 2 levels Rural,Urban: 1 1 1 1 2 1 1 1 1 2 ...
     $ birth_order       : int 1 2 2 5 1 1 3 1 2 2 ...
     $ maternal_educ     : Factor w/ 4 levels Higher,No education,..: 3 2 2 3 4 2 3 2 2 2 ...
     $ sex               : Factor w/ 2 levels Female,Male: 1 1 2 2 1 1 2 2 2 2 ...
     $ quinquennium      : Factor w/ 7 levels 00-5's,70-4,..: 2 2 2 2 2 2 2 2 2 2 ...
     $ time.1            : int 3 3 3 3 3 3 3 3 3 3 ...
     $ new_time          : int 0 0 0 0 0 0 0 0 0 0 ...
     $ maternal_age_c    : num -6.12 -3.12 -3.12 -1.12 -3.12 ...
     $ birth_year_c      : num -14.8 -14.8 -14.8 -14.8 -14.8 ...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
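For readers comparing the two formulas, the difference can be written out on the linear-predictor scale. This is only a sketch in notation that does not appear in the thread: $p_i$ is the probability of death before age 2 for child $i$, $\mathbf{z}_i$ collects the other parametric terms (maternal age and its square, sex, residence, maternal education, birth order), $w(i)$ is the child's wealth quintile, $t_i$ is the birth year, $\alpha_k$ is the wealth main effect (zero for the reference quintile), and $f_k$ is the smooth for quintile $k$. It assumes gam() here is the one from the mgcv package, since `by` is an argument of mgcv's s().

```latex
\begin{align*}
\texttt{gam.1:}\quad \operatorname{logit}(p_i) &= \beta_0 + \mathbf{z}_i^{\top}\boldsymbol{\beta} + \alpha_{w(i)} + f_{w(i)}(t_i) \\
\texttt{gam.2:}\quad \operatorname{logit}(p_i) &= \beta_0 + \mathbf{z}_i^{\top}\boldsymbol{\beta} + f_{w(i)}(t_i)
\end{align*}
```

The two models differ only in the $\alpha_{w(i)}$ term, which is consistent with the anova table comparing them on roughly four degrees of freedom (five wealth quintiles, one of them the reference level).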
Re: [R] GAM model with interactions between continuous variables and factors
Hi Antonio,

If wealth is a factor variable, you should include its main effect in the model as well, because the smooths will be centered: each by-level smooth is constrained to sum to zero, so it describes the shape of the birth_year effect within a wealth level but not differences in overall level between wealth groups.

Cheers,

Josh

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com
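The point about centered smooths can be checked on simulated data. The sketch below is not from the thread: it assumes the mgcv package, and the data set `dat`, the variables `y`, `x`, `g`, and the model names `m_no_main` and `m_main` are all made up for illustration. It simulates a binary outcome whose groups differ in both level and shape, then fits the by-factor smooths with and without the factor main effect.

```r
library(mgcv)

set.seed(1)
n <- 3000

# Hypothetical data: a three-level factor and one continuous covariate
g <- factor(sample(c("low", "mid", "high"), n, replace = TRUE),
            levels = c("low", "mid", "high"))
x <- runif(n)

# True linear predictor: a group-specific level plus a group-specific shape
level <- c(low = -1, mid = 0, high = 1)[as.character(g)]
shape <- ifelse(g == "low", sin(2 * pi * x),
         ifelse(g == "mid", 0.5 * cos(2 * pi * x), 0))
y <- rbinom(n, 1, plogis(level + shape))
dat <- data.frame(y, x, g)

# By-factor smooths only: each smooth is centered, so the level
# differences between groups have no term that can represent them
m_no_main <- gam(y ~ s(x, by = g), data = dat,
                 family = binomial, method = "REML")

# Factor main effect plus by-factor smooths
m_main <- gam(y ~ g + s(x, by = g), data = dat,
              family = binomial, method = "REML")

# Compare the fits
print(AIC(m_no_main, m_main))

# Mean fitted probability per group next to the observed proportion
print(rbind(observed = tapply(dat$y, dat$g, mean),
            no_main  = tapply(fitted(m_no_main), dat$g, mean),
            main     = tapply(fitted(m_main), dat$g, mean)))
```

In this setup the group levels were deliberately made very different, so the gap between the two fits is large. When the groups differ little in level, as may be the case in the data discussed in the thread, the two sets of predictions can look similar even though a test still separates the models.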
Re: [R] GAM model with interactions between continuous variables and factors
Just to clarify: I should include wealth (the categorical variable) as a fixed effect *and* within the smooth using the by argument. Is that correct?

Thanks a bunch

On Mon, Mar 25, 2013 at 6:18 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

Hi Antonio, If wealth is a factor variable, you should include the main effect in the model, as the smooths will be centered. Cheers, Josh
Re: [R] GAM model with interactions between continuous variables and factors
Yep that's exactly right! :)

On Mon, Mar 25, 2013 at 6:22 PM, Antonio P. Ramos ramos.grad.stud...@gmail.com wrote:

Just to clarify: I should include wealth - the categorical variable - as a fixed effects *and* within the smooth using the argument by. It that correct? thanks a bunch

-- 
Joshua Wiley
Re: [R] GAM model with interactions between continuous variables and factors
Thanks!

On Mon, Mar 25, 2013 at 6:25 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

Yep that's exactly right! :)