Hi,

there is no automatic stepwise variable selection in the mgcv package. You should remove the superfluous terms manually. You can identify them with a likelihood-based test (anova() on two nested fits), by comparing AIC values, or by inspecting the output of the plot function.
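As a rough illustration of the AIC comparison, here is a minimal sketch. It assumes data simulated with mgcv's gamSim() plus one made-up spurious predictor (x4); the object names m_full, m_red and aic_tab are only illustrative.

```r
library(mgcv)

set.seed(1)
n <- 200
# simulated Poisson data with four informative covariates (x0..x3)
dat <- gamSim(1, n = n, scale = 0.15, dist = "poisson", verbose = FALSE)
# assumption: one extra predictor that is unrelated to y
dat$x4 <- runif(n, 0, 1)

# model with the spurious term, fitted by ML
m_full <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3) + s(x4),
              data = dat, family = poisson, method = "ML")
# model without the spurious term
m_red <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3),
             data = dat, family = poisson, method = "ML")

# one row per model; the lower AIC is preferred
aic_tab <- AIC(m_full, m_red)
print(aic_tab)
```

If the two AIC values are close, the simpler model is usually the safer choice.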
An example:

library(mgcv)

set.seed(3)
n <- 200
## simulate data
dat <- gamSim(1, n = n, scale = .15, dist = "poisson")
str(dat)
## spurious predictors
dat$x4 <- runif(n, 0, 1)
dat$x5 <- runif(n, 0, 1)
## full model
b1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3) + s(x4) + s(x5), data = dat, family = poisson)
summary(b1)  # you can pick out superfluous predictors based on this output
## reduced model without x5
b2 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3) + s(x4), data = dat, family = poisson)
anova(b1, b2, test = "Chisq")  # comparing the two models
plot(b1, pages = 1)  # the smooth function is a nearly horizontal line for superfluous predictors

Setting select=TRUE may give a clearer pattern; however, in my toy example the difference is small.

Best wishes
Zoltan

On 2011.05.24 at 5:21, ARISTIDES LOPEZ wrote:
> Hello all,
>
> Just a question: I'm trying to fit my model through stepwise
> selection. At this point (with the valuable help of Gavin and Ben) my
> model looks like this:
>
> model 1<-gam(Young (No. ind)~s(Lat, k=6)+s(Long, k=6)+s(Deep, k=6)+s(Area
> (km2),k=6)+as.factor (year),family=poisson,data=L. synagris)
>
> I have 4 species * 3 groups (young, adult and total) * 5 explanatory
> variables (Lat, Lon, Deep, Area, Year). So I'm looking for a stepwise
> algorithm that helps me to select the best model. I tried step() from
> the stats package but R gives me the following error message:
>
> "Error in glm.control(irls.reg = 0, epsilon = 1e-06, maxit = 100, trace =
> FALSE, : unused argument(s) (irls.reg = 0, mgcv.tol =
> 1e-07, mgcv.half = 15,..............."
>
> Any suggestion?
>
> Cheers
>
>
> Date: Wed, 18 May 2011 10:53:41 -0500
> From: ARISTIDES LOPEZ<aristides...@gmail.com>
> To: r-sig-ecology@r-project.org
> Subject: [R-sig-eco] Error message in GAM
> Message-ID:<BANLkTikz-dQ=jv9ykftggeyo5ubwmcu...@mail.gmail.com>
> Content-Type: text/plain
>
> Dear members list,
>
> I'm trying to make a model to describe the distribution of demersal fishes
> in the Colombian Caribbean Sea.
> I have a data set of n = 56; the model is like this:
> Density (ind/km2) ~ s(Lat) + s(Long) + s(deep). The problem is
> that R gives me the error message *"Model has more coefficients than data"*.
>
> Does anybody know how I can avoid this?
>
> Faithfully.
>
> --
> Aristides López-Peña
>
>
> Date: Wed, 18 May 2011 17:48:04 +0100
> From: Gavin Simpson<gavin.simp...@ucl.ac.uk>
> To: ARISTIDES LOPEZ<aristides...@gmail.com>
> Cc: r-sig-ecology@r-project.org
> Subject: Re: [R-sig-eco] Error message in GAM
> Message-ID:<1305737284.25148.15.ca...@prometheus.geog.ucl.ac.uk>
> Content-Type: text/plain; charset="UTF-8"
>
> Each of your smooths will be using k = 10 degrees of freedom so that is
> 30 degrees of freedom already, which is a lot for a data set of 56
> observations.
>
> Are all the data unique? i.e. you have 56 unique density values, 56
> unique lats, 56 unique lons etc. If not, it might be that the unique
> information in the data is not sufficient to support the complexity of
> the smooths.
>
> My money would be on that you did something you haven't actually told
> us, and have more smooths in the model than you say and they are using
> more degrees of freedom than it appears to us.
>
> The easy way to try to solve the problem will be to restrict the
> complexity of the individual smooths:
>
> response ~ s(Lat, k = 6) + s(Long, k = 6) + s(deep, k = 6)
>
> for example.
>
> You could probably model these data as a Poisson with an offset term for
> the km2 covered by each sample, rather than treating these as a density.
>
> HTH,
>
> G
>
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
> ------------------------------
>
> Message: 9
> Date: Wed, 18 May 2011 17:16:10 -0500
> From: ARISTIDES LOPEZ<aristides...@gmail.com>
> To: r-sig-ecology@r-project.org, gavin.simp...@ucl.ac.uk
> Subject: Re: [R-sig-eco] Error message in GAM
> Message-ID:<banlktimuqhnjhdox9lnnddt60gsiwnx...@mail.gmail.com>
> Content-Type: text/plain
>
> Dear Dr. Gavin,
>
> Thank you very much for your help. All my data are unique (because I have 56
> different stations). As you suggested, I restricted the
> complexity of the individual smooths:
>
> response ~ s(Lat, k = 9) + s(Long, k = 9) + s(deep, k = 9)
>
> Problem solved.
>
> Now I am trying to fit another model:
>
> modelo2<-gam(Density~s(year, k=6)+s(Month, k=6)+s(rainfall, k=6),
> family=Gamma, data=at)
>
> The "new" problem is that R gives me the following error: *"Error in
> smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) :
> A term has fewer unique covariate combinations than specified maximum
> degrees of freedom"*.
>
> Does anybody know what this means?
>
> Regards.
>
> --
> Aristides López-Peña
>
> ------------------------------
>
> Message: 10
> Date: Wed, 18 May 2011 18:28:20 -0400
> From: Ben Bolker<bbol...@gmail.com>
> To: r-sig-ecology@r-project.org
> Subject: Re: [R-sig-eco] Error message in GAM
> Message-ID:<4dd44804.1020...@gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> It means you're pushing your data too hard: how about being
> old-fashioned and fitting quadratic models [e.g. poly(Lat,2)] for each
> of your predictor variables (this of course ignores interactions, which
> you might ?? want to worry about in some cases -- but you probably
> can't). In principle, gam() in the mgcv package (which is what I assume
> you are using) tries to adjust the degree of complexity of your model
> downward as appropriate, but it may be having a hard time doing so; can
> you set k lower? For the models that do succeed, I would suspect that
> the effective degrees of freedom fitted are much lower than the k values
> you are specifying, so you could afford to reduce them (see ?choose.k )
>
> Remember the rule of thumb that you should not be trying to fit more
> than *at most* N/10 parameters, where N is your number of points -- so
> quadratic models of 3 independent predictors (= 7 parameters, intercept
> + 2 for each predictor variable) would already be overfitting slightly.
>
> cheers
> Ben Bolker
>
> ------------------------------
>
> Message: 11
> Date: Thu, 19 May 2011 07:35:39 +0100
> From: Gavin Simpson<gavin.simp...@ucl.ac.uk>
> To: ARISTIDES LOPEZ<aristides...@gmail.com>
> Cc: r-sig-ecology@r-project.org
> Subject: Re: [R-sig-eco] Error message in GAM
> Message-ID:<1305786939.2773.3.ca...@chrysothemis.geog.ucl.ac.uk>
> Content-Type: text/plain; charset="UTF-8"
>
> It means exactly what it says. One of the terms in the model:
>
> * s(year, k = 6)
> * s(Month, k = 6)
> * s(rainfall, k = 6)
>
> has *fewer* than 6 unique values. Look at the outputs from
>
> with(at, table(year))
> with(at, table(Month))
> with(at, table(rainfall))
>
> to see which it(they) is(are).
>
> G
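The check of unique values against k can also be done in one pass. The sketch below is only an illustration: the data frame is a made-up stand-in for the poster's `at` (its real contents are not shown in the thread), and the names n_unique and too_few are hypothetical.

```r
# Hypothetical stand-in for the poster's data frame `at` (56 rows)
at <- data.frame(
  year     = rep(2005:2008, each = 14),        # assumption: only 4 distinct years
  Month    = rep(1:12, length.out = 56),       # 12 distinct months
  rainfall = seq(10, 285, length.out = 56)     # 56 distinct values
)

k <- 6  # basis dimension requested for every smooth in the failing model

# number of distinct values per covariate
n_unique <- sapply(at[c("year", "Month", "rainfall")],
                   function(x) length(unique(x)))
print(n_unique)

# covariates that cannot support a smooth with basis dimension k
too_few <- names(n_unique)[n_unique < k]
print(too_few)
```

For a covariate flagged this way, one option is a smaller k; another, as in the later model that uses as.factor(year), is to enter it as a factor instead of a smooth.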
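To make two suggestions from the thread concrete (a Poisson model with an area offset and k = 6, and plain quadratic terms), here is a sketch on simulated data. Everything in it is an assumption for illustration: the column names count and area_km2, the simulated values, and the object names m_gam and m_quad are not the poster's.

```r
library(mgcv)

set.seed(10)
n <- 56  # same number of stations as in the thread
d <- data.frame(
  Lat      = runif(n, 10, 12),
  Long     = runif(n, -76, -72),
  deep     = runif(n, 10, 200),
  area_km2 = runif(n, 0.5, 2)    # assumed area covered by each sample
)
# simulated counts: expected density varies with depth, scaled by sampled area
d$count <- rpois(n, lambda = d$area_km2 *
                   exp(1 + 0.01 * d$deep - 0.00005 * d$deep^2))

# counts with a log(area) offset instead of modelling density directly;
# k = 6 gives 5 coefficients per smooth after the identifiability constraint
m_gam <- gam(count ~ s(Lat, k = 6) + s(Long, k = 6) + s(deep, k = 6) +
               offset(log(area_km2)),
             family = poisson, data = d, method = "REML")
length(coef(m_gam))          # intercept + 3 * 5 coefficients
edf <- summary(m_gam)$edf    # effective degrees of freedom actually used
print(round(edf, 2))

# the "old-fashioned" alternative: quadratic terms, 7 parameters in total
m_quad <- glm(count ~ poly(Lat, 2) + poly(Long, 2) + poly(deep, 2) +
                offset(log(area_km2)),
              family = poisson, data = d)
length(coef(m_quad))
```

Comparing the fitted edf with the upper limit of k - 1 per smooth gives a quick sense of whether k could be lowered further; ?choose.k discusses this in more detail.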
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology