Simon,

Many thanks as always for your help.
I see and appreciate the example that you cited, but I'm having a hard time generalizing it to a multivariate case. A bit about my context: my dataset consists of response ratios -- the log of a treatment over a control. One of my explanatory variables is treatment intensity, and as intensity goes to zero, the expectation of the response ratio should also go to zero. Here is the model that I would like to fit:

model <- gam(ResponseRatio ~ s(as.factor(study), bs="re", by=intensity)
             + s(intensity)
             + s(x1, by=intensity)
             + s(x2, by=intensity)
             + te(x1, x2, by=intensity)
             + te(x1, intensity))

Here is the example that you gave:

library(mgcv)
set.seed(0)
n <- 100
x <- runif(n)*4-1; x <- sort(x)
f <- exp(4*x)/(1+exp(4*x)); y <- f + rnorm(100)*0.1; plot(x,y)
dat <- data.frame(x=x, y=y)

## Create a spline basis and penalty, making sure there is a knot
## at the constraint point (0 here, but could be anywhere)
knots <- data.frame(x=seq(-1,3,length=9)) ## create knots

## set up smoother...
sm <- smoothCon(s(x,k=9,bs="cr"),dat,knots=knots)[[1]]

## 3rd parameter is value of spline at knot location 0,
## set it to 0 by dropping...
X <- sm$X[,-3]         ## spline basis
S <- sm$S[[1]][-3,-3]  ## spline penalty
off <- y*0 + .6        ## offset term to force curve through (0, .6)

## fit spline constrained through (0, .6)...
b <- gam(y ~ X - 1 + offset(off), paraPen=list(X=list(S)))
lines(x, predict(b))

## compare to unconstrained fit...
b.u <- gam(y ~ s(x,k=9), data=dat, knots=knots)
lines(x, predict(b.u))

*My question*: how can I extend your example to more than one smooth term, and to several smooth interactions, in the context of a model where E[ResponseRatio] must equal 0 when intensity equals zero? I am not sure that I understand exactly what is going on when you call smoothCon to specify `sm`.

I've written my own penalized spline code from scratch, but it is far less sophisticated than mgcv: it is basically a ridge regression in which I specify a model with a lot of knots and then optimize to get a lambda.
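To make that concrete, here is a minimal sketch of the sort of thing my own code does (the simulated data, knot count, and fixed lambda are all hypothetical for illustration -- in my real code I optimize lambda rather than fixing it):

```r
## Rudimentary penalized spline: truncated-line basis with many knots,
## fitted by ridge regression on the knot coefficients for a fixed lambda.
set.seed(1)
n <- 200
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.2)   ## hypothetical data

K <- 30                                      ## "a lot of knots"
knots <- seq(min(x), max(x), length = K + 2)[-c(1, K + 2)]

## design matrix: intercept, linear term, and truncated-line basis functions
X <- cbind(1, x, outer(x, knots, function(xi, k) pmax(xi - k, 0)))

lambda <- 1                                  ## fixed here; optimized in practice
D <- diag(c(0, 0, rep(1, K)))                ## penalize only the knot coefficients

## ridge solution: beta = (X'X + lambda*D)^{-1} X'y
beta <- solve(crossprod(X) + lambda * D, crossprod(X, y))
fit  <- drop(X %*% beta)
```

In my actual code I pick lambda by minimizing a cross-validation criterion over a grid; mgcv's GCV/REML machinery obviously does this far more elegantly.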
mgcv clearly has a lot more going on, and is far preferable to my rudimentary code for its handling of tensors and random effects.

(Also, for prediction, how can I use "by=dum" AND "by=intensity" at the same time?)

Many thanks,
Andrew

PS: I am aware that interacting my model with an intensity variable makes it quite heteroskedastic. I am thinking of using a cluster wild bootstrap to construct confidence intervals. If a better way forward comes to mind -- especially one that is computationally cheaper -- I'd greatly appreciate it if you could share it.

On 04/17/2013 02:16 AM, Simon Wood wrote:
> hi Andrew.
>
> gam does suppress the intercept, it's just that this doesn't force the
> smooth through the intercept in the way that you would like. Basically
> for the parametric component of the model '-1' behaves exactly like
> it does in 'lm' (it's using the same code). The smooths are 'added on'
> to the parametric component of the model, with sum to zero constraints
> to force identifiability.
>
> There is a solution to forcing a spline through a particular point at
> http://r.789695.n4.nabble.com/Use-pcls-in-quot-mgcv-quot-package-to-achieve-constrained-cubic-spline-td4660966.html
> (i.e. the R help thread "Re: [R] Use pcls in "mgcv" package to achieve
> constrained cubic spline")
>
> best,
> Simon
>
> On 16/04/13 22:36, Andrew Crane-Droesch wrote:
>>> Dear List,
>>>
>>> I've just tried to specify a GAM without an intercept -- I've got one
>>> of the (rare) cases where it is appropriate for E(y) -> 0 as X -> 0.
>>> Naively running a GAM with "-1" appended to the formula and then
>>> calling "predict.gam", I see that the model isn't behaving as expected.
>>>
>>> I don't understand why this would be. Google turns up this old R-help
>>> thread:
>>> http://r.789695.n4.nabble.com/GAM-without-intercept-td4645786.html
>>>
>>> Simon writes:
>>>
>>> *Smooth terms are constrained to sum to zero over the covariate
>>> values. This is an identifiability constraint designed to avoid
>>> confounding with the intercept (particularly important if you have
>>> more than one smooth).* If you remove the intercept from your model
>>> altogether (m2) then the smooth will still sum to zero over the
>>> covariate values, which in your case will mean that the smooth is
>>> quite a long way from the data. When you include the intercept (m1)
>>> then the intercept is effectively shifting the constrained curve up
>>> towards the data, and you get a nice fit.
>>>
>>> Why? I haven't read Simon's book in great detail, though I have read
>>> Ruppert et al.'s Semiparametric Regression. I don't see a reason why
>>> a penalized spline model shouldn't equal the intercept (or zero) when
>>> all of the regressors equal zero.
>>>
>>> Is anyone able to help with a bit of intuition? Or relevant passages
>>> from a good description of why this would be the case?
>>>
>>> Furthermore, why does the "-1" formula specification work if it
>>> doesn't work "as intended" by, for example, lm?
>>>
>>> Many thanks,
>>> Andrew

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.