[R] bsts posterior distributions
hello I couldn't find previous posts to answer this, but please point me to any. I am trying to understand bsts, starting with no regressors Here is code which appears to mimic bsts, producing graphs similar to the model plot, but gives a rather different posterior distribution for the parameters. I may misunderstand SdPrior, or perhaps the differences are with mcmc. Thanks for any help Greg ### library(bsts) library(Boom) library(mcmc) #simulate some data y<-rep(NA,50) y[1]=1 y[2]=1 s=2 set.seed(5) for (k in 1:48) { y[2+k]=y[1+k]+0.1*y[k]+s*rnorm(1) } plot(1:50,y[1:50],main=paste("seed =",j)) #bsts model ss<-AddLocalLevel(list(),y) mod1<-bsts(y,state.specification=ss,niter=1000) plot(mod1) #reproduce the plot using mod1$state.contributions par(mfrow=c(1,2)) plot(1:50,y,col="blue",ylim=c(-12,8),main="quantiles of mod1$state.contributions") for (i in 1:99) { qi<-qin<-rep(NA,50) tj<-1:50 for (j in 1:50) { qi[j]<-quantile(mod1$state.con[501:1000,1,j],(i-0.5)/100) qin[j]<-quantile(mod1$state.con[501:1000,1,j],(i+1-0.5)/100) } polygon(c(tj,rev(tj)),c(qi[1:50],rev(qin[1:50])),col=rgb(0,0,0,45*dnorm(i,mean=50,sd=18)),border=FALSE) } plot(mod1,ylim=c(-12,8),main="plot(mod1)") par(mfrow=c(1,1)) #the state specification is based on sd(y) sd(y) ss #the likelihood, using the kalman filter, as a function of the error sd's kal<-function(par) { a<-par[1] b<-par[2] H=matrix(1,1,1) F=matrix(1,1,1) #1-dimensional state N=50 dim(y)=c(1,N) xe<-ye<-ze<-matrix(NA,1,N) w<-rnorm(1) xe[,1]<-1 ye[,1]<-H%*%xe[,1] ze[,1]<-y[,1]-ye[,1] P<-K<-array(data=NA,dim=c(N,1,1)) #P[1,,] initial guess P[1,,]<-1 K[1,,]<-1 xe[1,1]<-1 for (i in 1:(N-1)) { P[i+1,,]<-F%*%P[i,,]%*%t(F)+a^2 K[i+1,,]<-P[i+1,,]%*%t(H)%*%solve(b^2+H%*%P[i+1,,]%*%t(H)) xe[1,i+1]<-F%*%xe[,i]+K[i+1,,]%*%(y[,i+1]-H%*%F%*%xe[,i]) P[i+1,,]<-(diag(1,1)-K[i+1,,]%*%H)%*%P[i+1,,] } -1/2*log(b^2)-1/2*sum(((y[1,]-xe[1,])/b)^2) } #independent priors on a and b #bsts uses sd(y) to set both priors. #for a (ss[[1]]$sigma.prior) it uses SdPrior with #$prior.guess 0.04600655, $prior.df 0.01, $initial.value 0.04600655, $upper.limit 4.600655 #for b (mod1$prior) it uses SdPrior with #$prior.guess 4.600655, $prior.df 0.01, $initial.value 4.600655, $upper.limit 5.520786 #try an inverse gamma prior (on a^2, then converted to a prior on a) lpr1<-function(a) { v=0.01 ifelse((a<=0)|a>sd(y),-Inf,-(v/2+1)*log(a^2)-v*(sd(y)/100)^2/(2*a^2)+1/2*log(a^2)) } lpr2<-function(b) { v=0.01 ifelse((b<=0)|b>1.2*sd(y),-Inf,-(v/2+1)*log(b^2)-v*(sd(y))^2/(2*b^2)+1/2*log(b^2)) } lpost1<-function(par) { a<-par[1] b<-par[2] lpr1(a)+lpr2(b)+kal(par) } nlpost1<-function(par) -lpost1(par) pop<-optim(c(2,1),nlpost1) pop<-optim(pop$par,nlpost1,hessian=T) pop lpost1m<-function(par) lpost1(par+pop$par) nlpost1m<-function(par) -lpost1m(par) pop1m<-optim(c(0,0),nlpost1m) pop1m<-optim(pop1m$par,nlpost1m,hessian=T) pop1m sc<-chol(pop1m$hess) sd<-diag(sc) nb=5000 #with batches, spacing, ~5 min run outm<-metrop(lpost1m,1e-2*sd,nb,blen=5,nspac=5) samm<-outm$batch dim(samm) acf(samm) samm<-thin(samm,5) dim(samm) acf(samm) par(mfrow=c(2,1)) plot(samm[501:1000,1],type="l") plot(samm[501:1000,2],type="l") par(mfrow=c(1,1)) #plot kalman using samm from lpost1m ax<-samm[501:1000,1]+pop$par[1] bx<-samm[501:1000,2]+pop$par[2] state<-matrix(NA,500,50) for (j in 1:500) { a<-ax[j] b<-bx[j] H=matrix(1,1,1) F=matrix(1,1,1) N=50 dim(y)=c(1,N) xe<-ye<-ze<-matrix(NA,1,N) xe[,1]<-1 ye[,1]<-H%*%xe[,1] ze[,1]<-y[,1]-ye[,1] P<-K<-array(data=NA,dim=c(N,1,1)) P[1,,]<-0 for (i in 1:(N-1)) { P[i+1,,]<-F%*%P[i,,]%*%t(F)+a^2 K[i+1,,]<-P[i+1,,]%*%t(H)%*%solve(b^2+H%*%P[i+1,,]%*%t(H)) P[i+1,,]<-(diag(1,1)-K[i+1,,]%*%H)%*%P[i+1,,] xe[1,i+1]<-F%*%xe[,i]+K[i+1,,]%*%(y[,i+1]-H%*%F%*%xe[,i]) } P[1,,]<-P[2,,] state[j,]<-rnorm(N,xe[1,1:N],P[1:N,,]^0.5) } par(mfrow=c(1,2)) plot(1:N,y,col="blue",ylim=c(-12,8)) for (i in 1:99) { qi<-qin<-rep(NA,N) tj<-1:N for (k in 1:N) { qi[k]<-quantile(state[,k],(i-0.5)/100) qin[k]<-quantile(state[,k],(i+1-0.5)/100) } polygon(c(tj,rev(tj)),c(qi[1:N],rev(qin[1:N])),col=rgb(0,0,0,45*dnorm(i,mean=50,sd=18)),border=FALSE) } plot(mod1,ylim=c(-12,8)) par(mfrow=c(1,1)) #very plausible, but mean(ax) mean(mod1$sigma.level[501:1000]) mean(bx) mean(mod1$sigma.obs[501:1000]) par(mfrow=c(1,2)) qqplot(ax,mod1$sigma.level[501:1000]) lines(ax,ax,col="red") qqplot(bx,mod1$sigma.obs[501:1000]) lines(bx,bx,col="red") par(mfrow=c(1,1)) #ax not sampling sigma.level #bx not quite sampling sigma.obs __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mgcv: distribution of dev with Poisson data
Dear Prof Ripley yes, but if the estimate is biased it's good to know what the bias is. The problem illustrated in the simulations has nothing to do with ML, though, as the default fitting method in mgcv when scale is unknown is "GCV" and that is what was used, by default, here. The point about ML was that I'd looked separately at what happens when scale varies in fitting a quasipoisson model, and found that ML was minimised (in an example) at the scale estimate (i.e. the default estimate using dev). A very brief look at the code made that seem plausible as it does involve the derivatives with respect to scale. In my limited experience ML is actually better behaved than GCV or REML, giving more conservative estimates and tighter CIs which behave better under simulation. re the Pearson estimates being preferable, that is also what Simon Wood's book says on p172, another reason why I was puzzled that the default estimate used dev instead. thanks Greg > On 05/02/2014 12:56, Greg Dropkin wrote: >> thanks Simon >> also, it appears at least with ML that the default scale estimate with quasipoisson (i.e. using dev) is the scale which minimises the ML value of >> the fitted model. So it is the "best" model but doesn't actually give the >> correct mean-variance relation. Is that right? > > Well, estimates do not usually give the correct value > > But ML is biased in many cases, sometimes badly so here, and most people prefer the Pearson estimator of the dispersion. This is McCullagn & Nelder's book, even the first edition, but without much evidence: Venables & Ripley (2002, ยง7.5) has plots to back up their hunch. > >> thanks again >> Greg >>> Greg, >>> The deviance being chi^2 distributed on the residual degrees of freedom >>> works in some cases (mostly where the response itself can be reasonably >>> approximated as Gaussian), but rather poorly in others (noteably low count data). This is what you are seeing in your simulations - in the first case the Poisson mean is relatively high - so the normal approximation to the Poisson is not too bad and the deviance is approx. >>> chi.sq on residual df. In the second case the Poisson mean is low, there >>> are lots of zeroes, and the approximation breaks down. So yes, the Pearson estimator is probably a better bet in the latter case. best, >>> Simom >>> ps. Note that the chi squared approximation for the *difference* in deviance between two nested models does not suffer from this problem. On 04/02/14 09:25, Greg Dropkin wrote: >>>> mgcv: distribution of dev >>>> hi >>>> I can't tell if this is a simple error. >>>> I'm puzzled by the distribution of dev when fitting a gam to Poisson generated data. >>>> I expected dev to be approximately chi-squared on residual d.f., i.e. about 1000 in each case below. >>>> In particular, the low values in the 3rd and 4th versions would suggest >>>> scale < 1, yet the data is Poisson generated. >>>> The problem isn't caused by insufficient k values in the smoother. Does this mean that with sparse data the distribution of dev is no longer >>>> approx chi-sq on residual df? >>>> Does it mean the scale estimate when fitting quasipoisson should be the >>>> Pearson version? >>>> thanks >>>> greg >>>> library(mgcv) >>>> set.seed(1) >>>> x1<-runif(1000) >>>> linp<-2+exp(-2*x1)*sin(8*x1) >>>> sum(exp(linp)) >>>> dev1<-dev2<-sums<-rep(0,20) >>>> for (j in 1:20) >>>> { >>>> y<-rpois(1000,exp(linp)) >>>> sums[j]<-sum(y) >>>> m1<-gam(y~s(x1),family="poisson") >>>> m2<-gam(y~s(x1,k=20),family="poisson") >>>> dev1[j]=m1$dev >>>> dev2[j]=m2$dev >>>> } >>>> mean(sums) >>>> sd(sums) >>>> mean(dev1) >>>> sd(dev1) >>>> mean(dev2) >>>> sd(dev2) >>>> #dev slighly higher than expected >>>> linpa<--1+exp(-2*x1)*sin(8*x1) >>>> sum(exp(linpa)) >>>> dev1a<-dev2a<-sumsa<-rep(0,20) >>>> for (j in 1:20) >>>> { >>>> y<-rpois(1000,exp(linpa)) >>>> sumsa[j]<-sum(y) >>>> m1<-gam(y~s(x1),family="poisson") >>>> m2<-gam(y~s(x1,k=20),family="poisson") >>>> dev1a[j]=m1$dev >>>> dev2a[j]=m2$dev >>>> } >>>> mean(sumsa) >>>> sd(sumsa) >>>> mean(dev1a) >>>> sd(dev1a) >>>
Re: [R] mgcv: distribution of dev with Poisson data
hi Simon yes, I also got the right shape of the mean-variance relation but the wrong estimate of the parameter. thanks very much Greg > Hi Greg, > > Yes, this sounds right - with quasipoisson gam uses `extended > quasi-likelihood' (see McCullagh and Nelder's GLM book) to allow > estimation of the scale parameter along with the smoothing parameters > via (RE)ML, and it could well be that this gives a biased scale estimate > with low counts (although the shape of the mean variance relationship is > unaffected by this). > > It might well be more sensible to default to a Pearson estimate of the > scale parameter for 'gam' ('bam' and 'gamm' already do this), to avoid > this issue with quasipoisson and low counts. Your email reminded me that > someone had reported rather too high p-values for quasipoisson with > these sort of data, and looking back at my list of open issues I see > that `someone' was you as well! It's possible that moving to a Pearson > estimator of the scale will solve that problem too. > > Thanks for this... very helpful. > > best, > Simon > > > On 05/02/14 12:56, Greg Dropkin wrote: >> thanks Simon >> >> also, it appears at least with ML that the default scale estimate with >> quasipoisson (i.e. using dev) is the scale which minimises the ML value >> of >> the fitted model. So it is the "best" model but doesn't actually give >> the >> correct mean-variance relation. Is that right? >> >> thanks again >> >> Greg >> >>> Greg, >>> >>> The deviance being chi^2 distributed on the residual degrees of freedom >>> works in some cases (mostly where the response itself can be reasonably >>> approximated as Gaussian), but rather poorly in others (noteably low >>> count data). This is what you are seeing in your simulations - in the >>> first case the Poisson mean is relatively high - so the normal >>> approximation to the Poisson is not too bad and the deviance is approx. >>> chi.sq on residual df. In the second case the Poisson mean is low, >>> there >>> are lots of zeroes, and the approximation breaks down. So yes, the >>> Pearson estimator is probably a better bet in the latter case. >>> >>> best, >>> Simom >>> >>> ps. Note that the chi squared approximation for the *difference* in >>> deviance between two nested models does not suffer from this problem. >>> >>> >>> On 04/02/14 09:25, Greg Dropkin wrote: >>>> mgcv: distribution of dev >>>> >>>> hi >>>> >>>> I can't tell if this is a simple error. >>>> I'm puzzled by the distribution of dev when fitting a gam to Poisson >>>> generated data. >>>> I expected dev to be approximately chi-squared on residual d.f., i.e. >>>> about 1000 in each case below. >>>> In particular, the low values in the 3rd and 4th versions would >>>> suggest >>>> scale < 1, yet the data is Poisson generated. >>>> The problem isn't caused by insufficient k values in the smoother. >>>> Does this mean that with sparse data the distribution of dev is no >>>> longer >>>> approx chi-sq on residual df? >>>> Does it mean the scale estimate when fitting quasipoisson should be >>>> the >>>> Pearson version? >>>> >>>> thanks >>>> >>>> greg >>>> >>>> library(mgcv) >>>> set.seed(1) >>>> x1<-runif(1000) >>>> linp<-2+exp(-2*x1)*sin(8*x1) >>>> sum(exp(linp)) >>>> dev1<-dev2<-sums<-rep(0,20) >>>> for (j in 1:20) >>>> { >>>> y<-rpois(1000,exp(linp)) >>>> sums[j]<-sum(y) >>>> m1<-gam(y~s(x1),family="poisson") >>>> m2<-gam(y~s(x1,k=20),family="poisson") >>>> dev1[j]=m1$dev >>>> dev2[j]=m2$dev >>>> } >>>> mean(sums) >>>> sd(sums) >>>> mean(dev1) >>>> sd(dev1) >>>> mean(dev2) >>>> sd(dev2) >>>> >>>> #dev slighly higher than expected >>>> >>>> linpa<--1+exp(-2*x1)*sin(8*x1) >>>> sum(exp(linpa)) >>>> dev1a<-dev2a<-sumsa<-rep(0,20) >>>> for (j in 1:20) >>>> { >>>> y<-rpois(1000,exp(linpa)) >>>> sumsa[j]<-sum(y) >>>> m1<-gam(y~s(x1),family="poisson"
Re: [R] mgcv: distribution of dev with Poisson data
thanks Simon also, it appears at least with ML that the default scale estimate with quasipoisson (i.e. using dev) is the scale which minimises the ML value of the fitted model. So it is the "best" model but doesn't actually give the correct mean-variance relation. Is that right? thanks again Greg > Greg, > > The deviance being chi^2 distributed on the residual degrees of freedom > works in some cases (mostly where the response itself can be reasonably > approximated as Gaussian), but rather poorly in others (noteably low > count data). This is what you are seeing in your simulations - in the > first case the Poisson mean is relatively high - so the normal > approximation to the Poisson is not too bad and the deviance is approx. > chi.sq on residual df. In the second case the Poisson mean is low, there > are lots of zeroes, and the approximation breaks down. So yes, the > Pearson estimator is probably a better bet in the latter case. > > best, > Simom > > ps. Note that the chi squared approximation for the *difference* in > deviance between two nested models does not suffer from this problem. > > > On 04/02/14 09:25, Greg Dropkin wrote: >> mgcv: distribution of dev >> >> hi >> >> I can't tell if this is a simple error. >> I'm puzzled by the distribution of dev when fitting a gam to Poisson >> generated data. >> I expected dev to be approximately chi-squared on residual d.f., i.e. >> about 1000 in each case below. >> In particular, the low values in the 3rd and 4th versions would suggest >> scale < 1, yet the data is Poisson generated. >> The problem isn't caused by insufficient k values in the smoother. >> Does this mean that with sparse data the distribution of dev is no >> longer >> approx chi-sq on residual df? >> Does it mean the scale estimate when fitting quasipoisson should be the >> Pearson version? >> >> thanks >> >> greg >> >> library(mgcv) >> set.seed(1) >> x1<-runif(1000) >> linp<-2+exp(-2*x1)*sin(8*x1) >> sum(exp(linp)) >> dev1<-dev2<-sums<-rep(0,20) >> for (j in 1:20) >> { >> y<-rpois(1000,exp(linp)) >> sums[j]<-sum(y) >> m1<-gam(y~s(x1),family="poisson") >> m2<-gam(y~s(x1,k=20),family="poisson") >> dev1[j]=m1$dev >> dev2[j]=m2$dev >> } >> mean(sums) >> sd(sums) >> mean(dev1) >> sd(dev1) >> mean(dev2) >> sd(dev2) >> >> #dev slighly higher than expected >> >> linpa<--1+exp(-2*x1)*sin(8*x1) >> sum(exp(linpa)) >> dev1a<-dev2a<-sumsa<-rep(0,20) >> for (j in 1:20) >> { >> y<-rpois(1000,exp(linpa)) >> sumsa[j]<-sum(y) >> m1<-gam(y~s(x1),family="poisson") >> m2<-gam(y~s(x1,k=20),family="poisson") >> dev1a[j]=m1$dev >> dev2a[j]=m2$dev >> } >> mean(sumsa) >> sd(sumsa) >> mean(dev1a) >> sd(dev1a) >> mean(dev2a) >> sd(dev2a) >> >> #dev a bit lower than expected >> >> linpb<--1.5+exp(-2*x1)*sin(8*x1) >> sum(exp(linpb)) >> dev1b<-dev2b<-sumsb<-rep(0,20) >> for (j in 1:20) >> { >> y<-rpois(1000,exp(linpb)) >> sumsb[j]<-sum(y) >> m1<-gam(y~s(x1),family="poisson") >> m2<-gam(y~s(x1,k=20),family="poisson") >> dev1b[j]=m1$dev >> dev2b[j]=m2$dev >> } >> mean(sumsb) >> sd(sumsb) >> mean(dev1b) >> sd(dev1b) >> mean(dev2b) >> sd(dev2b) >> >> #dev much lower than expected >> >> m1q<-gam(y~s(x1),family="quasipoisson",scale=-1) >> m1q$scale >> m1q$dev/(1000-sum(m1q$edf)) >> >> #scale estimate < 1 >> >> sum((y-fitted(m1q))^2/fitted(m1q))/(1000-sum(m1q$edf)) >> >> #Pearson estimate of scale closer, but also < 1 >> >> >> linpc<--2+exp(-2*x1)*sin(8*x1) >> sum(exp(linpc)) >> dev1c<-dev2c<-sumsc<-rep(0,20) >> for (j in 1:20) >> { >> y<-rpois(1000,exp(linpc)) >> sumsc[j]<-sum(y) >> m1<-gam(y~s(x1),family="poisson") >> m2<-gam(y~s(x1,k=20),family="poisson") >> dev1c[j]=m1$dev >> dev2c[j]=m2$dev >> } >> mean(sumsc) >> sd(sumsc) >> mean(dev1c) >> sd(dev1c) >> mean(dev2c) >> sd(dev2c) >> >> #dev much lower than expected >> >> m1q<-gam(y~s(x1),family="quasipoisson",scale=-1) >> m1q$scale >> m1q$sig2 >> m1q$dev/(1000-sum(m1q$edf)) >> >> #scale estimate < 1 >> >> sum((y-fitted(m1q))^2/fitted(m1q))/(1000-sum(m1q$edf)) >> >> #Pearson estimate of scale ok >> >> > > > -- > Simon Wood, Mathematical Science, University of Bath BA2 7AY UK > +44 (0)1225 386603 http://people.bath.ac.uk/sw283 > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mgcv: distribution of dev with Poisson data
mgcv: distribution of dev hi I can't tell if this is a simple error. I'm puzzled by the distribution of dev when fitting a gam to Poisson generated data. I expected dev to be approximately chi-squared on residual d.f., i.e. about 1000 in each case below. In particular, the low values in the 3rd and 4th versions would suggest scale < 1, yet the data is Poisson generated. The problem isn't caused by insufficient k values in the smoother. Does this mean that with sparse data the distribution of dev is no longer approx chi-sq on residual df? Does it mean the scale estimate when fitting quasipoisson should be the Pearson version? thanks greg library(mgcv) set.seed(1) x1<-runif(1000) linp<-2+exp(-2*x1)*sin(8*x1) sum(exp(linp)) dev1<-dev2<-sums<-rep(0,20) for (j in 1:20) { y<-rpois(1000,exp(linp)) sums[j]<-sum(y) m1<-gam(y~s(x1),family="poisson") m2<-gam(y~s(x1,k=20),family="poisson") dev1[j]=m1$dev dev2[j]=m2$dev } mean(sums) sd(sums) mean(dev1) sd(dev1) mean(dev2) sd(dev2) #dev slighly higher than expected linpa<--1+exp(-2*x1)*sin(8*x1) sum(exp(linpa)) dev1a<-dev2a<-sumsa<-rep(0,20) for (j in 1:20) { y<-rpois(1000,exp(linpa)) sumsa[j]<-sum(y) m1<-gam(y~s(x1),family="poisson") m2<-gam(y~s(x1,k=20),family="poisson") dev1a[j]=m1$dev dev2a[j]=m2$dev } mean(sumsa) sd(sumsa) mean(dev1a) sd(dev1a) mean(dev2a) sd(dev2a) #dev a bit lower than expected linpb<--1.5+exp(-2*x1)*sin(8*x1) sum(exp(linpb)) dev1b<-dev2b<-sumsb<-rep(0,20) for (j in 1:20) { y<-rpois(1000,exp(linpb)) sumsb[j]<-sum(y) m1<-gam(y~s(x1),family="poisson") m2<-gam(y~s(x1,k=20),family="poisson") dev1b[j]=m1$dev dev2b[j]=m2$dev } mean(sumsb) sd(sumsb) mean(dev1b) sd(dev1b) mean(dev2b) sd(dev2b) #dev much lower than expected m1q<-gam(y~s(x1),family="quasipoisson",scale=-1) m1q$scale m1q$dev/(1000-sum(m1q$edf)) #scale estimate < 1 sum((y-fitted(m1q))^2/fitted(m1q))/(1000-sum(m1q$edf)) #Pearson estimate of scale closer, but also < 1 linpc<--2+exp(-2*x1)*sin(8*x1) sum(exp(linpc)) dev1c<-dev2c<-sumsc<-rep(0,20) for (j in 1:20) { y<-rpois(1000,exp(linpc)) sumsc[j]<-sum(y) m1<-gam(y~s(x1),family="poisson") m2<-gam(y~s(x1,k=20),family="poisson") dev1c[j]=m1$dev dev2c[j]=m2$dev } mean(sumsc) sd(sumsc) mean(dev1c) sd(dev1c) mean(dev2c) sd(dev2c) #dev much lower than expected m1q<-gam(y~s(x1),family="quasipoisson",scale=-1) m1q$scale m1q$sig2 m1q$dev/(1000-sum(m1q$edf)) #scale estimate < 1 sum((y-fitted(m1q))^2/fitted(m1q))/(1000-sum(m1q$edf)) #Pearson estimate of scale ok __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gam and optim
just to clarify how I see the error, it was the mis-definition of the penalty term in the function dv. The following code corrects this error. What is actually being minimised at this step is the penalised deviance conditional on the smoothing parameter. A second issue is that the optim default ("Nelder-Mead") would have to be recycled several times to approximate the minimum obtained by gam, however, in this example, "BFGS" gets it straight away. sorry for all the confusion! greg set.seed(1) library(mgcv) x<-runif(100) lp<-exp(-2*x)*sin(8*x) y<-rpois(100,exp(lp)) m1<-gam(y~s(x),poisson) W<-diag(fitted(m1)) X<-predict(m1,type="lpmatrix") S<-m1$smooth[[1]]$S[[1]] S<-rbind(0,cbind(0,S)) A<-X%*%solve(t(X)%*%W%*%X+m1$sp*S)%*%t(X)%*%W sum(diag(A)) sum(m1$edf) fit<-fitted(m1) b<-m1$coef range(exp(X%*%b)-fit) z<-y/fit-1+X%*%b range(A%*%z-X%*%b) s<-m1$sp dv<-function(t) { f<-exp(X%*%t) -2*sum(y*log(f)-f-ifelse(y==0,0,y*log(y))+y)+s*t%*%S%*%t } dv(b) m1$dev+m1$sp*b%*%S%*%b t1<-optim(rep(0,10),dv,method="BFGS") t1$p b dv(t1$p) dv(b) fit1<-exp(X%*%t1$p) plot(x,y) points(x,exp(lp),pch=16,col="green3") points(x,fitted(m1),pch=16,cex=0.5,col="blue") points(x,fit1,cex=1.5,col="red") > please ignore this, I see the error. > > greg > >> hi >> >> probably a silly mistake, but I expected gam to minimise the penalised >> deviance. >> >> thanks >> >> greg >> >> set.seed(1) >> library(mgcv) >> x<-runif(100) >> lp<-exp(-2*x)*sin(8*x) >> y<-rpois(100,exp(lp)) >> plot(x,y) >> m1<-gam(y~s(x),poisson) >> points(x,exp(lp),pch=16,col="green3") >> points(x,fitted(m1),pch=16,cex=0.5,col="blue") >> W<-diag(fitted(m1)) >> X<-predict(m1,type="lpmatrix") >> S<-m1$smooth[[1]]$S[[1]] >> S<-rbind(0,cbind(0,S)) >> A<-X%*%solve(t(X)%*%W%*%X+m1$sp*S)%*%t(X)%*%W >> sum(diag(A)) >> sum(m1$edf) >> fit<-fitted(m1) >> b<-m1$coef >> range(exp(X%*%b)-fit) >> z<-y/fit-1+X%*%b >> range(A%*%z-X%*%b) >> >> dv<-function(t) >> { >> f<-exp(X%*%t) >> -2*sum(y*log(f)-f-ifelse(y==0,0,y*log(y))+y)+t%*%S%*%t >> } >> dv(b) >> m1$dev+b%*%S%*%b >> >> #so far, so good >> >> >> t1<-optim(rep(0,10),dv) >> t1$p >> b >> >> #different >> >> dv(t1$p) >> dv(b) >> >> #different, and dv(t1$p) is lower! >> >> fit1<-exp(X%*%t1$p) >> points(x,fit1,pch=16,cex=0.5,col="red") >> >> # different >> # gam found b which does approximate the true curve, but does not >> minimise >> the penalised deviance, by a long shot. >> >> > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] gam and optim
hi probably a silly mistake, but I expected gam to minimise the penalised deviance. thanks greg set.seed(1) library(mgcv) x<-runif(100) lp<-exp(-2*x)*sin(8*x) y<-rpois(100,exp(lp)) plot(x,y) m1<-gam(y~s(x),poisson) points(x,exp(lp),pch=16,col="green3") points(x,fitted(m1),pch=16,cex=0.5,col="blue") W<-diag(fitted(m1)) X<-predict(m1,type="lpmatrix") S<-m1$smooth[[1]]$S[[1]] S<-rbind(0,cbind(0,S)) A<-X%*%solve(t(X)%*%W%*%X+m1$sp*S)%*%t(X)%*%W sum(diag(A)) sum(m1$edf) fit<-fitted(m1) b<-m1$coef range(exp(X%*%b)-fit) z<-y/fit-1+X%*%b range(A%*%z-X%*%b) dv<-function(t) { f<-exp(X%*%t) -2*sum(y*log(f)-f-ifelse(y==0,0,y*log(y))+y)+t%*%S%*%t } dv(b) m1$dev+b%*%S%*%b #so far, so good t1<-optim(rep(0,10),dv) t1$p b #different dv(t1$p) dv(b) #different, and dv(t1$p) is lower! fit1<-exp(X%*%t1$p) points(x,fit1,pch=16,cex=0.5,col="red") # different # gam found b which does approximate the true curve, but does not minimise the penalised deviance, by a long shot. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gam and optim
please ignore this, I see the error. greg > hi > > probably a silly mistake, but I expected gam to minimise the penalised > deviance. > > thanks > > greg > > set.seed(1) > library(mgcv) > x<-runif(100) > lp<-exp(-2*x)*sin(8*x) > y<-rpois(100,exp(lp)) > plot(x,y) > m1<-gam(y~s(x),poisson) > points(x,exp(lp),pch=16,col="green3") > points(x,fitted(m1),pch=16,cex=0.5,col="blue") > W<-diag(fitted(m1)) > X<-predict(m1,type="lpmatrix") > S<-m1$smooth[[1]]$S[[1]] > S<-rbind(0,cbind(0,S)) > A<-X%*%solve(t(X)%*%W%*%X+m1$sp*S)%*%t(X)%*%W > sum(diag(A)) > sum(m1$edf) > fit<-fitted(m1) > b<-m1$coef > range(exp(X%*%b)-fit) > z<-y/fit-1+X%*%b > range(A%*%z-X%*%b) > > dv<-function(t) > { > f<-exp(X%*%t) > -2*sum(y*log(f)-f-ifelse(y==0,0,y*log(y))+y)+t%*%S%*%t > } > dv(b) > m1$dev+b%*%S%*%b > > #so far, so good > > > t1<-optim(rep(0,10),dv) > t1$p > b > > #different > > dv(t1$p) > dv(b) > > #different, and dv(t1$p) is lower! > > fit1<-exp(X%*%t1$p) > points(x,fit1,pch=16,cex=0.5,col="red") > > # different > # gam found b which does approximate the true curve, but does not minimise > the penalised deviance, by a long shot. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mgcv summary.gam
hi I'm trying to understand (a little) the code behind summary.gam, and have the Biometrika article referred to in the Help. But am stuck early on. In the code, starting at line 167: if (est.disp) rdf <- residual.df else rdf <- -1 res <- testStat(p, Xt, V, df[i], type = p.type, res.df = rdf) 1) shouldn't there be some brackets in the if else statement? 2) what is testStat? thanks Greg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] REML - quasipoisson
hi the explanation seems to be that fix.family.ls is invoked to define a saturated log likelihood for quasi families, which means that the F2q term which I had thought to be undefined, is actually defined in this example as F2q<--500 * log(phiq)/2 The formula then becomes F1q-F2q+F3q-F4q which does coincide with m3$gcv Note that when phiq=1, the F2q term vanishes. greg > hi > > I'm puzzled as to the relation between the REML score computed by gam and > the formula (4) on p.4 here: > http://opus.bath.ac.uk/22707/1/Wood_JRSSB_2011_73_1_3.pdf > > I'm ok with this for poisson, or for quasipoisson when phi=1. > > However, when phi differs from 1, I'm stuck. > > #simulate some data > library(mgcv) > set.seed(1) > x1<-runif(500) > x2<-rnorm(500) > linp<--0.5+x1+exp(-x2^2/2)*sin(4*x2) > y<-rpois(500,exp(linp)) > > ##poisson > #phi=1 > m1<-gam(y~s(x1)+s(x2),family="poisson",method="REML") > phi<-m1$scale > > #formula > #1st term > S1<-m1$smooth[[1]]$S[[1]]*m1$sp[1] > S2<-m1$smooth[[2]]$S[[1]]*m1$sp[2] > S<-matrix(0,19,19) > for (i in 2:10) > { > for (j in 2:10) > { > S[i,j]=S1[i-1,j-1] > S[i+9,j+9]=S2[i-1,j-1] > } > } > beta<-m1$coef > #penalised deviance > Dp<-m1$dev+t(beta)%*%S%*%beta > F1<-Dp/(2*phi) > > #2nd term > F2<-sum(ifelse(y==0,0,y*log(y)-y-log(factorial(y > > #3rd term > X<-predict(m1,type="lpmatrix") > W<-diag(fitted(m1)) > H<-t(X)%*%W%*%X > ldhs<-determinant(H+S,log=TRUE)$modulus[1] > eigS<-eigen(S,only.values=TRUE)$val > lds<-sum(log(eigS[1:16])) > F3<-(ldhs-lds)/2 > > #4th term > Mp=3 > F4<-Mp/2*log(2*pi*phi) > > F1-F2+F3-F4 > m1$gcv > #reml score = formula > > ##quasipoisson with scale = 1 > #fitting is identical to the poisson case > #F1, F3, and F4 unchanged but F2 is now undefined > > m2<-gam(y~s(x1)+s(x2),family="quasipoisson",method="REML",scale=1) > F1+F3-F4 > m2$gcv > #reml score = formula with F2 omitted > > ##quasipoisson with unknown scale > m3<-gam(y~s(x1)+s(x2),family="quasipoisson",method="REML",scale=-1) > phiq<-m3$scale > > #1st term > S1q<-m3$smooth[[1]]$S[[1]]*m3$sp[1] > S2q<-m3$smooth[[2]]$S[[1]]*m3$sp[2] > Sq<-matrix(0,19,19) > for (i in 2:10) > { > for (j in 2:10) > { > Sq[i,j]=S1q[i-1,j-1] > Sq[i+9,j+9]=S2q[i-1,j-1] > } > } > betaq<-m3$coef > #penalised deviance > Dpq<-m3$dev+t(betaq)%*%Sq%*%betaq > F1q<-Dpq/(2*phiq) > > #2nd term undefined > > #3rd term > Xq<-predict(m3,type="lpmatrix") > Wq<-diag(fitted(m3)) > Hq<-t(Xq)%*%Wq%*%Xq > ldhsq<-determinant(Hq+Sq,log=TRUE)$modulus[1] > eigSq<-eigen(Sq,only.values=TRUE)$val > ldsq<-sum(log(eigSq[1:16])) > F3q<-(ldhsq-ldsq)/2 > > #4th term > Mp=3 > F4q<-Mp/2*log(2*pi*phiq) > > F1q+F3q-F4q > m3$gcv > > #quite different > > #but if phiq is replaced by the Pearson estimate of the scale > P<-sum((y-fitted(m3))^2/fitted(m3)) > phip<-P/(500-sum(m3$edf)) > F1p<-Dpq/(2*phip) > F3p<-F3q > #third term independent of scale > F4p<-Mp/2*log(2*pi*phip) > F1p+F3p-F4p > m3$gcv > > #closer but still wrong > > #is there a value of phi which makes this work? > f<-function(t) (Dpq/(2*t) + F3q + Mp/2*log(2*pi*t) - m3$gcv)^2 > optimise(f,interval=c(0.5,1.5))$min->phix > > Dpq/(2*phix) + F3q + Mp/2*log(2*pi*phix) > m3$gcv > #but what is phix - not the Pearson estimate or the scale returned by gam? > > thanks > > Greg Dropkin > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] REML - quasipoisson
hi I'm puzzled as to the relation between the REML score computed by gam and the formula (4) on p.4 here: http://opus.bath.ac.uk/22707/1/Wood_JRSSB_2011_73_1_3.pdf I'm ok with this for poisson, or for quasipoisson when phi=1. However, when phi differs from 1, I'm stuck. #simulate some data library(mgcv) set.seed(1) x1<-runif(500) x2<-rnorm(500) linp<--0.5+x1+exp(-x2^2/2)*sin(4*x2) y<-rpois(500,exp(linp)) ##poisson #phi=1 m1<-gam(y~s(x1)+s(x2),family="poisson",method="REML") phi<-m1$scale #formula #1st term S1<-m1$smooth[[1]]$S[[1]]*m1$sp[1] S2<-m1$smooth[[2]]$S[[1]]*m1$sp[2] S<-matrix(0,19,19) for (i in 2:10) { for (j in 2:10) { S[i,j]=S1[i-1,j-1] S[i+9,j+9]=S2[i-1,j-1] } } beta<-m1$coef #penalised deviance Dp<-m1$dev+t(beta)%*%S%*%beta F1<-Dp/(2*phi) #2nd term F2<-sum(ifelse(y==0,0,y*log(y)-y-log(factorial(y #3rd term X<-predict(m1,type="lpmatrix") W<-diag(fitted(m1)) H<-t(X)%*%W%*%X ldhs<-determinant(H+S,log=TRUE)$modulus[1] eigS<-eigen(S,only.values=TRUE)$val lds<-sum(log(eigS[1:16])) F3<-(ldhs-lds)/2 #4th term Mp=3 F4<-Mp/2*log(2*pi*phi) F1-F2+F3-F4 m1$gcv #reml score = formula ##quasipoisson with scale = 1 #fitting is identical to the poisson case #F1, F3, and F4 unchanged but F2 is now undefined m2<-gam(y~s(x1)+s(x2),family="quasipoisson",method="REML",scale=1) F1+F3-F4 m2$gcv #reml score = formula with F2 omitted ##quasipoisson with unknown scale m3<-gam(y~s(x1)+s(x2),family="quasipoisson",method="REML",scale=-1) phiq<-m3$scale #1st term S1q<-m3$smooth[[1]]$S[[1]]*m3$sp[1] S2q<-m3$smooth[[2]]$S[[1]]*m3$sp[2] Sq<-matrix(0,19,19) for (i in 2:10) { for (j in 2:10) { Sq[i,j]=S1q[i-1,j-1] Sq[i+9,j+9]=S2q[i-1,j-1] } } betaq<-m3$coef #penalised deviance Dpq<-m3$dev+t(betaq)%*%Sq%*%betaq F1q<-Dpq/(2*phiq) #2nd term undefined #3rd term Xq<-predict(m3,type="lpmatrix") Wq<-diag(fitted(m3)) Hq<-t(Xq)%*%Wq%*%Xq ldhsq<-determinant(Hq+Sq,log=TRUE)$modulus[1] eigSq<-eigen(Sq,only.values=TRUE)$val ldsq<-sum(log(eigSq[1:16])) F3q<-(ldhsq-ldsq)/2 #4th term Mp=3 F4q<-Mp/2*log(2*pi*phiq) F1q+F3q-F4q m3$gcv #quite different #but if phiq is replaced by the Pearson estimate of the scale P<-sum((y-fitted(m3))^2/fitted(m3)) phip<-P/(500-sum(m3$edf)) F1p<-Dpq/(2*phip) F3p<-F3q #third term independent of scale F4p<-Mp/2*log(2*pi*phip) F1p+F3p-F4p m3$gcv #closer but still wrong #is there a value of phi which makes this work? f<-function(t) (Dpq/(2*t) + F3q + Mp/2*log(2*pi*t) - m3$gcv)^2 optimise(f,interval=c(0.5,1.5))$min->phix Dpq/(2*phix) + F3q + Mp/2*log(2*pi*phix) m3$gcv #but what is phix - not the Pearson estimate or the scale returned by gam? thanks Greg Dropkin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mgcv: REML increase with new predictor
hi Simon this follows on from the example where gcv increased unexpectedly with increasing basis dimension. I'm now looking at t2 tensor splines with REML, and find that the REML score can increase when adding a new predictor. Again, this seems odd. thanks Greg library(mgcv) set.seed(0) #simulate some data x1<-runif(2000) x2<-rnorm(2000) x3<-rpois(2000,3) d<-runif(2000) linp<--5+x1+0.3*x2+4*cos(x1*x2) y<-rpois(2000,exp(linp)) #y does not depend on d table(y) sum(y) m0<-gam(y~t2(x1)+t2(x2)+t2(x1,x2),family="quasipoisson",method="REML",scale=-1) sc0<-m0$scale sc0 m0<-gam(as.formula(m0),family="quasipoisson",method="REML",scale=sc0) m1<-gam(update.formula(m0,~.+t2(d)),family="quasipoisson",method="REML",scale=sc0) #dev has decreased slightly, though not significantly m0$dev m1$dev #REML has increased, unexpected (to me) m0$gcv m1$gcv #does this go away if using full=TRUE? m0t<-gam(y~t2(x1,full=TRUE)+t2(x2,full=TRUE)+t2(x1,x2,full=TRUE),family="quasipoisson",method="REML",scale=-1) sc0t<-m0t$scale sc0t m0t<-gam(as.formula(m0t),family="quasipoisson",method="REML",scale=sc0t) m1t<-gam(update.formula(m0t,~.+t2(d,full=TRUE)),family="quasipoisson",method="REML",scale=sc0t) #dev has still decreased slightly m0t$dev m1t$dev #REML has still increased m0t$gcv m1t$gcv #how about te rather than t2? m0te<-gam(y~te(x1)+te(x2)+te(x1,x2),family="quasipoisson",method="REML",scale=-1) sc0te<-m0t$scale sc0te m0te<-gam(as.formula(m0t),family="quasipoisson",method="REML",scale=sc0t) m1te<-gam(update.formula(m0t,~.+te(d)),family="quasipoisson",method="REML",scale=sc0t) #dev has still decreased slightly m0te$dev m1te$dev #now REML decreases m0te$gcv m1te$gcv __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mgcv: increasing basis dimension
thanks Simon I'll upgrade R to try t2. The data I'm actually analysing requires scaled Poisson so I don't think REML is an option. thanks Greg On 14/02/12 11:22 Simon Wood wrote: That's interesting. Playing with the example, it doesn't seem to be a local minimum. I think that this happens because, although the higher rank basis contains the lower rank basis, the penalty can not simply suppress all the extra components in the higher rank basis and recover exactly what the lower rank basis gave: it's forced to include some of the extra stuff, even if heavily penalized, and this is what is degrading the higher rank fit in this case. t2 tensor product smooths seem to be less susceptible to this effect, and for reasons I don't understand so does REML based smoothness selection (gam(...,method="REML")) best, Simon > hi > > Using a ts or tprs basis, I expected gcv to decrease when increasing the > basis dimension, as I thought this would minimise gcv over a larger > subspace. But gcv increased. Here's an example. thanks for any comments. > > greg > > #simulate some data > set.seed(0) > x1<-runif(500) > x2<-rnorm(500) > x3<-rpois(500,3) > d<-runif(500) > linp<--1+x1+0.5*x2+0.3*exp(-2*d)*sin(10*d)*x3 > y<-rpois(500,exp(linp)) > sum(y) > > library(mgcv) > #basis dimension k=5 > m1<-gam(y~x1+x2+te(d,bs="ts")+te(x3,bs="ts")+te(d,x3,bs="ts"),family="poisson") > > #basis dimension k=10 > m2<-gam(y~x1+x2+te(d,bs="ts",k=10)+te(x3,bs="ts",k=10)+te(d,x3,bs="ts",k=10),family="poisson") > > #gcv increased > m1$gcv > m2$gcv > > summary(m1) > summary(m2) > > gam.check(m1) > gam.check(m2) > > > #is this due to bs="ts"? > > #basis dimension k=5 > m1tp<-gam(y~x1+x2+te(d,bs="tp")+te(x3,bs="tp")+te(d,x3,bs="tp"),family="poisson") > > #basis dimension k=10 > m2tp<-gam(y~x1+x2+te(d,bs="tp",k=10)+te(x3,bs="tp",k=10)+te(d,x3,bs="tp",k=10),family="poisson") > > m1tp$gcv > m2tp$gcv > > #no > > summary(m1tp) > summary(m2tp) > > gam.check(m1tp) > gam.check(m2tp) > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mgcv: increasing basis dimension
hi Using a ts or tprs basis, I expected gcv to decrease when increasing the basis dimension, as I thought this would minimise gcv over a larger subspace. But gcv increased. Here's an example. thanks for any comments. greg #simulate some data set.seed(0) x1<-runif(500) x2<-rnorm(500) x3<-rpois(500,3) d<-runif(500) linp<--1+x1+0.5*x2+0.3*exp(-2*d)*sin(10*d)*x3 y<-rpois(500,exp(linp)) sum(y) library(mgcv) #basis dimension k=5 m1<-gam(y~x1+x2+te(d,bs="ts")+te(x3,bs="ts")+te(d,x3,bs="ts"),family="poisson") #basis dimension k=10 m2<-gam(y~x1+x2+te(d,bs="ts",k=10)+te(x3,bs="ts",k=10)+te(d,x3,bs="ts",k=10),family="poisson") #gcv increased m1$gcv m2$gcv summary(m1) summary(m2) gam.check(m1) gam.check(m2) #is this due to bs="ts"? #basis dimension k=5 m1tp<-gam(y~x1+x2+te(d,bs="tp")+te(x3,bs="tp")+te(d,x3,bs="tp"),family="poisson") #basis dimension k=10 m2tp<-gam(y~x1+x2+te(d,bs="tp",k=10)+te(x3,bs="tp",k=10)+te(d,x3,bs="tp",k=10),family="poisson") m1tp$gcv m2tp$gcv #no summary(m1tp) summary(m2tp) gam.check(m1tp) gam.check(m2tp) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Cubic splines in package "mgcv"
re: Cubic splines in package "mgcv" I don't have access to Gu (2002) but clearly the function R(x,z) defined on p126 of Simon Wood's book is piecewise quartic, not piecewise cubic. Like Kunio Takezawa (below) I was puzzled by the word "cubic" on p126. As Simon Wood writes, this basis is not actually used by mgcv when specifying bs="cr". Maybe the point is that at the knot, this continuous function has continuous 1st and 2nd derivatives, but a discontinuous 3rd derivative, so in that sense it behaves like a cubic spline. Greg #using the code from p127 of Wood: #compare Wood Fig 3.4 (p125) #if the function were piecewise cubic the 3rd derivative would be piecewise constant rk<-function(x,z) { ((z-0.5)^2-1/12)*((x-0.5)^2-1/12)/4-((abs(x-z)-0.5)^4-(abs(x-z)-0.5)^2/2+7/240)/24 } par(mfrow=c(2,2)) u<-seq(0,1,by=0.001) plot(u,rk(u,5/6),main="function") plot(u[-1],1e3*diff(rk(u,5/6),differences=1),main="1st derivative") plot(u[-(1:2)],1e6*diff(rk(u,5/6),differences=2),main="2nd derivative") plot(u[-(1:3)],1e9*diff(rk(u,5/6),differences=3),main="3rd derivative") par(mfrow=c(1,1)) --- From: Simon Wood Date: Sun, 6 Jan 2008 16:59:35 + On Wednesday 26 December 2007 04:14, Kunio takezawa wrote: > R-users > E-mail: r-help at r-project.org > My understanding is that package "mgcv" is based on > "Generalized Additive Models: An Introduction with R (by Simon N. Wood)". > On the page 126 of this book, eq(3.4) looks a quartic equation with respect > to > "x", not a cubic equation. I am wondering if all routines which uses > cubic splines in mgcv are based on this quartic equation. --- No, `mgcv' does not use the basis given on page 126. See sections 4.1.2-4.1.8 of the same book for the bases used. > In my humble opinion, the '^4' in the first term > of the second line of this equation should be '^3'. --- Perhaps take a look at section 2.3.3 of Gu (2002) "Smoothing Spline ANOVA" for a bit more detail on this/ > > K. Takezawa -- > Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK > +44 1225 386603 www.maths.bath.ac.uk/~sw283 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gam plots and seWithMean
hadn't realised the answer would be in the source code! anyway, this appears to work. The only difference is in the last section. greg -- library(mgcv) #simulate some data x1<-runif(500) x2<-rnorm(500) x3<-rpois(500,3) d<-runif(500) t<-runif(500,20,50) linp<--6.5+x1+2*x2-x3+2*exp(-2*d)*sin(2*pi*d) lam<-t*exp(linp) y<-rpois(500,lam) sum(y) table(y) #fit the data without d by glm and with d by gam f1<-glm(y~offset(log(t))+x1+x2+x3,poisson) f2<-gam(update.formula(as.formula(f1),~.+s(d)),poisson) anova(f1,f2) summary(f2) plot(f2) #the solid line s(d) dat<-data.frame(t,d,x1,x2,x3) datn<-transform(dat,d=0) dif<-predict(f2)-predict(f2,datn) cdif<-dif-mean(dif) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) #another approach to the solid line s(d) devAskNewPage(ask=T) plot(f2) premat2<-PredictMat(f2$smooth[[1]],data=dat) dim(premat2) pars<-f2$coef pars2<-pars[5:13] pars2<-as.matrix(pars2,9,1) pars2 points(d,premat2%*%pars2,cex=0.5,col=rgb(0,0.6,0.3,0.2)) #premat2%*%pars2 = cdif #confidence intervals when seWithMean = FALSE devAskNewPage(ask=T) plot(f2) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) Vp2<-f2$Vp[5:13,5:13] se2<-sqrt(diag(premat2%*%Vp2%*%t(premat2))) points(d,cdif+qnorm(0.975)*se2,cex=0.5,col=rgb(1,0,0,0.2)) points(d,cdif-qnorm(0.975)*se2,cex=0.5,col=rgb(0,0,1,0.2)) #numerical output for the confidence bands is given by #cdif+qnorm(0.975)*se2 #cdif-qnorm(0.975)*se2 #confidence intervals when seWithMean = TRUE devAskNewPage(ask=T) plot(f2,seWithMean=T) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) premat<-cbind(rep(1,500),x1,x2,x3,premat2) pars<-as.matrix(pars,13,1) range(predict.gam(f2)-log(t)-as.numeric(premat%*%pars)) #so predict.gam(f2) = log(t) + as.numeric(premat%*%pars) sew<-sqrt(diag(premat%*%f2$Vp%*%t(premat))) range(sew-predict(f2,se.fit=T)$se.fit) #sew = predict(f2,se.fit=T)$se.fit points(d,cdif+qnorm(0.975)*sew,cex=0.5,col=rgb(1,0,0,0.2)) points(d,cdif-qnorm(0.975)*sew,cex=0.5,col=rgb(0,0,1,0.2)) #so this is not what plot.gam is doing #but this is devAskNewPage(ask=T) plot(f2,seWithMean=T) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) premat<-cbind(rep(1,500),x1,x2,x3,premat2) pars<-as.matrix(pars,13,1) premat1<-premat premat1[,1:4]<-rep(f2$cmX[1:4],each=500) sew1<-sqrt(diag(premat1%*%f2$Vp%*%t(premat1))) points(d,cdif+qnorm(0.975)*sew1,cex=0.5,col=rgb(1,0,0,0.2)) points(d,cdif-qnorm(0.975)*sew1,cex=0.5,col=rgb(0,0,1,0.2)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] gam plots and seWithMean
hello I'm learning mgcv and would like to obtain numerical output corresponding to plot.gam. I can do so when seWithMean=FALSE (the default) but only approximately when seWithMean=TRUE. Can anyone show how to obtain the exact values? Alternatively, can you clarify the explanation in the manual "Note that, if seWithMean=TRUE, the confidence bands include the uncertainty about the overall mean. In other words although each smooth is shown centred, the confidence bands are obtained as if every other term in the model was constrained to have average 0 (average taken over the covariate values), except for the smooth concerned". an example with a poisson gam (re-run this a few times as the plots can vary significantly) --- library(mgcv) #simulate some data x1<-runif(500) x2<-rnorm(500) x3<-rpois(500,3) d<-runif(500) t<-runif(500,20,50) linp<--6.5+x1+2*x2-x3+2*exp(-2*d)*sin(2*pi*d) lam<-t*exp(linp) y<-rpois(500,lam) sum(y) table(y) #fit the data without d by glm and with d by gam f1<-glm(y~offset(log(t))+x1+x2+x3,poisson) f2<-gam(update.formula(as.formula(f1),~.+s(d)),poisson) anova(f1,f2) summary(f2) plot(f2) #the solid line s(d) dat<-data.frame(t,d,x1,x2,x3) datn<-transform(dat,d=0) dif<-predict(f2)-predict(f2,datn) cdif<-dif-mean(dif) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) #another approach to the solid line s(d) devAskNewPage(ask=T) plot(f2) premat2<-PredictMat(f2$smooth[[1]],data=dat) dim(premat2) pars<-f2$coef pars2<-pars[5:13] pars2<-as.matrix(pars2,9,1) pars2 points(d,premat2%*%pars2,cex=0.5,col=rgb(0,0.6,0.3,0.2)) #premat2%*%pars2 = cdif #confidence intervals when seWithMean = FALSE devAskNewPage(ask=T) plot(f2) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) Vp2<-f2$Vp[5:13,5:13] se2<-sqrt(diag(premat2%*%Vp2%*%t(premat2))) points(d,cdif+qnorm(0.975)*se2,cex=0.5,col=rgb(1,0,0,0.2)) points(d,cdif-qnorm(0.975)*se2,cex=0.5,col=rgb(0,0,1,0.2)) #numerical output for the confidence bands is given by #cdif+qnorm(0.975)*se2 #cdif-qnorm(0.975)*se2 #confidence intervals when seWithMean = TRUE devAskNewPage(ask=T) plot(f2,seWithMean=T) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) premat<-cbind(rep(1,500),x1,x2,x3,premat2) pars<-as.matrix(pars,13,1) range(predict.gam(f2)-log(t)-as.numeric(premat%*%pars)) #so predict.gam(f2) = log(t) + as.numeric(premat%*%pars) sew<-sqrt(diag(premat%*%f2$Vp%*%t(premat))) range(sew-predict(f2,se.fit=T)$se.fit) #sew = predict(f2,se.fit=T)$se.fit points(d,cdif+qnorm(0.975)*sew,cex=0.5,col=rgb(1,0,0,0.2)) points(d,cdif-qnorm(0.975)*sew,cex=0.5,col=rgb(0,0,1,0.2)) #sort of #smoothing the sew estimates devAskNewPage(ask=T) plot(f2,seWithMean=T) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) lines(smooth.spline(cdif+qnorm(0.975)*sew~d),col="red") lines(smooth.spline(cdif-qnorm(0.975)*sew~d),col="blue") points(smooth.spline(cdif+qnorm(0.975)*sew~d)$x,smooth.spline(cdif+qnorm(0.975)*sew~d)$y,cex=0.8,col=rgb(1,0,0,0.1)) points(smooth.spline(cdif+qnorm(0.975)*sew~d)$x,smooth.spline(cdif-qnorm(0.975)*sew~d)$y,cex=0.8,col=rgb(0,0,1,0.1)) #close, but not exact #numerical values corresponding to sort(d) are extracted as #smooth.spline(cdif+qnorm(0.975)*sew~d)$y #smooth.spline(cdif-qnorm(0.975)*sew~d)$y #smoothing sew with 93% confidence levels devAskNewPage(ask=T) plot(f2,seWithMean=T) points(d,cdif,cex=0.5,col=rgb(0,1,0,0.2)) lines(smooth.spline(cdif+qnorm(0.965)*sew~d),col="red") lines(smooth.spline(cdif-qnorm(0.965)*sew~d),col="blue") points(smooth.spline(cdif+qnorm(0.965)*sew~d)$x,smooth.spline(cdif+qnorm(0.965)*sew~d)$y,cex=0.8,col=rgb(1,0,0,0.1)) points(smooth.spline(cdif+qnorm(0.965)*sew~d)$x,smooth.spline(cdif-qnorm(0.965)*sew~d)$y,cex=0.8,col=rgb(0,0,1,0.1)) #possibly closer, but not exact #numerical values corresponding to sort(d) are extracted as #smooth.spline(cdif+qnorm(0.965)*sew~d)$y #smooth.spline(cdif-qnorm(0.965)*sew~d)$y __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] matrix^(-1/2)
re [R] matrix^(-1/2) re the discussion in November on this thread. I don't know about expm but the problem must be equivalent to solve(B^(1/2)) and a solution will exist iff B is invertible and has a square root A with A%*%A = B. For 2x2 matrices necessary and sufficient conditions for B to have a square root are that either B = diag(0,2) or B%*%B != diag(0,2). This follows from the fact that B%*%B - t(B)*B + det(B) = 0 and A%*%A - t(A)*A + det(A) = 0. Many non-symmetric B satisfy B%*%B != diag(0,2) and many of them are invertible. e.g. all the rotations. If expm has a function, perhaps Exp, which exponentiates matrices using matrix multiplication, then if B = Exp(C), B^(-1/2) = Exp(-C/2) would be a solution. There is an open neighbourhood of the identity matrix in which B = Exp(C) must hold. Since t(B) = Exp(t(C)), such B will not in general be symmetric. If N satisfies N%*%N=0, then B = I+N where I is the identity matrix will have B^(-1/2) = 1-N/2. Such B are not symmetric as N cannot be symmetric (for non-zero N). So, for B^(-1/2) to exist, B must be invertible but need not be symmetric. hope that helps Greg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] scaled Schoenfeld residuals
hi thanks, I see that cox.zph is plotting and smoothing the "scaled Schoenfeld" residuals as generated by R, but since the term is already in the literature with a formula, maybe the help should clarify the offset. I found it confusing anyway. thanks for help greg Thomas Lumley tlumley at u.washington.edu Thu Sep 24 16:18:17 CEST 2009 Previous message: [R] scaled Schoenfeld residuals Next message: [R] generate random number without repetition Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] On Wed, 23 Sep 2009, Greg Dropkin wrote: > hi > > sorry if this has been discussed before, but I'm wondering why the scaled > Schoenfeld residuals do not follow the defining formula for obtaining them > from the ordinary Schoenfeld residuals, but are instead offset by the > estimated parameter values. > Because their purpose in life is to be smoothed against time to get an estimate of the parameter as a function of time (plot.cox.zph). -thomas Thomas Lumley Assoc. Professor, Biostatistics tlumley at u.washington.edu University of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] scaled Schoenfeld residuals
hi sorry if this has been discussed before, but I'm wondering why the scaled Schoenfeld residuals do not follow the defining formula for obtaining them from the ordinary Schoenfeld residuals, but are instead offset by the estimated parameter values. e.g. library(survival) attach(ovarian) sv<-Surv(futime,fustat) f1<-coxph(sv~age+ecog.ps) f1 schres<-resid(f1,type="schoenfeld") schresc<-resid(f1,type="scaledsch") n=sum(fustat) V<-f1$var schresc1<-t(n*V%*%t(schres)) #schresc1 is how the scaled Schoenfeld residuals are defined #in terms of the number of events #variance of the parameter estimates, #and ordinary Schoenfeld residuals #but schresc1 and schresc differ schresc schresc1 #schresc is schresc1 offset by the parameter estimates beta<-as.vector(f1$coef) nbeta<-outer(rep(1,n),beta) nbeta schresc-nbeta schresc1 #is there a reason for the offset #or am I missing something? thanks Greg Greg Dropkin gr...@gn.apc.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] score test statistic in logistic regression
re post from bkelcey at umich.edu bkelcey at umich.edu Wed Feb 27 15:09:48 CET 2008 If the response y is given as the proportion of successes out of n trials, and y, n, p, x, and z are vectors of length M, and the model is logit(p) = b0 + b1*x + b2*z then for the score test for the null hypothesis b1=0 use des<-array(c(rep(1,M),x,z),dim=c(M,3)) m0<-glm(y~z,binomial,weights=n) f<-fitted(m0) efscor<-1:3 for (i in 1:3) { efscor[i]<-sum((y-f)*n*des[,i]) } fim<-outer(1:3,1:3) for (i in 1:3) { for (j in 1:3) { fim[i,j]<-sum(des[,i]*des[,j]*n*f*(1-f)) } } score<-crossprod(efscor,crossprod(solve(fim),efscor)) score ok? greg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.