Dear Rune, Thank you for your prompt reply and it looks like the ordinal package could be the answer I was looking for!
If you don't mind, I'd also like to know please what to do if the tests show the proportional odds assumption is NOT met. (Unfortunately I notice effects from almost all variables that breach the proportional odds assumption in my dataset) Would you recommend a multinomial logistic model? Or re-scaling of the data? Thank you for your time, Best wishes, Charlie On 26 Nov 2014, at 14:08, Rune Haubo <rune.ha...@gmail.com> wrote: > Dear Charlie, > > I admit that I haven't read your email closely, but here is a way to > test for non-proportional odds using the ordinal package (warning: > self-promotion) using the wine data set also from the ordinal package. > There is more information in the package vignettes > > Hope this is something you can use. > Cheers, > Rune > >> library(ordinal) >> ## Fit model: >> fm <- clm(rating ~ temp + contact, data=wine) >> summary(fm) > formula: rating ~ temp + contact > data: wine > > link threshold nobs logLik AIC niter max.grad cond.H > logit flexible 72 -86.49 184.98 6(0) 4.64e-15 2.7e+01 > > Coefficients: > Estimate Std. Error z value Pr(>|z|) > tempwarm 2.5031 0.5287 4.735 2.19e-06 *** > contactyes 1.5278 0.4766 3.205 0.00135 ** > --- > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > > Threshold coefficients: > Estimate Std. Error z value > 1|2 -1.3444 0.5171 -2.600 > 2|3 1.2508 0.4379 2.857 > 3|4 3.4669 0.5978 5.800 > 4|5 5.0064 0.7309 6.850 >> ## Model with non-proportional odds for contact: >> fm2 <- clm(rating ~ temp, nominal=~contact, data=wine) >> ## Likelihood ratio test of non-proportional odds: >> anova(fm, fm2) > Likelihood ratio tests of cumulative link models: > > formula: nominal: link: threshold: > fm rating ~ temp + contact ~1 logit flexible > fm2 rating ~ temp ~contact logit flexible > > no.par AIC logLik LR.stat df Pr(>Chisq) > fm 6 184.98 -86.492 > fm2 9 190.42 -86.209 0.5667 3 0.904 >> ## Automatic tests of non-proportional odds for all varibles: >> nominal_test(fm) > Tests of nominal effects > > formula: rating ~ temp + contact > Df logLik AIC LRT Pr(>Chi) > <none> -86.492 184.98 > temp 3 -84.904 187.81 3.1750 0.3654 > contact 3 -86.209 190.42 0.5667 0.9040 > > On 25 November 2014 at 17:21, Charlotte Whitham > <charlotte.whit...@gmail.com> wrote: >> Dear list, >> >> I have used the ‘polr’ function in the MASS package to run an ordinal >> logistic regression for an ordinal categorical response variable with 15 >> continuous explanatory variables. >> I have used the code (shown below) to check that my model meets the >> proportional odds assumption following advice provided at >> (http://www.ats.ucla.edu/stat/r/dae/ologit.htm) – which has been extremely >> helpful, thank you to the authors! However, I’m a little worried about the >> output implying that not only are the coefficients across various cutpoints >> similar, but they are exactly the same (see graphic below). >> >> Here is the code I used (and see attached for the output graphic) >> >> FGV1b<-data.frame(FG1_val_cat=factor(FGV1b[,"FG1_val_cat"]),scale(FGV1[,c("X","Y","Slope","Ele","Aspect","Prox_to_for_FG","Prox_to_for_mL","Prox_to_nat_border","Prox_to_village","Prox_to_roads","Prox_to_rivers","Prox_to_waterFG","Prox_to_watermL","Prox_to_core","Prox_to_NR","PCA1","PCA2","PCA3")])) >> >> b<-polr(FGV1b$FG1_val_cat ~ FGV1b$X + FGV1b$Y + FGV1b$Slope + FGV1b$Ele + >> FGV1b$Aspect + FGV1b$Prox_to_for_FG + FGV1b$Prox_to_for_mL + >> FGV1b$Prox_to_nat_border + FGV1b$Prox_to_village + FGV1b$Prox_to_roads + >> FGV1b$Prox_to_rivers + FGV1b$Prox_to_waterFG + FGV1b$Prox_to_watermL + >> FGV1b$Prox_to_core + FGV1b$Prox_to_NR, data = FGV1b, Hess=TRUE) >> >> #Checking the assumption. So the following code will estimate the values to >> be graphed. First it shows us #the logit transformations of the >> probabilities of being greater than or equal to each value of the target >> #variable >> >> FGV1b$FG1_val_cat<-as.numeric(FGV1b$FG1_val_cat) >> >> sf <- function(y) { >> >> c('VC>=1' = qlogis(mean(FGV1b$FG1_val_cat >= 1)), >> >> 'VC>=2' = qlogis(mean(FGV1b$FG1_val_cat >= 2)), >> >> 'VC>=3' = qlogis(mean(FGV1b$FG1_val_cat >= 3)), >> >> 'VC>=4' = qlogis(mean(FGV1b$FG1_val_cat >= 4)), >> >> 'VC>=5' = qlogis(mean(FGV1b$FG1_val_cat >= 5)), >> >> 'VC>=6' = qlogis(mean(FGV1b$FG1_val_cat >= 6)), >> >> 'VC>=7' = qlogis(mean(FGV1b$FG1_val_cat >= 7)), >> >> 'VC>=8' = qlogis(mean(FGV1b$FG1_val_cat >= 8))) >> >> } >> >> (t <- with(FGV1b, summary(as.numeric(FGV1b$FG1_val_cat) ~ FGV1b$X + FGV1b$Y >> + FGV1b$Slope + FGV1b$Ele + FGV1b$Aspect + FGV1b$Prox_to_for_FG + >> FGV1b$Prox_to_for_mL + FGV1b$Prox_to_nat_border + FGV1b$Prox_to_village + >> FGV1b$Prox_to_roads + FGV1b$Prox_to_rivers + FGV1b$Prox_to_waterFG + >> FGV1b$Prox_to_watermL + FGV1b$Prox_to_core + FGV1b$Prox_to_NR, fun=sf))) >> >> >> >> #The table displays the (linear) predicted values we would get if we >> regressed our >> >> #dependent variable on our predictor variables one at a time, without the >> parallel slopes >> >> #assumption. So now, we can run a series of binary logistic regressions with >> varying cutpoints >> >> #on the dependent variable to check the equality of coefficients across >> cutpoints >> >> par(mfrow=c(1,1)) >> >> plot(t, which=1:8, pch=1:8, xlab='logit', main=' ', xlim=range(s[,7:8])) >> >> >> >> Apologies that I am no statistics expert and perhaps I am missing something >> obvious here. However, I have spent a long time trying to figure out if >> there is a problem in how I tested the model assumption and also trying to >> figure out other ways to run the same kind of model. >> >> For example, I read in many help mailing lists that others use the vglm >> function (in the VGAM package) and the lrm function (in the rms package) >> (for example see here: >> http://stats.stackexchange.com/questions/25988/proportional-odds-assumption-in-ordinal-logistic-regression-in-r-with-the-packag). >> I have tried to run the same models but am continuously coming up against >> warnings and errors. >> >> For example, when I try to fit the vglm model with the ‘parallel=FALSE’ >> argument (as the previous link mentions is important for testing the >> proportional odds assumption), I encounter the following error: >> >> >> >> Error in lm.fit(X.vlm, y = z.vlm, ...) : NA/NaN/Inf in 'y' >> >> In addition: Warning message: >> >> In Deviance.categorical.data.vgam(mu = mu, y = y, w = w, residuals = >> residuals, : >> >> fitted values close to 0 or 1 >> >> >> >> And after many searches for help, I can’t seem to find a way to fix this >> problem. >> >> I would like to ask please if there is anyone who might understand and be >> able to explain to me why the graph I produced above looks as it does. If >> indeed it means that something isn’t right, could you please help me find a >> way to test the proportional odds assumption when just using the polr >> function. Or if that is just not possible, then I will resort to trying to >> use the vglm function, but would then need some help to explain why I keep >> getting the error given above. >> >> I hope this is clear. Please do let me know if I should provide some more >> information that would help address this query. >> >> NOTE: As a background, there are 1000 datapoints here, which are actually >> location points across a study area. I am looking to see if there are any >> relationships between the categorical response variable and these 15 >> explanatory variables. All of those 15 explanatory variables are spatial >> characteristics (for example, elevation, x-y coordinates, proximity to >> forest etc.). The 1000 datapoints were randomly allocated using a GIS, but I >> took a stratified sampling approach. I made sure that 125 points were >> randomly chosen within each of the 8 different categorical response levels. >> I hope this information is also helpful. >> >> I am extremely grateful to anyone who could please give me some guidance >> with this. >> >> Thank you very much for your time, >> >> Charlie >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.