On Wed, 13 Aug 2008, Birgitle wrote:

> I am trying to use mob() with my data.frame ('data.frame': 288 obs. of 81
> variables; factors, numerics and ordered factors).
> My response is a binary variable, so I should model it with a logistic
> regression (family=binomial).
>
> I read in the "MOB" vignette that I could use a formula like this if I
> want only partitioning variables apart from the response:
>
> Test.mob<-mob(Resp~1|Var1+Var2+...., data=dataframe, model=glinearModel,
> family=binomial())

This works for me. Consider an easily reproducible example: classifying just two (out of three) species in the iris data.

library("party")
iris2 <- iris[-(1:50), ]
iris2$Species <- factor(iris2$Species)
mb <- mob(Species ~ 1 | Petal.Length + Petal.Width + Sepal.Length +
   Sepal.Width, data = iris2, model = glinearModel, family = binomial())

and this runs fine, just selecting a single split

R> mb
1) Petal.Width <= 1.7; criterion = 1, statistic = 81.818
   2)*  weights = 54
Terminal node model
Binomial GLM with coefficients:
(Intercept)
      -2.282

1) Petal.Width > 1.7
   3)*  weights = 46
Terminal node model
Binomial GLM with coefficients:
(Intercept)
       3.807
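
As a cross-check on the printed coefficients, here is a minimal base-R sketch (it does not use party) that refits the two intercept-only logistic models by hand on the subsets defined by the split above. The names left, fm_left, fm_right, p_left and p_right are introduced here for illustration only. On the logit scale, each intercept is simply the log-odds of the second factor level (virginica) within that node.

```r
## Two-species subset, as in the example above
iris2 <- iris[-(1:50), ]
iris2$Species <- factor(iris2$Species)

## Illustrative indicator: TRUE for observations falling into node 2
left <- iris2$Petal.Width <= 1.7

## Intercept-only binomial GLMs, fitted separately in each node
fm_left  <- glm(Species ~ 1, data = iris2[left, ],  family = binomial())
fm_right <- glm(Species ~ 1, data = iris2[!left, ], family = binomial())

## Node sizes and intercepts (log-odds of "virginica")
table(left)
round(c(coef(fm_left), coef(fm_right)), 3)

## Back-transform the intercepts to fitted probabilities of "virginica"
p_left  <- plogis(coef(fm_left))
p_right <- plogis(coef(fm_right))
```

The node sizes should match the weights 54 and 46 reported by mob(), and the intercepts should match -2.282 and 3.807.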

> but this gives me back an error message:
>
> Error in `[.data.frame`(x, r, vars, drop = drop) :
>  undefined columns selected
>
> But Var1, Var2 and Resp are in my dataframe. Why do I get this error?

More importantly, when do you get this error? My guess is that this is during plotting, right?

If so, then the problem is that the plot() method for "mob" objects by default calls node_bivplot() in each terminal node, which is designed for generating partial regressor plots. That does not make sense in this situation because you don't have any regressors in the terminal nodes.
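
The message itself is the generic one that data-frame indexing raises when asked for a column that is not present. The base-R sketch below reproduces it; the column name "NoSuchRegressor" is made up for illustration, and this is not party's internal code. In an interactive session, calling traceback() right after the failure shows which function triggered it.

```r
## Selecting a non-existent column from a data frame raises the same message
msg <- tryCatch(
  iris[, c("Species", "NoSuchRegressor")],
  error = function(e) conditionMessage(e)
)
msg
```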

We haven't got a panel function for the type of model you are looking at but I've just hacked a simple one that should be sufficient for your purposes. It is essentially like node_barplot() but exploits the binomial model. It is attached below. With this you can do
   plot(mb, terminal_panel = myplot, tnex = 2)

> I am also wondering how I can find out which variables I should use for
> partitioning and which for modelling?

Variables for which a linear specification makes sense (at least within each component) should be included in the model. Variables for which it is not clear a priori what a useful parametric specification would be should be used as partitioning variables.
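
To make the two roles concrete: in a formula of the form y ~ x1 + ... | z1 + ..., the terms before the bar are the regressors of the node models and the terms after it are the partitioning variables. The base-R sketch below only takes such a formula apart. split_roles is a hypothetical helper, not part of party, and picking Sepal.Length as the regressor is purely illustrative.

```r
## Hypothetical helper: report which variables play which role in a
## formula of the form  response ~ regressors | partitioning variables
split_roles <- function(f) {
  rhs <- f[[3]]
  stopifnot(is.call(rhs), identical(rhs[[1]], as.name("|")))
  list(response     = all.vars(f[[2]]),
       regressors   = all.vars(rhs[[2]]),
       partitioning = all.vars(rhs[[3]]))
}

## Only partitioning variables (the formula used in the iris example)
f1 <- Species ~ 1 | Petal.Length + Petal.Width + Sepal.Length + Sepal.Width
## One regressor in the node models, the rest used for partitioning
f2 <- Species ~ Sepal.Length | Petal.Length + Petal.Width + Sepal.Width

roles1 <- split_roles(f1)
roles2 <- split_roles(f2)
roles2
```

With party loaded, f2 could be passed to mob() in place of the formula used in the iris example; that call is not run here.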

> There are correlations between some variables in my dataframe. Would it be
> possible to always use one variable of each correlated pair for
> partitioning and the other for modelling?

You can do that, but you could also do other combinations. That probably depends on your application.
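
One way to see which pairs are affected is to list the strongly correlated numeric variables first. The base-R sketch below does this for the iris subset. correlated_pairs and the cutoff of 0.8 are assumptions made for illustration, and factors are simply ignored.

```r
## Hypothetical helper: list pairs of numeric columns whose absolute
## correlation reaches the cutoff (each pair is reported once)
correlated_pairs <- function(df, cutoff = 0.8) {
  num <- df[vapply(df, is.numeric, logical(1))]
  cm  <- cor(num, use = "pairwise.complete.obs")
  idx <- which(abs(cm) >= cutoff & upper.tri(cm), arr.ind = TRUE)
  data.frame(var1 = rownames(cm)[idx[, "row"]],
             var2 = colnames(cm)[idx[, "col"]],
             cor  = cm[idx],
             stringsAsFactors = FALSE)
}

iris2 <- iris[-(1:50), ]
pairs_found <- correlated_pairs(iris2)
pairs_found
```

For each pair found, one variable could then go before the bar in the formula and the other after it, or both could be used for partitioning.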

hth,
Z

myplot <- function(ctreeobj,
                   col = "black",
                   fill = NULL,
                   beside = NULL,
                   ymax = NULL,
                   ylines = NULL,
                   widths = 1,
                   gap = NULL,
                   reverse = NULL,
                   id = TRUE)
{
     getMaxPred <- function(x) {
       mp <- max(x$prediction)
       mpl <- ifelse(x$terminal, 0, getMaxPred(x$left))
       mpr <- ifelse(x$terminal, 0, getMaxPred(x$right))
       return(max(c(mp, mpl, mpr)))
     }

     y <- response(ctreeobj)[[1]]

     if(is.factor(y) || inherits(y, "was_ordered")) {
         ylevels <- levels(y)
        if(is.null(beside)) beside <- if(length(ylevels) < 3) FALSE else TRUE
         if(is.null(ymax)) ymax <- if(beside) 1.1 else 1
        if(is.null(gap)) gap <- if(beside) 0.1 else 0
     } else {
         if(is.null(beside)) beside <- FALSE
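         ## NOTE: the list archive masked the next two expressions; they are
         ## restored following party's node_barplot() and are an assumption
         if(is.null(ymax)) ymax <- getMaxPred(ctreeobj@tree) * 1.1
         ylevels <- seq(along = ctreeobj@tree$prediction)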
         if(length(ylevels) < 2) ylevels <- ""
        if(is.null(gap)) gap <- 1
     }
     if(is.null(reverse)) reverse <- !beside
     if(is.null(fill)) fill <- gray.colors(length(ylevels))
     if(is.null(ylines)) ylines <- if(beside) c(3, 2) else c(1.5, 2.5)

     ### panel function for barplots in nodes
     rval <- function(node) {

         ## parameter setup
        fm <- node$model
         pred <- fm$family$linkinv(coef(fm))
        if(reverse) {
          pred <- rev(pred)
          ylevels <- rev(ylevels)
        }
         np <- length(pred)
        nc <- if(beside) np else 1

        fill <- rep(fill, length.out = np)
         widths <- rep(widths, length.out = nc)
        col <- rep(col, length.out = nc)
        ylines <- rep(ylines, length.out = 2)

        gap <- gap * sum(widths)
         yscale <- c(0, ymax)
         xscale <- c(0, sum(widths) + (nc+1)*gap)

         top_vp <- viewport(layout = grid.layout(nrow = 2, ncol = 3,
                            widths = unit(c(ylines[1], 1, ylines[2]),
                                          c("lines", "null", "lines")),
                            heights = unit(c(1, 1), c("lines", "null"))),
                            width = unit(1, "npc"),
                            height = unit(1, "npc") - unit(2, "lines"),
                            name = paste("node_barplot", node$nodeID, sep = ""))

         pushViewport(top_vp)
         grid.rect(gp = gpar(fill = "white", col = 0))

         ## main title
         top <- viewport(layout.pos.col=2, layout.pos.row=1)
         pushViewport(top)
         mainlab <- paste(ifelse(id, paste("Node", node$nodeID, "(n = "), "n = "),
                          sum(node$weights), ifelse(id, ")", ""), sep = "")
         grid.text(mainlab)
         popViewport()

         plot <- viewport(layout.pos.col=2, layout.pos.row=2,
                          xscale=xscale, yscale=yscale,
                         name = paste("node_barplot", node$nodeID, "plot",
                          sep = ""))

         pushViewport(plot)

        if(beside) {
          xcenter <- cumsum(widths+gap) - widths/2
          for (i in 1:np) {
             grid.rect(x = xcenter[i], y = 0, height = pred[i],
                       width = widths[i],
                      just = c("center", "bottom"), default.units = "native",
                      gp = gpar(col = col[i], fill = fill[i]))
          }
           if(length(xcenter) > 1) grid.xaxis(at = xcenter, label = FALSE)
          grid.text(ylevels, x = xcenter, y = unit(-1, "lines"),
                     just = c("center", "top"),
                    default.units = "native", check.overlap = TRUE)
           grid.yaxis()
        } else {
          ycenter <- cumsum(pred) - pred

          for (i in 1:np) {
             grid.rect(x = xscale[2]/2, y = ycenter[i],
                       height = min(pred[i], ymax - ycenter[i]),
                       width = widths[1],
                       just = c("center", "bottom"), default.units = "native",
                       gp = gpar(col = col[i], fill = fill[i]))
          }
           if(np > 1) {
            grid.text(ylevels[1], x = unit(-1, "lines"), y = 0,
                       just = c("left", "center"), rot = 90,
                      default.units = "native", check.overlap = TRUE)
            grid.text(ylevels[np], x = unit(-1, "lines"), y = ymax,
                       just = c("right", "center"), rot = 90,
                      default.units = "native", check.overlap = TRUE)
          }
           if(np > 2) {
            grid.text(ylevels[-c(1,np)], x = unit(-1, "lines"),
                       y = ycenter[-c(1,np)],
                       just = "center", rot = 90,
                       default.units = "native", check.overlap = TRUE)
          }
           grid.yaxis(main = FALSE)
        }

         grid.rect(gp = gpar(fill = "transparent"))
         upViewport(2)
     }

     return(rval)
}
class(myplot) <- "grapcon_generator"


> I would be very happy if somebody could give me some hints or answers to my
> questions.
>
> Many thanks in advance.
>
> B.



-----
The art of living is more like wrestling than dancing.
(Marcus Aurelius)
--
View this message in context: 
http://www.nabble.com/mob%28party%29-formula-question-tp18959898p18959898.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


