If you want to do a stepwise selection there is a function in the klaR package 
to do it.  This is not what you are asking for, though.  You want a way of 
finding the successive error rates as additional variables are added in the 
forward selection process.  As far as I can see you have to do this yourself 
and it is a mildly interesting little exercise in R programming.  Here is one 
possible way to do it.

First you need a couple of functions:

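errorRate <- function(object, ...) {
  if(!require(MASS)) stop("you need the MASS package installed")
  UseMethod("errorRate")   ## dispatch on the class of the fitted object
}

## Resubstitution error rate for an lda fit: one minus the proportion of
## cases on the diagonal of the predicted-by-actual confusion matrix.
errorRate.lda <- function(object, data = eval.parent(object$call$data),
                          type = "plug-in") {
  ## predict.lda() names this argument 'method', not 'type'
  pred <- predict(object, data, method = type)$class
  actu <- eval(formula(object)[[2]], data)
  conf <- table(pred, actu)
  1 - sum(diag(conf))/sum(conf)
}

## Error rates for the full model and for each model obtained by dropping
## the last term of the formula, listed from the smallest model upwards.
eRates <- function(object, data = eval.parent(object$call$data),
                   type = "plug-in") {
  f <- formula(object)
  r <- data.frame(formula = deparse(f[[3]]),
                  Error = errorRate(object, data,
                  type = type))
  while(length(f[[3]]) > 1) {
    f[[3]] <- f[[3]][[2]]          ## strip the last term
    object$call$formula <- f
    object <- update(object, data = data)
    r <- rbind(data.frame(formula = deparse(f[[3]]),
                          Error = errorRate(object, data,
                          type = type)),
               r)                  ## prepend, so the smallest model is first
  }
  r
}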

(I have made errorRate generic as it is potentially a generic sort of 
operation.)

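
The loop in eRates relies on how R stores a sum of terms: the right-hand side 
of a formula such as y ~ a + b + c is the call (a + b) + c, so its second 
element is the same sum with the last term removed.  A minimal sketch of just 
that step, using a made-up formula and no model fitting:

```r
## Hypothetical formula, used only to show the term-stripping step
f <- LANDUSE ~ Ba + Bi + Al

steps <- character(0)
repeat {
  steps <- c(deparse(f[[3]]), steps)  # record the current right-hand side
  if (length(f[[3]]) == 1) break      # a single variable name: nothing left to strip
  f[[3]] <- f[[3]][[2]]               # drop the last term
}
print(steps)
```

This only works for a plain sum of terms; a right-hand side with parentheses, 
interactions or "." would need different handling.
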
Now look at your trivial example (extended a bit):

QRBdfa <-
    data.frame(LANDUSE = factor(sample(c("A", "B", "C"), 270,
                                       replace = TRUE)),
               Al = runif(270, 0, 125),
               Sb = runif(270, 0, 1),
               Ba = runif(270, 0, 235),
               Bi = runif(270, 0, 0.11),
               Cr = runif(270, 0, 65))

library(klaR)  ## provides greedy.wilks
gw_obj <- greedy.wilks(LANDUSE ~ ., data = QRBdfa, niveau = 1) ## NB large niveau

To use the functions you need an lda fit with the same formula as for the gw 
object and the same data argument as in the original call.  (If you try to do 
this the way suggested in the help file for greedy.wilks the functions to be 
used here will not work.  No dollars in formulae is always a good rule to 
follow.)

The way greedy.wilks is written makes this a bit tricky, but unless you want to 
just type it in, here is a partly automated way of doing it:

fit <- do.call(lda, list(formula = formula(gw_obj),
                         data = quote(QRBdfa)))
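
As far as I can tell, the reason for quote() in a call like this is that the 
fitted object then stores the name of the data frame in its call rather than 
a full copy of the data, so object$call$data can be looked up by name later.  
A small sketch of the difference, using the built-in iris data as a stand-in 
(an assumption for illustration, not the data in this thread):

```r
library(MASS)  # provides lda(); a recommended package shipped with R

## Data passed as a quoted name: the stored call keeps the symbol 'iris'
fit1 <- do.call(lda, list(formula = Species ~ Petal.Length,
                          data = quote(iris)))

## Data passed as a value: the stored call embeds the whole data frame
fit2 <- do.call(lda, list(formula = Species ~ Petal.Length,
                          data = iris))

is.name(fit1$call$data)        # the call refers to the data by name
is.data.frame(fit2$call$data)  # the call carries the data itself
```
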

To use the functions:

> errorRate(fit)  ## for one error rate
[1] 0.5962963
> eRates(fit)     ## for a sequence of error rates.
                 formula     Error
1                     Ba 0.6148148
2                Ba + Bi 0.6296296
3           Ba + Bi + Al 0.6074074
4      Ba + Bi + Al + Cr 0.5740741
5 Ba + Bi + Al + Cr + Sb 0.5962963

Since this example uses very artificial random data, the output will be 
different every time you re-create the data.  Note also that the error rates 
are not necessarily monotonically decreasing.
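
If you want the numbers to be repeatable, set a seed before creating the data.  
Here is a minimal self-contained sketch that needs only MASS.  The seed, the 
reduced set of three variables, the order in which they enter and the helper 
name resubError are all arbitrary choices for illustration; nothing here 
comes from greedy.wilks.

```r
library(MASS)  # provides lda()

set.seed(42)   # arbitrary seed, only so that the data can be re-created
n <- 270
dat <- data.frame(LANDUSE = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
                  Al = runif(n, 0, 125),
                  Sb = runif(n, 0, 1),
                  Ba = runif(n, 0, 235))

## Resubstitution error: fit on 'dat', then classify the same rows again
resubError <- function(vars) {
  fit <- lda(reformulate(vars, response = "LANDUSE"), data = dat)
  mean(predict(fit, dat)$class != dat$LANDUSE)
}

vars <- c("Al", "Sb", "Ba")  # hypothetical entry order
errs <- sapply(seq_along(vars), function(k) resubError(vars[1:k]))
data.frame(formula = sapply(seq_along(vars),
                            function(k) paste(vars[1:k], collapse = " + ")),
           Error = errs)
```

Since the predictors are pure noise, the error rates should stay close to the 
2/3 you would expect from guessing among three roughly equal classes.
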

Bill Venables.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Ty Smith
Sent: Monday, 14 March 2011 3:51 AM
To: r-help@r-project.org
Subject: [R] Stepwise Discriminant... in R

Hello R list,

I'm looking to do some stepwise discriminant function analysis (DFA) based
on the minimization of Wilks' lambda in R to end up with a composite
signature (of metals "Al","Sb","Bi","Cr","Ba") capable of discriminating
100% of the source factors (LANDUSE: "A","B","C").

The Wilks' lambda portion seems straightforward. I am using the following:

gw_obj <- greedy.wilks(LANDUSE ~ ., data = QRBdfa, niveau = 0.1)

Thus determining the stepwise order of metals. But I can't seem to figure out
how to coerce the DFA to give me an output with the % of factors which each
successive metal (variable) correctly classifies (discriminates), e.g.

Step    Metal    % correctly classified
1       Al        25
2       Sb        75
3       Bi        89
4       Cr       100

I've worked up a trivial example below. Can anyone offer any suggestions on
how I might go about doing this in R?

I am working in a MAC OS environment with a current version of R.

Many thanks in advance!



Al <-runif(27, 0, 125)
QRBdfa <- as.data.frame(Al)
QRBdfa$LANDUSE <- factor(c("A","A","A","B","B","B","C","C","C"))
QRBdfa$Sb <- runif(27, 0, 1)
QRBdfa$Ba <- runif(27, 0, 235)
QRBdfa$Bi <- runif(27, 0, 0.11)
QRBdfa$Cr <- runif(27, 0, 65)

gw_obj <- greedy.wilks(LANDUSE ~ ., data = QRBdfa, niveau = 0.1)

fit <- lda(LANDUSE ~ Al + Sb + Bi + Cr + Ba, data = QRBdfa)

R-help@r-project.org mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
