If you want to do a stepwise selection, there is a function in the klaR package to do it. This is not what you are asking for, though. You want a way of finding the successive error rates as additional variables are added in the forward selection process. As far as I can see, you have to do this yourself, and it is a mildly interesting little exercise in R programming. Here is one possible way to do it.
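To fix ideas, the quantity tracked at each step is the resubstitution error rate: the proportion of training observations that the fitted rule misclassifies. Here is a minimal sketch of that single calculation, assuming the built-in iris data as a stand-in (it is not part of the question) and nothing beyond the MASS package. The object names are illustrative only.

```r
## Minimal sketch: one resubstitution error rate from a confusion table.
## Assumption: the built-in iris data stand in for the real data set.
library(MASS)                            # provides lda()

fit0 <- lda(Species ~ ., data = iris)
pred <- predict(fit0, iris)$class        # predicted class for each training row
conf <- table(predicted = pred, actual = iris$Species)
err  <- 1 - sum(diag(conf)) / sum(conf)  # share of off-diagonal counts
print(conf)
print(err)
```

Because the same data are used both to fit and to assess the rule, an estimate of this kind tends to be optimistic.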
First you need a couple of functions:

##############
errorRate <- function(object, ...) {
  if(!require(MASS)) stop("you need the MASS package installed")
  UseMethod("errorRate")
}

errorRate.lda <- function(object, data = eval.parent(object$call$data),
                          type = "plug-in") {
  ## predict.lda() calls this argument 'method'; passing it as 'type'
  ## would be silently ignored
  pred <- predict(object, data, method = type)$class
  actu <- eval(formula(object)[[2]], data)   ## the observed classes
  conf <- table(pred, actu)                  ## confusion matrix
  1 - sum(diag(conf))/sum(conf)
}

eRates <- function(object, data = eval.parent(object$call$data),
                   type = "plug-in") {
  f <- formula(object)
  r <- data.frame(formula = deparse(f[[3]]),
                  Error = errorRate(object, data, type = type))
  ## peel off the last term of the formula and refit until one remains
  while(length(f[[3]]) > 1) {
    f[[3]] <- f[[3]][[2]]
    object$call$formula <- f
    object <- update(object, data = data)
    r <- rbind(data.frame(formula = deparse(f[[3]]),
                          Error = errorRate(object, data, type = type)),
               r)
  }
  r
}
##############

(I have made errorRate generic as it is potentially a generic sort of operation.)

Now look at your trivial example (extended a bit):

##############
require(klaR)

## factor() keeps LANDUSE a factor in versions of R where data.frame()
## no longer converts strings automatically
QRBdfa <- data.frame(LANDUSE = factor(sample(c("A", "B", "C"), 270,
                                             replace = TRUE)),
                     Al = runif(270, 0, 125),
                     Sb = runif(270, 0, 1),
                     Ba = runif(270, 0, 235),
                     Bi = runif(270, 0, 0.11),
                     Cr = runif(270, 0, 65))

gw_obj <- greedy.wilks(LANDUSE ~ ., data = QRBdfa,
                       niveau = 1)  ## NB large 'niveau'
##############

To use the functions you need an lda fit with the same formula as for the gw object and the same data argument as in the original call. (If you try to do this the way suggested in the help file for greedy.wilks, the functions to be used here will not work. No dollars in formulae is always a good rule to follow.)

The way greedy.wilks is written makes this a bit tricky, but unless you want to just type it in, here is a partly automated way of doing it:

##############
require(MASS)
fit <- do.call(lda, list(formula = formula(gw_obj),
                         data = quote(QRBdfa)))
##############

To use the functions:

> errorRate(fit)  ## for one error rate
[1] 0.5962963
> eRates(fit)     ## for a sequence of error rates.
                 formula     Error
1                     Ba 0.6148148
2                Ba + Bi 0.6296296
3           Ba + Bi + Al 0.6074074
4      Ba + Bi + Al + Cr 0.5740741
5 Ba + Bi + Al + Cr + Sb 0.5962963

Since this example uses very artificial random data, the output will be different every time you re-create the data. Note also that the error rates are not necessarily monotonically decreasing.

Bill Venables.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ty Smith
Sent: Monday, 14 March 2011 3:51 AM
To: r-help@r-project.org
Subject: [R] Stepwise Discriminant... in R

Hello R list,

I'm looking to do some stepwise discriminant function analysis (DFA) based on the minimization of Wilks' lambda in R to end up with a composite signature (of metals "Al","Sb","Bi","Cr","Ba") capable of discriminating 100% of the source factors (LANDUSE: "A","B","C").

The Wilks' lambda portion seems straightforward. I am using the following:

gw_obj <- greedy.wilks(LANDUSE ~ ., data = QRBdfa, niveau = 0.1)
gw_obj

Thus determining the stepwise order of metals. But I can't seem to figure out how to coerce the DFA to give me an output with the % of factors which each successive metal (variable) correctly classifies (discriminates), e.g.

Step  Metal  % correctly classified
1     Al      25
2     Sb      75
3     Bi      89
4     Cr     100

I've worked up a trivial example below. Can anyone offer any suggestions on how I might go about doing this in R?

I am working in a Mac OS environment with a current version of R.

Many thanks in advance!
Tyler

#Example
library(scatterplot3d)
library(klaR)

Al <- runif(27, 0, 125)
QRBdfa <- as.data.frame(Al)
QRBdfa$LANDUSE <- factor(c("A","A","A","B","B","B","C","C","C"))
QRBdfa$Sb <- runif(27, 0, 1)
QRBdfa$Ba <- runif(27, 0, 235)
QRBdfa$Bi <- runif(27, 0, 0.11)
QRBdfa$Cr <- runif(27, 0, 65)

gw_obj <- greedy.wilks(LANDUSE ~ ., data = QRBdfa, niveau = 0.1)
gw_obj

fit <- lda(LANDUSE ~ Al + Sb + Bi + Cr + Ba, data = QRBdfa)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.