On Wed, 1 Aug 2007, Upasna Sharma <[EMAIL PROTECTED]> wrote: > From: "Upasna Sharma" <[EMAIL PROTECTED]> > Subject: [R] Goodman Kruskal's tau > > I need to know which package in R calculates the Goodman Kruskal's > tau statistic for nominal data. Also is there any implementation for > multiple classification analysis (Andrews at al 1973) in R? Any > information on this would be greatly appreciated.
As far as I know, the Goodman-Kruskal tau has not yet been implemented in any package, but I've coded it as a function for my own research purposes following Liebetrau 1983, see the code attached below (save it and load it into R with the source-command. You simply provide the function with a matrix 'x' with the nominal data, and get the values of the G-K tau as well as the significance values and asymptotic standard errors, calculated in both directions, i.e. GK.tau(dat=x) or GK.tau(x), e.g. > source("GK.tau.R") > x [,1] [,2] [,3] [1,] 30 10 1 [2,] 2 20 5 [3,] 1 10 15 > GK.tau(x) tau.RC tau.CR p.tau.RC p.tau.CR var.tau.RC ASE.tau.RC 1 0.3522439 0.3219675 2.002024e-13 3.065436e-12 0.004584542 0.06770924 ASE.tau.CR 1 0.07004566 GK.tau <- function(dat) { N <- sum(dat); dat.rows <- nrow(dat); dat.cols <- ncol(dat); max.col <- sum.col <- L.col <- matrix(,dat.cols); max.row <- sum.row <- L.row <- matrix(,dat.rows); for(i in 1:dat.cols) { sum.col[i] <- sum(dat[,i]); max.col[i] <- max(dat[,i]); } for(i in 1:dat.rows) { sum.row[i] <- sum(dat[i,]); max.row[i] <- max(dat[i,]); } max.row.margin <- max(apply(dat,1,sum)); max.col.margin <- max(apply(dat,2,sum)); # Goodman-Kruskal tau # Tau Column|Row n.err.unconditional <- N^2; for(i in 1:dat.rows) n.err.unconditional <- n.err.unconditional-N*sum(dat[i,]^2/sum.row[i]); n.err.conditional <- N^2-sum(sum.col^2); tau.CR <- 1-(n.err.unconditional/n.err.conditional); v <- n.err.unconditional/(N^2); d <- n.err.conditional/(N^2); f <- d*(v+1)-2*v; var.tau.CR <- 0; for(i in 1:dat.rows) for(j in 1:dat.cols) var.tau.CR <- var.tau.CR + dat[i,j]*(-2*v*(sum.col[j]/N)+d*((2*dat[i,j]/sum.row[i])-sum((dat[i,]/sum.row[i])^2))-f)^2/(N^2*d^4); ASE.tau.CR <- sqrt(var.tau.CR); z.tau.CR <- tau.CR/ASE.tau.CR; U.tau.CR <- (N-1)*(dat.cols-1)*tau.CR; # Chi-squared approximation for H0 according to Margolin & Light 1974, see also Liebetrau 1983 p.tau.CR <- pchisq(U.tau.CR,df=(dat.rows-1)*(dat.cols-1),lower=FALSE); # Tau Row|Column n.err.unconditional <- N^2; for(j in 1:dat.cols) n.err.unconditional <- n.err.unconditional-N*sum(dat[,j]^2/sum.col[j]); n.err.conditional <- N^2-sum(sum.row^2); tau.RC <- 1-(n.err.unconditional/n.err.conditional); v <- n.err.unconditional/(N^2); d <- n.err.conditional/(N^2); f <- d*(v+1)-2*v; var.tau.RC <- 0; for(i in 1:dat.rows) for(j in 1:dat.cols) var.tau.RC <- var.tau.RC + dat[i,j]*(-2*v*(sum.row[i]/N)+d*((2*dat[i,j]/sum.col[j])-sum((dat[,j]/sum.col[j])^2))-f)^2/(N^2*d^4); ASE.tau.RC <- sqrt(var.tau.RC); z.tau.RC <- tau.CR/ASE.tau.RC; U.tau.RC <- (N-1)*(dat.rows-1)*tau.RC; # Chi-squared approximation for H0 according to Margolin & Light 1974, see also Liebetrau 1983 p.tau.RC <- pchisq(U.tau.RC,df=(dat.rows-1)*(dat.cols-1),lower=FALSE); results <- data.frame(tau.RC, tau.CR, p.tau.RC, p.tau.CR, var.tau.RC, ASE.tau.RC, ASE.tau.CR); return(results); } Hope this helps. -Antti -- ====================================================================== Antti Arppe - Master of Science (Engineering) Researcher & doctoral student (Linguistics) E-mail: [EMAIL PROTECTED] WWW: http://www.ling.helsinki.fi/~aarppe ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.