Not really a direct answer on your question, but: > system.time(replicate(10000,apply(as.matrix(theta), 1, rasch, b_vector))) user system elapsed 4.51 0.03 4.55
> system.time(replicate(10000,theta%*%t(b_vector))) user system elapsed 0.25 0.00 0.25 It does make a difference on large datasets... Cheers Joris On Wed, Jun 2, 2010 at 4:44 PM, Doran, Harold <hdo...@air.org> wrote: > I have a function that I am currently using very inefficiently. The > following are needed to illustrate the problem: > > set.seed(12345) > dat <- matrix(sample(c(0,1), 110, replace = TRUE), nrow = 11, ncol=10) > mis <- sample(1:110, 5) > dat[mis] <- NA > theta <- rnorm(11) > b_vector <- runif(10, -4,4) > empty <- which(is.na(t(dat))) > > So, I have a matrix (dat) with some values within the matrix missing. In my > real world problem, the matrix is huge, and most values are missing. The > function in question is called derivs() and is below. But, let me step > through the inefficient portions. > > First, I create a matrix of some predicted probabilities as: > > rasch <- function(theta,b) 1/ (1 + exp(b-theta)) > mat <- apply(as.matrix(theta), 1, rasch, b_vector) > > However, I only need those predicted probabilities in places where the data > are not missing. So, the next step in the function is > > mat[empty] <- NA > > which manually places NAs in places where the data are missing (notice the > matrix 'mat' is the transpose of the data matrix and so I get the empty > positions from the transpose of dat). > > Afterwards, the function computes the gradient and hessians needed to > complete the MLE estimation. > > All of this works in the sense that it yields the correct answers for my > problem. But, the glaring problem is that I create predicted probabilities > for every cell in 'mat' when in many cases they are not needed. I end up > replacing those values with NAs. In my real world problem, this is horribly > inefficient and slow. > > My question is then is there a way to use apply such that is computes the > necessary predicted probabilities only when the data are not missing to > yield the matrix 'mat'. My desired end result is the matrix 'mat' created > after the manually placing the NAs in the appropriate cells. > > Thanks > Harold > > > derivs <- function(dat, b_vector, theta){ > mat <- apply(as.matrix(theta), 1, rasch, > b_vector) > mat[empty] <- NA > gradient <- -(colSums(dat, na.rm = TRUE) - > rowSums(mat, na.rm = TRUE)) > hessian <- -(rowSums(mat * (1-mat), na.rm = > TRUE)) > list('gradient' = gradient, 'hessian' = > hessian) > } > > > > > sessionInfo() > R version 2.10.1 (2009-12-14) > i386-pc-mingw32 > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C LC_TIME=English_United > States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] tools_2.10.1 > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joris Meys Statistical Consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control Coupure Links 653 B-9000 Gent tel : +32 9 264 59 87 joris.m...@ugent.be ------------------------------- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.